Don't let your data warehouse be a data labyrinth!


Don't let your data warehouse be a data labyrinth!

Auditability and data quality are two of the most important demands on a data warehouse. Why? Because reliable data processes ensure the accuracy of your analytical applications and statistical reports. Using a standard data model enhances auditability and data quality of your data warehouse implementation for business analytics.

Auditability by a standard data model for insurance

Auditability is ensured when it is possible to reproduce the state of the data warehouse at any given time. This process also includes identifying incorrect data that have been loaded into the data warehouse in the past and have been used for reporting. Such transparency is required by different regulatory rules, like Solvency 2.

An auditable historization process known as bitemporal historization is therefore a fundamental demand on a standard data model. You can achieve this requirement through a strict separation between technical and business validity periods. This separation is commonly overlooked in many individual projects and often creates a problem in terms of auditability.

The following scenario describes an example of bitemporal historization: An insurance policy was loaded on July 25, 2008 into the data warehouse. Since then, it was counted as active insurance policy by all statistical reports based on the data warehouse. On August 26, 2008 the policy was canceled from the beginning. An auditable history management concept will then close the original record technically and load a new instance of the record with new technical and business validity date.

History management concept







In addition to the content of the data warehouse, all data processes have to be tracked and audited. This can be ensured by embedding a standard data model into a metadata architecture. Metadata of a different kind are stored consistently in this structure:

  • Metadata of the data warehouse data model (tables, columns, relations etc.).
  • Metadata of ETL processes, including mapping information.
  • Metadata of all business analytics processes, like. aggregation rules.

This metadata enables an automated impact analysis of source system attributes and applied business rules.

Enhancing data quality by a standard data model

Incorrect data is a problem for every insurance company. Errors have grown historically as a rule, and some have reached a level that cannot be cleansed by manual action. For example, different source systems store data of identical entities but are very difficult to be identified as identical due to erroneous data entry. This often hampers the development of a consistent party system when different party roles of the source systems cannot be consolidated: e.g. customer, intermediary, external service provider, etc.

These shortcomings of the operational source systems should not simply be copied into the data warehouse. Instead, a consistent, underlying relational model should base the data warehouse. Especially in individual development this is often not considered because it is tempting to focus too much on existing operational structures rather than to take the chance to increase the quality of data permanently through a more consistent data model and the use of data quality tools.

The standard data model for insurance in SAS implements a party model that identifies each party uniquely and enables consistent allocation of arbitrary and multiple party roles of the insurance business process. Data cleansing is supported by powerful data quality tools including master data management. The cleansing process is performed in the transform layer of the ETL process and distinguishes several stages:

  • Separation of transaction data (e.g. premiums, income) from master data (e.g. name, address, marital status, birth date) and loading of master data into a master database. The master data model will here be derived as sub-model of the standard data model for insurance.
  • Editing of the master data with a user friendly interface by data stewards, the data owner of business department – not ETL developer.
  • Support by automation functions (e.g. pattern recognition) and possible involvement of additional external information (e.g. geo-systems).
  • Merge back of validated master data into the ETL process.
  • Load cleansed data into the standard data model of the data warehouse.


Typical issues that can be solved with data quality management are:

  • Recognition of households (important for cross- and upsell campaigns).
  • Duplicate cleansing in the party system.

The standard data model provides predefined structures, e.g. for all party-relevant information, households, as well as tables for documenting duplicate cleansing (optionally as reverse input to the operational systems). All processes may be traced back any time through an auditable history management concept.

Read more on data quality and data management topics or you can learn more about the standard data model for insurance from SAS.

Hartmut Schroth, Business Advisor data strategies for insurance at SAS Germany. For further discussions, connect with me on LinkedIn. You may also read this article in German in the SAS Blog Mehr Wissen.


About Author

Hartmut Schroth

Business Advisor

After finishing his master study of mathematics in 1981 Hartmut has been working in insurance and IT business in several positions. Since 2006 he is employee of SAS Institute in Germany. As business advisor in the presales team he is responsible for DACH region (Germany, Austria and Switzerland). His main focus is advising customers of financial services industry in strategies for data models of SAS Business Analytics solutions. Moreover he is regional product manager for the SAS Insurance Analytics Architecture.

1 Comment

Leave A Reply

Back to Top