Auditability and data quality are two of the most important demands on a data warehouse. Why? Because reliable data processes ensure the accuracy of your analytical applications and statistical reports. Using a standard data model enhances auditability and data quality of your data warehouse implementation for business analytics.
Auditability by a standard data model for insurance
Auditability is ensured when it is possible to reproduce the state of the data warehouse at any given time. This process also includes identifying incorrect data that have been loaded into the data warehouse in the past and have been used for reporting. Such transparency is required by different regulatory rules, like Solvency 2.
An auditable historization process known as bitemporal historization is therefore a fundamental demand on a standard data model. You can achieve this requirement through a strict separation between technical and business validity periods. This separation is commonly overlooked in many individual projects and often creates a problem in terms of auditability.
The following scenario describes an example of bitemporal historization: An insurance policy was loaded on July 25, 2008 into the data warehouse. Since then, it was counted as active insurance policy by all statistical reports based on the data warehouse. On August 26, 2008 the policy was canceled from the beginning. An auditable history management concept will then close the original record technically and load a new instance of the record with new technical and business validity date.
In addition to the content of the data warehouse, all data processes have to be tracked and audited. This can be ensured by embedding a standard data model into a metadata architecture. Metadata of a different kind are stored consistently in this structure:
- Metadata of the data warehouse data model (tables, columns, relations etc.).
- Metadata of ETL processes, including mapping information.
- Metadata of all business analytics processes, like. aggregation rules.
This metadata enables an automated impact analysis of source system attributes and applied business rules.
Enhancing data quality by a standard data model
Incorrect data is a problem for every insurance company. Errors have grown historically as a rule, and some have reached a level that cannot be cleansed by manual action. For example, different source systems store data of identical entities but are very difficult to be identified as identical due to erroneous data entry. This often hampers the development of a consistent party system when different party roles of the source systems cannot be consolidated: e.g. customer, intermediary, external service provider, etc.
These shortcomings of the operational source systems should not simply be copied into the data warehouse. Instead, a consistent, underlying relational model should base the data warehouse. Especially in individual development this is often not considered because it is tempting to focus too much on existing operational structures rather than to take the chance to increase the quality of data permanently through a more consistent data model and the use of data quality tools.
The standard data model for insurance in SAS implements a party model that identifies each party uniquely and enables consistent allocation of arbitrary and multiple party roles of the insurance business process. Data cleansing is supported by powerful data quality tools including master data management. The cleansing process is performed in the transform layer of the ETL process and distinguishes several stages:
- Separation of transaction data (e.g. premiums, income) from master data (e.g. name, address, marital status, birth date) and loading of master data into a master database. The master data model will here be derived as sub-model of the standard data model for insurance.
- Editing of the master data with a user friendly interface by data stewards, the data owner of business department – not ETL developer.
- Support by automation functions (e.g. pattern recognition) and possible involvement of additional external information (e.g. geo-systems).
- Merge back of validated master data into the ETL process.
- Load cleansed data into the standard data model of the data warehouse.
Typical issues that can be solved with data quality management are:
- Recognition of households (important for cross- and upsell campaigns).
- Duplicate cleansing in the party system.
The standard data model provides predefined structures, e.g. for all party-relevant information, households, as well as tables for documenting duplicate cleansing (optionally as reverse input to the operational systems). All processes may be traced back any time through an auditable history management concept.
Hartmut Schroth, Business Advisor data strategies for insurance at SAS Germany. For further discussions, connect with me on LinkedIn. You may also read this article in German in the SAS Blog Mehr Wissen.