Karen, in our last talk we discussed the importance of professionally organized data quality management. Today I would like to go into a little more detail. Could you describe what is important when implementing data governance processes? How should such a process be designed?
An efficient data governance process requires the interplay of many components: a glossary to establish uniform definitions, powerful profiling, efficient rule management, workflows for error handling, and data lineage spanning all of these. Let’s start with the glossary.
With a glossary, insurers can first of all establish a consistent view of all terms and definitions. That is not as easy as it sounds, because each business department works with well-defined terms of its own, and these do not always fit together. As so often, the devil is in the details. Take the term “claims volume”. It could mean a) claims reported in the current year (settled claims plus reserves) or b) claims settled in the current year, including claims reported in previous years. The amounts may differ significantly, depending on the claims experience of the individual years. If the wrong definition is used, the result is distorted KPIs.
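To make the ambiguity tangible, here is a minimal sketch with invented claim records and field names (purely illustrative, not taken from any real system) that computes both readings of “claims volume”:

```python
# Illustrative only: two readings of "claims volume" over invented claim records.
claims = [
    {"reported": 2023, "settled_in": 2023, "settled_amount": 80_000, "reserve": 20_000},
    {"reported": 2023, "settled_in": None, "settled_amount": 0, "reserve": 150_000},
    {"reported": 2021, "settled_in": 2023, "settled_amount": 60_000, "reserve": 0},
]
CURRENT_YEAR = 2023

# a) claims reported in the current year: settled amounts plus reserves
volume_a = sum(c["settled_amount"] + c["reserve"]
               for c in claims if c["reported"] == CURRENT_YEAR)

# b) claims settled in the current year, regardless of the year they were reported
volume_b = sum(c["settled_amount"]
               for c in claims if c["settled_in"] == CURRENT_YEAR)

print(volume_a, volume_b)  # 250000 vs. 140000
```

The same label, applied to the same portfolio, yields two very different figures.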
I see. If, for example, you want to produce meaningful benchmark figures, you need to know which value matches the period or reference quantity under consideration.
Exactly. Only then can the right measures be derived. Executives need to be able to trust that the KPI trends underlying their business decisions are correct.
Is this even more important in an international environment?
It is. In international companies there are many conceptual misunderstandings. The same word may have different meanings; take the term “reserve,” for example. That does not sound like a big deal, but the consequences can be serious.
Aren’t such misunderstandings detected in the course of a project?
Typically everyone assumes they are delivering the right numbers. As long as the numbers do not differ by an order of magnitude, the discrepancy is hardly noticeable in the first deliveries and test runs.
Does that mean if no one critically challenges the numbers, the error is detected too late or not at all?
Exactly. I remember a case from the euro changeover. The number of contracts was inadvertently divided by the DM/euro conversion factor as well. Within the annual report this was consistent, but compared with the previous year’s report, the number of contracts was only about half of its original value. This was noticed, purely by chance, one day before the Annual General Meeting.
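As a hypothetical reconstruction of how such an error can slip in (the figures and field names are invented): applying the currency conversion to every numeric column scales the contract count along with the monetary amounts, and a simple year-over-year plausibility rule is exactly the kind of check that would catch it.

```python
# Hypothetical sketch: converting every numeric column also divides the contract count.
DM_PER_EUR = 1.95583

report_row = {"premium_dm": 1_000_000.0, "claims_dm": 600_000.0, "contracts": 52_000}
converted = {key: value / DM_PER_EUR for key, value in report_row.items()}

# Internally consistent, but "contracts" is now ~26,600 instead of 52,000.
print(round(converted["contracts"]))

# A year-over-year plausibility rule would flag the drop immediately.
previous_year_contracts = 51_500
deviation = abs(converted["contracts"] - previous_year_contracts) / previous_year_contracts
if deviation > 0.2:
    print("Plausibility check: contract count deviates by more than 20% from the previous year")
```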
So the report was already printed and distributed?
It was. Of course, a correction sheet can be produced quickly, but it is unpleasant and embarrassing. After all, on such days board members have to justify themselves to their shareholders, and something like this does not exactly build trust. Such errors sometimes also lead to considerable project delays or, worse, unplanned additional costs to correct them. Every executive knows similar cases from daily practice.
What is the role of the business departments?
Their involvement is essential. They are the owners of the numbers and know the true meaning of the data being supplied. They therefore provide the definitions and also supply the necessary quality rules. This must be done directly in the system, not, as in the past, in a design document.
And now the glossary finally enters the stage!
Yes, because it helps the business departments and IT develop a common conceptual understanding. For acceptance it is important that the departments find their familiar terms in their own language, while those terms are clearly mapped to a central definition. Once the meaning of the terms is clearly defined, business departments and IT can set up meaningful rules, which IT then links to the data and applies uniformly everywhere. It is a similar concept, by the way, to business rules management. To support these processes optimally, a powerful software solution is required.
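Such a glossary entry could be modelled roughly as in the following sketch; the structure, synonyms and rule names are assumptions for illustration, not a description of any particular product:

```python
from dataclasses import dataclass, field

@dataclass
class GlossaryTerm:
    """Central business definition to which department-specific terms are mapped."""
    central_name: str
    definition: str
    department_synonyms: dict[str, str] = field(default_factory=dict)  # department -> local term
    quality_rules: list[str] = field(default_factory=list)             # rule IDs linked to the data by IT

claims_volume = GlossaryTerm(
    central_name="claims_volume_reported_year",
    definition="Claims reported in the current year: settled amounts plus reserves.",
    department_synonyms={"Controlling": "claims expenditure", "Actuarial": "claims volume"},
    quality_rules=["amount_non_negative", "reserve_requires_open_claim"],
)

# Whichever local term a report uses, it resolves to one central definition and one set of rules.
print(claims_volume.department_synonyms["Controlling"], "->", claims_volume.central_name)
```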
I see, that keeps the departments flexible, and they are protected by cleanly defined hand-over processes. I have another question: you stress the uniform use of the rules. Is that not happening today?
Unfortunately, no. Validation rules are usually stored several times in different programs. If a rule changes, it often has to be adapted in ten or more programs. That means increased maintenance effort, and if one instance is forgotten, inconsistencies in the data are possible. Today only IT can make such changes, and they are tied to its release planning.
It is easy to imagine how flexible that is … It does not sound very efficient, but it has probably grown historically. Is that why, in modern solutions, rules are defined centrally and stored and maintained separately from the programs?
Yes, exactly. This applies to quality rules as well as to business rules in general. Especially now, when agile approaches matter, this division of tasks is an important prerequisite for success. Companies need to change their processes in order to become agile. A powerful data lineage analysis complements the whole: it shows graphically how definitions, rules and data are linked, which should not be underestimated when making changes.
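The division of labour can be pictured like this: a rule is phrased once in a central registry and every loading program references it instead of re-coding it. The rule names and records below are illustrative assumptions:

```python
from typing import Callable

# Central rule registry: defined once, referenced by every loading program.
RULES: dict[str, Callable[[dict], bool]] = {
    "premium_non_negative": lambda rec: rec.get("premium", 0) >= 0,
    "contract_id_present":  lambda rec: bool(rec.get("contract_id")),
}

def validate(record: dict, rule_ids: list[str]) -> list[str]:
    """Return the IDs of all rules the record violates."""
    return [rule_id for rule_id in rule_ids if not RULES[rule_id](record)]

# Two different loading programs reuse the same definitions:
checks = ["premium_non_negative", "contract_id_present"]
print(validate({"contract_id": "4711", "premium": 1200}, checks))  # -> []
print(validate({"contract_id": "", "premium": -50}, checks))       # -> both rules violated
```

If the rule changes, it changes in one place, and every program that references it picks up the new version.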
So let’s take a look at the production process. What happens if errors occur during validation? Who decides what happens then?
That is defined in the project. The results of the quality checks are first stored in a repository. Incidentally, this documentation is also important for monitoring under Solvency II or IFRS; such monitoring is actually required by the regulators. Workflows then determine how the data is handled for each type of error. Is a data delivery rejected completely, for example, or is it sufficient to review the incorrect records? Who needs to be informed automatically? Who cleanses the faulty data? Here again, teamwork between IT and the business areas is required to define adequate workflows.
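In simplified form, such a workflow might look like this sketch; the severity levels, actions and notification targets are assumptions for illustration:

```python
from enum import IntEnum

class Severity(IntEnum):
    MINOR = 1      # load anyway, log for later cleansing
    MAJOR = 2      # quarantine faulty records, notify the data steward
    CRITICAL = 3   # reject the whole delivery, notify the data owner

def route_delivery(check_results: list[dict]) -> str:
    """Decide how a delivery is handled based on the worst quality finding."""
    if not check_results:
        return "load"
    worst = max(r["severity"] for r in check_results)
    if worst == Severity.CRITICAL:
        return "reject_delivery_and_notify_data_owner"
    if worst == Severity.MAJOR:
        return "quarantine_records_and_notify_data_steward"
    return "load_and_log_for_cleansing"

# The results themselves would first be written to a repository as an audit trail
# (relevant, for instance, for Solvency II / IFRS monitoring).
results = [
    {"rule": "premium_non_negative", "severity": Severity.MAJOR, "record_id": 17},
    {"rule": "contract_id_present",  "severity": Severity.MINOR, "record_id": 23},
]
print(route_delivery(results))  # -> quarantine_records_and_notify_data_steward
```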
What role does so-called data profiling play in this context?
Good question, Christian. Profiling shows what is actually in the data and what has to be taken into account during validation. What patterns or formats occur in a field? Are there outliers? Are the key fields unique? This is an important basis for defining rules, and it is valuable for every project, even if no governance process has been initiated yet.
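A toy illustration of the questions profiling answers, run on an invented column (real profiling tools are far more comprehensive; this is only a sketch):

```python
import re
import statistics
from collections import Counter

def profile_column(values: list) -> dict:
    """Toy profiling: which patterns occur, are there outliers, is the column unique?"""
    # Pattern analysis: digits become 9, letters become A, everything else is kept.
    patterns = Counter(re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", str(v))) for v in values)

    # Simple outlier flag for numeric columns: more than 3 standard deviations from the mean.
    numeric = [v for v in values if isinstance(v, (int, float))]
    outliers = []
    if len(numeric) > 2:
        mean, stdev = statistics.mean(numeric), statistics.pstdev(numeric)
        outliers = [v for v in numeric if stdev and abs(v - mean) > 3 * stdev]

    return {"patterns": dict(patterns), "outliers": outliers,
            "unique": len(set(values)) == len(values)}

print(profile_column(["K-4711", "K-4712", "4713", "K-4711"]))
# -> mixed patterns ('A-9999' vs. '9999') and duplicate keys (unique: False)
```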
Do surprising things come to light?
I remember a case where profiling revealed contracts that had not had a single premium booking for years. Hard to believe, but it actually happened!
Unbelievable. Can rules for automated data quality checks be derived directly from the profiling insights?
Exactly, and that is why it is so important to involve profiling from the very beginning, so that nothing is overlooked.
Who should do the profiling, and at what point in the project?
Profiling is always useful because it is fast and helps to save time and money. Used early, in a preliminary study, it protects against inflated expectations, among other things. The preliminary study already shows what is possible with the available data. That is better than a rude awakening after a million-dollar investment and a year of lost time, which is unfortunately not an isolated case in practice. This is particularly true for unfamiliar data, for example from old legacy systems or from new data sources such as social media. But there are also quite mundane cases, for example when subsidiaries deliver data, especially within international groups.
So profiling is needed especially at the beginning of a project?
Not only then; profiling is also a great help during operation, for example when something changes in the delivering system. Suddenly programs crash, or the quality rules raise more alerts. Profiling is a perfect aid for root cause analysis.
Thank you, Karen, for this excursion into the everyday life of data governance. I would say the devil really is in the details, and only the interaction of all the components leads to success. In addition to the organisational changes, powerful software is certainly required.
Absolutely. A colourful patchwork of technologies makes an efficient process harder. The art lies in creating a lean and robust process, and that can only be done with good software.
This interview was originally published in German on the regional SAS blog Mehr Wissen.