Looked at a data quality report lately? Chances are it reports at table or entity level.
For example, we may report that the “CUSTOMER INVOICES” table has 15% empty postal code values. We may go one stage further and say that the postal code field has a 20% failure rate: as well as the empty values, it contains values that are invalid or only partially completed.
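To make the difference between those two figures concrete, here is a minimal Python sketch. The sample values and the five-digit validity pattern are assumptions made up for illustration; your own postal code rules will differ.

```python
import re

# Hypothetical sample of postal code values from a "CUSTOMER INVOICES" table.
postal_codes = ["90210", "", "1234", None, "30301",
                "ABCDE", "10001", "  ", "60614", "9021"]

# Assumed validity rule, for illustration only: exactly five digits.
VALID_POSTAL_CODE = re.compile(r"\d{5}")


def is_empty(value):
    """True when the value is missing or only whitespace."""
    return value is None or value.strip() == ""


def is_valid(value):
    """True when the value is present and matches the assumed pattern."""
    return not is_empty(value) and VALID_POSTAL_CODE.fullmatch(value.strip()) is not None


total = len(postal_codes)
# Completeness check: how many values are empty?
empty_rate = sum(is_empty(v) for v in postal_codes) / total
# Broader check: empty, invalid and partially completed values all count as failures.
failure_rate = sum(not is_valid(v) for v in postal_codes) / total

print(f"empty: {empty_rate:.0%}, failing: {failure_rate:.0%}")
```

The failure rate can never be lower than the empty rate, because every empty value also fails the validity check.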
Of course we can go one step further again. How about reporting on address information? We could say that an address is valid if it has a postal code and at least two additional field values; without them, the mail can’t be delivered. A composite rule can be created and so our reporting granularity changes again.
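A composite rule like this might be sketched as follows. The field names (line_1, line_2, city, county, postal_code) and the sample records are hypothetical, chosen only to illustrate the rule described above.

```python
def has_value(value):
    """True when a field holds something other than nothing or whitespace."""
    return value is not None and str(value).strip() != ""


def address_is_valid(record, other_fields=("line_1", "line_2", "city", "county")):
    """Composite rule: a postal code plus at least two other populated address fields."""
    if not has_value(record.get("postal_code")):
        return False
    populated = sum(has_value(record.get(field)) for field in other_fields)
    return populated >= 2


# Hypothetical address records.
addresses = [
    {"line_1": "1 High St", "city": "Leeds", "postal_code": "LS1 4AP"},  # passes
    {"line_1": "22 Mill Lane", "postal_code": "M1 1AE"},                 # only one other field
    {"line_1": "5 Park Rd", "city": "York", "postal_code": ""},          # no postal code
]

valid_rate = sum(address_is_valid(a) for a in addresses) / len(addresses)
print(f"deliverable addresses: {valid_rate:.0%}")
```

Note that the rule is scored per record rather than per field, which is exactly why the reporting granularity changes.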
This is all great, but is there a better way of reporting data quality?
Yes, there is. A more effective way is to map information to business functions and report on the quality of data at a functional level.
Here’s an example...
Perhaps you have a business task called “Accept Orders from Customers.” This is a critical function in your organisation and will obviously span multiple business units (particularly if you have different product lines, customer channels, etc.).
The “Accept Orders from Customers” function is underpinned by many business rules that in turn create and consume information. Armed with the knowledge of which applications, system functions and application data are used to enable this high-level business function, you can successfully map your data quality rules to something that is far more meaningful to the business.
You may have multiple information chains that support this function. Perhaps you have an order entry system from suppliers, customers using the web, call centre sales inquiries and even postal orders. By mapping these information chains against application functions, data sources and the high-level business function, we can report on the impact data quality is having on accepting orders from customers.
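As a rough sketch of that mapping, the Python below rolls rule results up from information chains to the business function they support. The chain names, rule names and counts are all invented for illustration, and a real mapping would also carry the application functions and data sources mentioned above.

```python
from collections import defaultdict

# Hypothetical mapping: each information chain supports a business function.
chain_to_function = {
    "supplier_order_entry": "Accept Orders from Customers",
    "web_orders": "Accept Orders from Customers",
    "call_centre_sales": "Accept Orders from Customers",
    "postal_orders": "Accept Orders from Customers",
}

# Hypothetical rule results: (information chain, rule, records checked, records failed).
rule_results = [
    ("web_orders", "address_is_deliverable", 1000, 50),
    ("call_centre_sales", "address_is_deliverable", 400, 80),
    ("postal_orders", "postal_code_is_valid", 100, 30),
    ("supplier_order_entry", "postal_code_is_valid", 500, 10),
]


def failure_rate_by(results, key):
    """Sum checked and failed counts per group, then return failed / checked."""
    totals = defaultdict(lambda: [0, 0])
    for chain, _rule, checked, failed in results:
        group = key(chain)
        totals[group][0] += checked
        totals[group][1] += failed
    return {group: failed / checked for group, (checked, failed) in totals.items()}


# Functional-level view: one figure per business function.
by_function = failure_rate_by(rule_results, key=chain_to_function.get)
# Drill-down view: one figure per information chain.
by_chain = failure_rate_by(rule_results, key=lambda chain: chain)

for function, rate in by_function.items():
    print(f"{function}: {rate:.1%} of checked records failing")

worst_chain = max(by_chain, key=by_chain.get)
print(f"worst information chain: {worst_chain} ({by_chain[worst_chain]:.0%})")
```

The business sees a single figure for the function it cares about, while the per-chain breakdown stays available for anyone who needs to drill down.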
What’s more, as you find “hotspots” of poor data quality you can start to ascertain the business impact and even the root cause, because you’ve put some solid work in to understand how the business and IT layers interact. Plus, you’ve got the lineage information from doing the information chain work.
Over to you - how do you report data quality metrics? What kind of granularity are you reporting at? Are you providing a range of levels to cater for multiple users?
Interested to hear how you’re making data quality relevant for different audiences.