I've often said that one of the most challenging aspects of a successful master data management (MDM) program is enabling a wide spectrum of downstream users to make effective use of the master data assets. In fact, many of our clients reach out to us because they're struggling to monetize their MDM investments by syndicating or publishing their master data to others in the enterprise. Their struggles are often due to difficulties in a few key areas.
Articulating data consumer needs
In many cases the creation of the master repository is the result of a set of IT processes that focused on linking disparate data sets together without considering how the resulting data set would be used. Do users want the entire data set? Do they want a set of extracted tables? Do they want to query the master index in an ad hoc manner? Do they want to be able to join the master data with their own systems’ databases?
While individuals managing their own business functions are familiar with the structure of the data their processes and systems consume and subsequently produce, they lack familiarity with a conformed master repository. This creates a barrier when trying to communicate their specific master data needs. When the IT team solicits user data requirements, the users may not be able to easily explain what they want to get from the master data asset.
Creating a master data product
What is the “product” of an MDM program? Is it a data set that's made available for use? Is it a set of services for accessing a master index and corresponding master profiles? Is it a collection of extracts? How frequently are the products created? What types of consistency and coherence requirements are assigned to the production process? These are just a few of the questions that arise when considering master data consumption.
Providing a means for distribution and subsequent integration of master data
Let’s say that you settled on a specific set of extracts from the master repository as a consumable MDM product. This simplifies the production – but what are the best ways for data consumers to incorporate that product into their environments? In a number of cases this is one of the thorniest issues. Given a set of extracted tables that represent a master index along with supporting master profile data, what's the best way to integrate that extract with the consumer’s data environment? Take an image copy and load it into the consumer’s database? Or create internal tables that contain the attributes needed and then execute specialized ETL processes to load the extracted data?
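To make the second option a little more concrete, here is a minimal sketch in Python of a consumer that keeps only the attributes it needs and loads them into its own internal table. Everything in it is an assumption made for illustration: the extract layout, the attribute names (master_id, name, country, credit_tier), and the use of an in-memory dictionary as a stand-in for the consumer's table.

```python
# Hypothetical extract rows: a master index entry plus profile attributes.
MASTER_EXTRACT = [
    {"master_id": "M001", "name": "Acme Corp", "country": "US", "credit_tier": "A"},
    {"master_id": "M002", "name": "Globex", "country": "DE", "credit_tier": "B"},
]

# Attributes this (hypothetical) consumer actually needs.
CONSUMER_ATTRIBUTES = ("master_id", "name", "country")


def load_extract(extract_rows, attributes, target):
    """Project each extract row onto the needed attributes and upsert by master_id.

    `target` stands in for the consumer's internal table; it maps
    master_id -> row. Returns the number of rows loaded.
    """
    loaded = 0
    for row in extract_rows:
        projected = {attr: row[attr] for attr in attributes}
        target[projected["master_id"]] = projected
        loaded += 1
    return loaded


consumer_table = {}
rows_loaded = load_extract(MASTER_EXTRACT, CONSUMER_ATTRIBUTES, consumer_table)
```

The trade-off is visible even at this scale: the image copy needs no mapping logic but drags along attributes the consumer never uses, while the projected load is leaner but adds an ETL step that someone has to own and maintain.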
Maintaining consistency
OK, let’s assume that you have some MDM extracts that are loaded directly into one consumer’s data system, while you also provide a set of services for other users to execute ad hoc requests for master data. Because the extract is a snapshot in time, its data will eventually become inconsistent with the data that's continuously updated in the master data set. That raises further questions: how frequently are extracts updated, and what processes are in place to ensure consistency between an extract and the master? Or, for that matter, across all the different extracts? And more to the point: once we start creating a bunch of distinct (siloed) extracts that are loaded into user databases, aren’t we sort of creating the same problem that we were trying to solve by creating a master data repository?
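One way to see how quickly a snapshot drifts is to compare it against the current master, record by record. The sketch below is only an illustration under stated assumptions: both data sets are represented as dictionaries keyed by a hypothetical master ID, the sample records are invented, and the function names (fingerprint, find_drift) are my own. A real check would run against the actual repository and extract tables.

```python
import hashlib
import json


def fingerprint(record):
    """Return a stable hash of a record's attributes (key order does not matter)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(master, extract):
    """Compare an extract snapshot to the current master, keyed by master ID."""
    master_ids, extract_ids = set(master), set(extract)
    return {
        "missing_from_extract": sorted(master_ids - extract_ids),
        "no_longer_in_master": sorted(extract_ids - master_ids),
        "changed": sorted(
            mid
            for mid in master_ids & extract_ids
            if fingerprint(master[mid]) != fingerprint(extract[mid])
        ),
    }


# Invented sample data: the master has moved on since the extract was taken.
master = {
    "M001": {"name": "Acme Corporation", "country": "US"},
    "M002": {"name": "Globex", "country": "DE"},
    "M003": {"name": "Initech", "country": "US"},
}
extract = {
    "M001": {"name": "Acme Corp", "country": "US"},
    "M002": {"country": "DE", "name": "Globex"},
    "M004": {"name": "Umbrella", "country": "GB"},
}

drift = find_drift(master, extract)
```

Comparing fingerprints rather than full records is a design choice that pays off if a consumer stores only the hashes from its last load. Note, though, that a check like this reports drift for one extract only. It does nothing to reconcile several extracts with one another.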
These may seem like disparate issues, but they are actually all directly related to the absence of best practices and patterns for master data publication and syndication. In upcoming posts I'll look at potential solutions that can address some (or perhaps all) of these issues.