Integrating master services into the application environment

A general paradigm for a master data management solution incorporates three operational components:

  • An identity resolution engine.
  • A master index.
  • A master entity data repository.

Conceptually, the identity resolution engine provides two core capabilities: the creation and management of unique identifiers for uniquely identified entities, and a matching capability that scores the similarity between query data containing identifying information and an existing master index of known entities. In turn, the master index maps unique identifiers to a repository of information associated with known entities. Lastly, the master entity data repository provides an interface for accessing additional information about any unique entity in the system.
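
A minimal sketch of how these three components relate might look like the following (Python, with hypothetical class and field names; the similarity measure is a crude placeholder, not how a production engine would score matches):

```python
from dataclasses import dataclass, field
from uuid import uuid4

@dataclass
class MasterIndex:
    """Maps each unique identifier to the identifying attributes of a known entity."""
    entries: dict = field(default_factory=dict)  # identifier -> identifying attributes

@dataclass
class MasterRepository:
    """Maps each unique identifier to the full entity record."""
    records: dict = field(default_factory=dict)  # identifier -> entity data

class IdentityResolutionEngine:
    """Creates unique identifiers and scores query data against the master index."""
    def __init__(self, index: MasterIndex):
        self.index = index

    def new_identifier(self) -> str:
        return str(uuid4())

    def score(self, query: dict, candidate: dict) -> float:
        # Placeholder similarity: fraction of shared attributes with equal values.
        # A real engine uses tuned, weighted, field-specific matching rules.
        keys = set(query) & set(candidate)
        if not keys:
            return 0.0
        return sum(query[k] == candidate[k] for k in keys) / len(keys)

    def match(self, query: dict) -> list:
        """Return (identifier, score) pairs ordered from most to least similar."""
        scored = [(ident, self.score(query, attrs))
                  for ident, attrs in self.index.entries.items()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)
```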

From a design and development perspective, this sounds relatively straightforward: simple data models accessed and updated by a server that uses the identity resolution engine to search for matches, look up matched records based on their similarity scores, retrieve the entity data from the repository for matched records, or create new records and add new entities to the index.
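
In that spirit, the server-side flow reduces to a match-or-create routine. Here is a hedged sketch against the hypothetical classes above (the 0.9 threshold is purely illustrative; in practice thresholds come out of tuning, often with a gray zone routed to clerical review):

```python
MATCH_THRESHOLD = 0.9  # illustrative only; real thresholds are determined by tuning

def resolve(engine, index, repository, query: dict) -> str:
    """Return the unique identifier for the entity described by `query`,
    creating a new master record when no sufficiently similar match exists."""
    matches = engine.match(query)
    if matches and matches[0][1] >= MATCH_THRESHOLD:
        return matches[0][0]                      # matched an existing entity
    identifier = engine.new_identifier()          # no match: register a new entity
    index.entries[identifier] = dict(query)       # identifying attributes into the index
    repository.records[identifier] = dict(query)  # full record into the repository
    return identifier
```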

However, a peek under the hood reveals complexity that is easy to miss at the start and that often does not become apparent until you begin integrating the master data services with consumer applications. This can be illustrated by posing some questions that the owner of a consumer application might raise during the requirements phase of an integration project, such as:

  • Index instantiation: At what point does the master index get built? This is important to know, since there is a bit of a chicken-and-egg problem: the identity resolution engine relies on the master index to match and manage unique identities, yet the index itself is populated by running records through that same engine (one bootstrap approach is sketched after this list).
  • Identifying data: What information does the identity resolution engine need to uniquely identify entities and generate unique identifiers? Since the engine and the index will rely on a combination of data element values to differentiate and/or match identities, those data elements will have to be analyzed prior to designing the data models and tuning the identity resolution engine.
  • Data sources: Where does the master entity data come from? A more pointed question asks whether there is a qualitative difference in the usability of the data based on the source, as that may impact the tuning of the matching algorithms.
  • Replication: Does the master data system make copies of data from the various data sources? How much of that data is replicated?
  • Synchronization: How frequently is the master index updated? This becomes a particularly relevant question for integrating with operational processes. Unless the master index and the master repository are updated at the same time as the operational environment, there are going to be situations in which entity information added in one environment has not yet propagated to the master environment.
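
On the instantiation question, one common way out of the chicken-and-egg problem is an incremental bootstrap: start with an empty index and run every record from the initial load through the same match-or-create routine, so the index grows as it is being consulted. A sketch, reusing the hypothetical pieces above (real bootstraps also have to deal with load-order sensitivity and survivorship rules, which are glossed over here):

```python
def bootstrap_index(engine, index, repository, source_records):
    """Build the master index incrementally from an initial load.
    Each record is matched against the index built so far, so records seen
    early seed the index that later records are resolved against."""
    for record in source_records:
        resolve(engine, index, repository, record)

# Hypothetical initial load from two sources holding the same customer:
index = MasterIndex()
repository = MasterRepository()
engine = IdentityResolutionEngine(index)
bootstrap_index(engine, index, repository, [
    {"name": "Pat Smith", "dob": "1970-01-01", "zip": "10001"},  # source A
    {"name": "Pat Smith", "dob": "1970-01-01", "zip": "10001"},  # source B
])
assert len(index.entries) == 1  # the duplicate collapsed to one master identity
```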

And although these questions just scratch the surface of the requirements process, they are actually all related to the topic of our last bullet item: synchronization. Understanding the implications of synchronization is key to both articulating consumer application requirements and engineering ways to satisfy those requirements.
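
As a concrete example of the synchronization concern, a simple engineering response is to track last-update timestamps on both sides and flag records whose operational updates have not yet reached the master environment. The sketch below assumes hypothetical per-record timestamp metadata in each environment:

```python
from datetime import datetime

def unsynchronized(operational: dict, master: dict):
    """Yield record keys whose last operational update is missing from,
    or newer than, the corresponding master-side timestamp.
    Both inputs map a record key to its last-update time (assumed metadata)."""
    for key, op_time in operational.items():
        master_time = master.get(key)
        if master_time is None or master_time < op_time:
            yield key

# A customer added operationally but not yet propagated to the master side:
operational = {"cust-42": datetime(2020, 1, 1, 12, 0)}
master = {}
assert list(unsynchronized(operational, master)) == ["cust-42"]
```

Whether such a lag is tolerable, and what the consumer application should do with flagged records, is exactly the kind of requirement the application owner needs to articulate.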

About Author

David Loshin

President, Knowledge Integrity, Inc.

David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management and business intelligence. David is a prolific author regarding data management best practices, via the expert channel at b-eye-network.com and numerous books, white papers, and web seminars. His book, Business Intelligence: The Savvy Manager’s Guide (June 2003), has been hailed as a resource allowing readers to “gain an understanding of business intelligence, business management disciplines, data warehousing and how all of the pieces work together.” His book, Master Data Management, has been endorsed by data management industry leaders, and his valuable MDM insights can be reviewed at mdmbook.com. David is also the author of The Practitioner’s Guide to Data Quality Improvement. He can be reached at loshin@knowledge-integrity.com.
