Alternatives for streamlining access to master data

In the past few posts we examined opportunities for designing a service layer for master data. The last post looked at the interface requirements between the master layer and the applications sitting above it in the technology stack.

Exposing access to master data through a services approach opens new vistas when it comes to master data integration, particularly in terms of the load placed on the system. It also sheds some light on the gaps in the "consolidation approach" to MDM, in which the repository is engineered solely as a target system rather than as both a target and a supplier of data.
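To make the services approach a bit more concrete, here is a minimal sketch in Python of the kind of facade I have in mind; the class and field names (MasterDataService, CustomerMasterRecord) are invented for illustration and are not taken from any particular product.

```python
# Hypothetical sketch of a master data service facade; names and fields
# are illustrative only, not drawn from any specific MDM product.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class CustomerMasterRecord:
    """Canonical customer view exposed by the master data layer."""
    master_id: str
    name: str
    email: Optional[str] = None


class MasterDataService(ABC):
    """Service facade that consuming applications call instead of
    reading or writing the master tables directly."""

    @abstractmethod
    def get_customer(self, master_id: str) -> Optional[CustomerMasterRecord]:
        """Retrieve the current consolidated record for one entity."""

    @abstractmethod
    def update_customer(self, record: CustomerMasterRecord) -> None:
        """Apply a change through the service so the repository remains
        the single point of update."""
```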

In those implementations, the master repository may not be adequately resourced to handle a growing load of consumer applications retrieving or updating the data. Yet slow responses to application requests for data will prove to be the system's undoing unless other alternatives are considered for syndicating master data in a way that provides a current, synchronized view while still delivering access fast enough to satisfy the business applications' expectations.

I have been thinking about this for some time, and two ideas come to mind. The first is replication: making copies of the master repository, publishing those copies to each of the consuming applications, and regularly refreshing them. While at first blush this may sound appealing, there are some obvious reasons to question the soundness of this approach. First, the extracted copies will be difficult to control; the consuming applications may recast the data into different formats and store those copies locally (in internal structures, or by copying the data into application tables), creating a risk of asynchrony and inconsistency. Second, it raises the question of how best to forward updates to the master records, how those updates are then fed back to the copies, and within what time frames.
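As a rough illustration of the replication idea (all names hypothetical), the sketch below periodically snapshots the master repository and pushes a full copy to each subscribing application. Between refreshes each application works from its own copy, which is exactly where the drift can creep in.

```python
# Illustrative sketch of the replication idea: periodically snapshot the
# master repository and push a full copy to each consuming application.
# The repository reader, subscribers, and refresh interval are hypothetical.
import time
from typing import Callable, Dict, List

MasterSnapshot = Dict[str, dict]               # master_id -> attribute map
Subscriber = Callable[[MasterSnapshot], None]  # e.g. load into local app tables


def refresh_loop(read_master: Callable[[], MasterSnapshot],
                 subscribers: List[Subscriber],
                 interval_seconds: int = 3600) -> None:
    """Publish fresh copies on a fixed schedule.

    Between refreshes every subscriber works from its local copy, so the
    copies can drift from the master and from each other; this is the
    asynchrony and inconsistency risk described above.
    """
    while True:
        snapshot = read_master()
        for deliver in subscribers:
            deliver(snapshot)   # each application recasts and stores the copy locally
        time.sleep(interval_seconds)
```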

The second idea attempts to address the problems of the first by leaving the master data in place, in its original location, and using a data virtualization layer that federates access to the different master data repositories while exposing a unified canonical model to the consumers. This approach eliminates the asynchrony and inconsistency problems because all of the applications are essentially looking at a single instance of the master data. Modifications and updates can be transacted through the virtual layer as well, especially if those transactions are serialized "under the hood."
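As a comparable illustration of the virtualization idea, again with every name invented for the purpose, the sketch below shows a facade that federates reads across the underlying repositories into a single canonical view and serializes updates behind the scenes.

```python
# Rough sketch of the virtualization idea: one facade federates reads across
# several underlying repositories and serializes writes so consumers always
# see a single, current instance of the master data. All names are invented.
import threading
from typing import Dict, List, Optional, Protocol


class Repository(Protocol):
    """Minimal contract each underlying master data store is assumed to satisfy."""
    def fetch(self, master_id: str) -> Optional[Dict]: ...
    def owns(self, master_id: str) -> bool: ...
    def to_canonical(self, record: Dict) -> Dict: ...
    def from_canonical(self, changes: Dict) -> Dict: ...
    def apply(self, master_id: str, changes: Dict) -> None: ...


class VirtualMasterLayer:
    """Single access point that federates reads and serializes writes."""

    def __init__(self, repositories: List[Repository]):
        self._repositories = repositories
        self._write_lock = threading.Lock()  # serialize updates "under the hood"

    def get(self, master_id: str) -> Optional[Dict]:
        """Federated read: consult each source and merge into one canonical view."""
        merged: Dict = {}
        for repo in self._repositories:
            record = repo.fetch(master_id)
            if record:
                merged.update(repo.to_canonical(record))
        return merged or None

    def update(self, master_id: str, changes: Dict) -> None:
        """Route a change to the owning repository, one transaction at a time."""
        with self._write_lock:
            owner = next(r for r in self._repositories if r.owns(master_id))
            owner.apply(master_id, owner.from_canonical(changes))
```

The single write lock is just the simplest way to suggest serialized transactions; a real implementation would lean on the transaction management of the underlying repositories.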

Both of these alternatives reflect a different foundational approach to MDM: the replication model adapts the centralized hub approach, while the virtualization model mirrors the "transaction hub" approach. There are benefits and drawbacks to both, but in the next post we will look at a hybrid model that, to some extent, addresses both the performance challenge and the consistency challenge.
