Virtualizing the master data replicas


Last time we discussed two different models for syndicating master data. One model replicates copies of the master data and pushes them out to the consuming applications; the other creates a virtual layer on top of the master data in its repository and funnels all access through a data virtualization framework.

The benefit of the replication model is that it can scale to meet the performance needs of all the downstream consumers, at the risk of introducing asynchrony and inconsistency. The benefit of the virtualization approach is synchronization and consistency, but at the risk of creating a data access bottleneck. Either may be satisfactory for certain types of applications, but neither is optimal for all applications.

There is, however, a hybrid model that blends these two models: selectively replicating the master repository, maintaining a consistent view via change data capture, and enabling federated access via a virtualization layer on top of the replicas. In this approach, the repository can be replicated to one or more high-performance platforms (such as on a Hadoop cluster), with each instance intended to support a limited number of simultaneous client applications.
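The change data capture piece of this hybrid model can be illustrated with a minimal sketch: the master records every update in an ordered change log, and each replica applies only the entries it has not yet seen. The class and method names here are illustrative, not a reference to any particular product.

```python
import itertools

class MasterRepository:
    """Master data store that records every update in an ordered change log."""
    def __init__(self):
        self.records = {}      # key -> current master record
        self.change_log = []   # ordered entries of (sequence, key, value)
        self._seq = itertools.count(1)

    def update(self, key, value):
        self.records[key] = value
        self.change_log.append((next(self._seq), key, value))

class Replica:
    """Replica that catches up by replaying unseen change-log entries."""
    def __init__(self, master):
        self.master = master
        self.records = {}
        self.last_applied = 0  # highest change-log sequence applied so far

    def sync(self):
        # Apply only entries newer than what this replica has seen.
        for seq, key, value in self.master.change_log:
            if seq > self.last_applied:
                self.records[key] = value
                self.last_applied = seq

master = MasterRepository()
replica = Replica(master)
master.update("cust-001", {"name": "Acme Corp"})
master.update("cust-002", {"name": "Globex"})
replica.sync()  # replica now holds both records
```

A real implementation would pull the log incrementally and handle deletes and conflicts, but the "well-defined time window" of consistency comes from exactly this kind of sequenced replay.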

The virtualization layer can manage access to multiple replicas and provide elasticity in balancing the requests to different replicas as the load increases or decreases. Updates can be channeled through the source master environment, as any changes will be forwarded to the replicas within a well-defined time window.
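One way to picture the virtualization layer's role: reads are balanced in rotation across the replicas, while writes are channeled through the master so that change data capture can propagate them. This is a hypothetical sketch under those assumptions; the names and the round-robin policy are for illustration only.

```python
import itertools

class Master:
    """Stand-in for the source master environment."""
    def __init__(self):
        self.records = {}

    def update(self, key, value):
        self.records[key] = value  # CDC would forward this to the replicas

class VirtualizationLayer:
    """Routes reads round-robin across replicas; channels writes to the master."""
    def __init__(self, master, replicas):
        self.master = master
        self._rotation = itertools.cycle(replicas)

    def read(self, key):
        # Each read goes to the next replica in rotation, spreading the load.
        return next(self._rotation).get(key)

    def write(self, key, value):
        # All updates go through the master, never directly to a replica.
        self.master.update(key, value)

master = Master()
replicas = [{"cust-001": "Acme"}, {"cust-001": "Acme"}]  # pre-synchronized copies
layer = VirtualizationLayer(master, replicas)
layer.read("cust-001")              # served by the first replica
layer.write("cust-001", "Acme Corp")  # channeled through the master
```

Elasticity then amounts to adding or removing replicas from the rotation as load changes, without the client applications ever knowing how many copies exist.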

Sounds good. The next step? Making sure all the pieces work together. Do the data virtualization tools provide seamless access to Hadoop-based systems? And how easily can the replication be managed in a controlled manner? These are among the next questions to ask when considering master data integration at the enterprise level.


About Author

David Loshin

President, Knowledge Integrity, Inc.

David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management and business intelligence. David is a prolific author on data management best practices, writing via the expert channel at b-eye-network.com and in numerous books, white papers, and web seminars. His book, Business Intelligence: The Savvy Manager's Guide (June 2003), has been hailed as a resource allowing readers to "gain an understanding of business intelligence, business management disciplines, data warehousing and how all of the pieces work together." His book, Master Data Management, has been endorsed by data management industry leaders, and his valuable MDM insights can be reviewed at mdmbook.com. David is also the author of The Practitioner's Guide to Data Quality Improvement. He can be reached at loshin@knowledge-integrity.com.
