How is MDM different from data warehousing?

On almost all of my master data management (MDM) consulting engagements, someone on the client team asks how MDM is different from data warehousing. The question is both understandable and important.

It’s usually not that people are confused about MDM’s focus on master data, as opposed to reference data or transaction data. But just for clarification, master data represents key business entities such as assets, locations, products and parties. And most often when party master data is discussed, the focus is on one of its roles – such as customer, supplier or employee.

Here’s where the confusion often ensues. The names of party roles often correspond with the names of data warehouse tables – where customer, supplier and employee are separate dimension tables or base tables related to fact tables or transaction tables. Herein lies the basis of the question: Why does MDM need to model and manage this data when the data warehouse already does that?

One key difference is where the source data comes from. Data warehouse sources are often operational or transactional systems. In these types of systems, the master data comes along for the ride when an event or transaction occurs, such as a change in product inventory levels or a customer making a purchase. MDM often incorporates all possible master data sources, including not only data associated with or generated by internal systems, but also external data.

Another major difference between MDM and data warehousing is that MDM focuses on providing the enterprise with a single, unified and consistent view of these key business entities by creating and maintaining their best data representations. While a data warehouse often maintains a full history of the changes to these entities, its current view represents the last update. Plus, each data warehouse update is applied to the current view without a re-assessment of how previous updates might change the best representation.
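
To make the contrast concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than how any particular product works: the update records, the source names, and the `SOURCE_TRUST` scores are hypothetical. It compares a "last update wins" current view with a view that re-assesses every contribution before choosing the best value.

```python
# Hypothetical updates to one customer's address, in arrival order.
updates = [
    {"source": "crm", "field": "address",
     "value": "12 Main Street, Springfield", "received": 1},
    {"source": "web_form", "field": "address",
     "value": "12 main st", "received": 2},
]

# Assumed trust scores per source; a real program would configure these.
SOURCE_TRUST = {"crm": 0.9, "web_form": 0.4}


def last_update_wins(updates):
    """Warehouse-style current view: each update overwrites the previous one."""
    current = {}
    for update in sorted(updates, key=lambda u: u["received"]):
        current[update["field"]] = update["value"]
    return current


def best_representation(updates):
    """MDM-style view: weigh all contributions, preferring trusted sources
    and using recency only to break ties."""
    best = {}
    for update in updates:
        rank = (SOURCE_TRUST[update["source"]], update["received"])
        incumbent = best.get(update["field"])
        if incumbent is None or rank > incumbent[0]:
            best[update["field"]] = (rank, update["value"])
    return {field: value for field, (_, value) in best.items()}


print(last_update_wins(updates))     # the later web form entry is kept
print(best_representation(updates))  # the more trusted CRM entry is kept
```

Ranking by source trust is only one possible rule. The point is that the second function looks at every contribution again, while the first only ever sees the latest one.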

Matching and consolidating related records doesn’t typically occur in data warehousing. MDM, on the other hand, standardizes, matches and consolidates common data elements across all possible data sources for a subject area to iteratively refresh and maintain its best master record.
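
The standardize, match and consolidate cycle can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any MDM tool: the sample records are made up, matching is assumed to be an exact comparison of standardized email addresses, and the survivorship rule (keep the most recently updated non-empty value) is one choice among many.

```python
from collections import defaultdict

# Hypothetical customer records from three source systems.
records = [
    {"source": "crm", "name": "Acme Corp.", "email": "Sales@Acme.com",
     "phone": None, "updated": "2023-01-10"},
    {"source": "billing", "name": "ACME Corporation", "email": "sales@acme.com ",
     "phone": "555-0100", "updated": "2023-03-02"},
    {"source": "web", "name": "Globex", "email": "info@globex.com",
     "phone": None, "updated": "2023-02-15"},
]


def standardize(record):
    """Normalize the field used for matching (trim and lowercase the email)."""
    clean = dict(record)
    clean["email"] = record["email"].strip().lower()
    return clean


def match_key(record):
    """Assumed matching rule: the same standardized email means the same party."""
    return record["email"]


def consolidate(group):
    """Assumed survivorship rule: per attribute, keep the most recently
    updated non-empty value, and remember which sources contributed."""
    newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
    master = {}
    for field in ("name", "email", "phone"):
        master[field] = next((r[field] for r in newest_first if r[field]), None)
    master["sources"] = sorted(r["source"] for r in group)
    return master


def build_master(records):
    """Standardize every record, group the matches, consolidate each group."""
    groups = defaultdict(list)
    for record in map(standardize, records):
        groups[match_key(record)].append(record)
    return [consolidate(group) for group in groups.values()]


masters = build_master(records)
for master in masters:
    print(master)
```

Re-running `build_master` whenever a source delivers new records is what makes the process iterative: the master record is rebuilt from all contributions rather than patched with the latest one. Real matching is usually fuzzier than an exact key comparison.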

“The motivation for MDM,” Evan Levy explained, “is to provide access to a subject area master record along with the details of the contributing sources.” Unlike a data warehouse, which provides a central repository of enterprise data (and not just master data), MDM provides a single centralized location for metadata content. This enables developers and business users to understand the origins, definitions, meanings and rules associated with master data.

About Author

Jim Harris

Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ)

Jim Harris is a recognized data quality thought leader with 25 years of enterprise data management industry experience. Jim is an independent consultant, speaker, and freelance writer. Jim is the Blogger-in-Chief at Obsessive-Compulsive Data Quality, an independent blog offering a vendor-neutral perspective on data quality and its related disciplines, including data governance, master data management, and business intelligence.
