Is big data just a source? Or is Hadoop ready for MDM?


Big data, over the last few years, has evolved very quickly to meet our enterprise needs. We started with a place to store near-real-time data for consumption by the enterprise or other applications. This data was in its rawest form, and conforming took place in a front-end type tool that worked within our big data platform.

We thought, gosh, what a great (and cheaper) place to persist our data over time. But there were very few governance and data management principles applied to this data store, and certainly not the kind that we were used to dealing with in the enterprise. That, in itself, made some of us very leery (and nervous) about using this new technology.

That said, we included Hadoop/big data in our solutions where it made sense, meaning as a data store to land/stage data for the data warehouse or for pre-analytics. We also considered it for the data consumption layer, knowing that the business rules had to be recreated for each consumer via the query/reporting tools. We also understood that some very involved management and use of data would need to take place. We used Hadoop as a platform to profile our data prior to loading it into a data warehouse or operational system, and it worked very nicely.
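To make the profiling step concrete, here is a minimal sketch in plain Python of the kind of column profile you might run against staged data before loading it. The rows, column names and the `profile` function are all hypothetical, and on a real Hadoop platform this work would run in a profiling tool or a distributed job rather than in a single process.

```python
# Hypothetical staged rows, as they might land before any conforming is done.
staged_rows = [
    {"customer_id": 1, "state": "CO", "phone": "303-555-0100"},
    {"customer_id": 2, "state": "co", "phone": ""},
    {"customer_id": 3, "state": None, "phone": "303-555-0100"},
]

def profile(rows):
    """Return row count, null/blank count and distinct count per column."""
    columns = {}
    for row in rows:
        for col, value in row.items():
            stats = columns.setdefault(
                col, {"count": 0, "nulls": 0, "distinct": set()}
            )
            stats["count"] += 1
            if value is None or value == "":
                stats["nulls"] += 1  # treat blanks as missing (an assumption)
            else:
                stats["distinct"].add(value)
    return {
        col: {
            "count": s["count"],
            "nulls": s["nulls"],
            "distinct": len(s["distinct"]),
        }
        for col, s in columns.items()
    }

report = profile(staged_rows)
```

A report like this surfaces issues such as the mixed-case state codes and the blank phone number before they reach the data warehouse.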

Then our implementation teams found a few other things that made us nervous, like the fact that you can’t update a record. My immediate reaction was “what?”. After settling down from my initial shock, I decided I was good with that, for now. Our team could just change the design to add an effective date to each record and then select the most current one (i.e., the record with the MAX Effective_Date). At this point, we may have been asking for a characteristic that was difficult to achieve. (But with the way the technology is maturing, it may be available before this blog ever gets posted!)
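As a rough illustration of that workaround, the sketch below treats the store as append-only: a change is written as a new row, and the current version of each record is the one with the latest effective date. The field names and sample rows are assumptions made for the example, not anything from our actual design.

```python
from datetime import date

# Hypothetical append-only rows: an "update" is a new row with a later date.
records = [
    {"customer_id": 1, "city": "Denver", "effective_date": date(2014, 1, 5)},
    {"customer_id": 1, "city": "Boulder", "effective_date": date(2015, 3, 20)},
    {"customer_id": 2, "city": "Austin", "effective_date": date(2014, 7, 1)},
]

def current_records(rows, key="customer_id", date_field="effective_date"):
    """Keep, for each key, the row with the maximum effective date."""
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[date_field] > latest[k][date_field]:
            latest[k] = row
    return latest

current = current_records(records)
```

Here customer 1 resolves to the Boulder row, because it carries the later effective date. The trade-off is that every consumer has to apply this rule when reading, which is part of why the lack of updates made us nervous.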

So far, this technology is progressing very fast. Just about every vendor has Hadoop/big data solutions. But is it ready for MDM?

As we think about that question, consider that MDM has some of these characteristics:

  • The ability to understand and use the authoring systems of master data.
    • If you have three customer systems, you may want different data elements from each system.
    • Merging and de-duplication are pretty standard. So we need a place to do the deduping and a place to land the golden records.
  • The ability to validate and monitor the master data (ongoing and at intervals).
  • The ability to store the data or the metadata to create a complete master data record.
  • Ease of use (Ask: Does it meet your business needs?).
  • Flexibility of change (Ask: Does it meet your future business needs?).
  • An easy way to call the master data or receive the master data for enterprise processes/systems.
  • Security (Ask: Is there enough security to protect our data?). This requirement is coming along nicely, but still makes me nervous with some corporate data (e.g., financials).
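To show what the merging and de-duplication characteristic involves, here is a minimal sketch of building golden records from three customer systems, taking different data elements from each. Everything in it is an assumption made for illustration: the system names, the survivorship priorities, and the choice of a normalized email address as the match key. Real MDM matching is usually far more involved.

```python
# Hypothetical survivorship rules: which system is trusted first per field.
FIELD_PRIORITY = {
    "name": ["crm", "billing", "web"],
    "email": ["web", "crm", "billing"],
    "address": ["billing", "crm", "web"],
}

def match_key(record):
    """Assumed match rule: same email, ignoring case and surrounding spaces."""
    return record["email"].strip().lower()

def build_golden_records(records):
    """Group duplicates by match key, then pick each field by priority."""
    groups = {}
    for rec in records:
        groups.setdefault(match_key(rec), []).append(rec)

    golden = {}
    for key, dupes in groups.items():
        by_system = {r["system"]: r for r in dupes}
        merged = {}
        for field, systems in FIELD_PRIORITY.items():
            # First non-empty value from the most trusted available system.
            merged[field] = next(
                (by_system[s][field] for s in systems
                 if s in by_system and by_system[s].get(field)),
                None,
            )
        golden[key] = merged
    return golden

source_records = [
    {"system": "crm", "name": "Ann Lee", "email": "ann@example.com", "address": ""},
    {"system": "billing", "name": "A. Lee", "email": "ANN@example.com ", "address": "12 Main St"},
    {"system": "web", "name": "", "email": "ann@example.com", "address": None},
]

golden = build_golden_records(source_records)
```

The three source rows collapse into one golden record that takes the name from the CRM system, the email from the web system and the address from billing. The platform question is where this deduping runs and where the resulting golden records land.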

As the technology changes and incorporates more of those data governance and other data disciplines that have made our organizations successful, my opinion will change too. I believe that Hadoop/big data is ready to start taking on MDM and other enterprise-critical applications.

I guess we have to wait and see how things unfold from here.


About Author

Joyce Norris-Montanari

President of DBTech Solutions, Inc

Joyce Norris-Montanari, CBIP-CDMP, is president of DBTech Solutions, Inc. Joyce advises clients on all aspects of architectural integration, business intelligence and data management, as well as on technology, including tools for ETL, profiling, databases, data quality and metadata. Joyce speaks frequently at data warehouse conferences and is a contributor to several trade publications. She co-authored Data Warehousing and E-Business (Wiley & Sons) with William H. Inmon and others. Joyce has managed and implemented data integrations, data warehouses and operational data stores in industries like education, pharmaceutical, restaurants, telecommunications, government, health care, financial, oil and gas, insurance, research and development and retail.
