In search of the Holy Grail: Sorry Hadoop fans, only MDM delivers single view of customer


Single view of customer. It's a noble goal, not unlike the search for the Holy Grail – fraught with peril as you progress down the path of your data journey. If you're a hotelier, it can improve your customer's experience by providing the information from the casinos and the spa at check-in to better meet your customer's needs. If you're sending out marketing fliers, it can reduce mailing costs by providing a clean list of customer addresses. If you're a retailer, and a customer buys that Monty Python and the Holy Grail DVD, it can increase revenue by recommending additional related products like a "None Shall Pass" t-shirt. And, if you are a federal or state agency, it can help you meet compliance or regulatory requirements for reporting.

Master data management (MDM) solutions, like SAS MDM, are designed to provide that single, consistent view of the customer across multiple sources of data. But with many other technologies claiming the ability to provide a single view, is MDM still the best approach available today?

Master data management provides a single, consistent view of the customer.

I first learned about MDM about nine years ago while interviewing for a job with a Big 6 enterprise software firm. My old drummer was a hiring manager, and he thought I’d be a good fit for a spot on his team as a sales engineer. (See mom, jamming on that guitar for years did pay off.) Wikipedia was my best friend as I ramped up quickly on terms like MDM, customer data integration (CDI), product information management (PIM) and enterprise service bus (ESB). Amazingly, some combination of my computer engineering degree, my buddy and my enthusiasm got me the job.

I spent the next six years becoming an expert in the MDM space. In this world, “single view of the truth,” survivorship rules, and the differences between systems of record and sources of record were often religiously debated. Along the way, I converted a lot of CSV files into XML to load into the MDM hub, and I even wrote a few tools in VB.Net to do so.

Back then, we often heard: “I already have a customer relationship management (CRM) solution or a data warehouse (DW); why do I need MDM?” I would reply, “If all your customer data is in one place, you may not need MDM.” But there were many times when MDM was deployed to fix the issue of having multiple departmental CRM, ERP or DW solutions that needed a common set of attributes stored at a cross-enterprise level. That was the only way to obtain what's known as the best record, golden record or single view of the truth.

Having a fragmented view of data silos is a common customer pain point, and the customer domain itself is the most commonly mastered domain in the MDM space. "Mastered" means that some process was put in place to create a set of cross-organizational attributes that are defined as the canonical “best record” for that specific de-duplicated customer. For the customer domain, the master data record or "customer profile" might include name, address, birthday, gender, customer lifetime value, Twitter account, products sold and other attributes.
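To make the "best record" idea concrete, here is a minimal Python sketch of one possible survivorship rule: for each attribute, the most recently updated non-empty value wins. The record layout, the source names and the function build_best_record are all hypothetical, invented for illustration; real MDM products typically let you configure far richer rules per attribute (source priority, completeness, trust scores and so on).

```python
from datetime import date

# Hypothetical records for one already-matched customer, one per source system.
records = [
    {"source": "crm", "updated": date(2015, 3, 1),
     "name": "Pat Smith", "address": "1 Main St", "twitter": None},
    {"source": "erp", "updated": date(2016, 6, 9),
     "name": "Patricia Smith", "address": None, "twitter": None},
    {"source": "web", "updated": date(2014, 1, 5),
     "name": "P. Smith", "address": "9 Old Rd", "twitter": "@psmith"},
]


def build_best_record(records, attributes):
    """Assumed survivorship rule: the newest non-empty value survives."""
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    best = {}
    for attr in attributes:
        # Walk from newest to oldest and keep the first populated value.
        best[attr] = next((r[attr] for r in newest_first if r.get(attr)), None)
    return best


golden = build_best_record(records, ["name", "address", "twitter"])
print(golden)
```

Notice that the surviving profile is assembled attribute by attribute, so no single source system has to hold the complete customer.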

In the last three years, I’ve been hearing the terms “single view of customer” or “360-degree view of customer” described as a solution that many other technologies can deliver – including customer intelligence, Hadoop data lakes and data virtualization. Sometimes, to draw from the opening scene of Monty Python and the Holy Grail, these are just two coconuts banging together, and they are not really horses. I'm not saying that these solutions don't provide incredible customer benefits aside from the single view. But all of these solutions or platforms still require some MDM component – whether it's called "Member 360" or "Single View of Citizen" – that brokers and synchronizes data across multiple sources.

Three approaches that fall short of providing the "single view" without MDM

Customer intelligence solutions. CI solutions can provide fantastic insights and optimization about how to best reach your customer and create and execute an omnichannel marketing strategy. But, under the hood, you have to get the data right. That requires data quality entity resolution and master data management survivorship rules to generate a consistent, accurate and de-duplicated list of customer data. Ad hoc entity resolution can be used to create such a master list of customers. Or a more systematic MDM solution can be put in place to manage that data and create the "single view".
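As a rough illustration of what ad hoc entity resolution can look like, the Python sketch below groups records that share a normalized name and postal code. The matching rule, the field names and the helper functions are assumptions made up for this example; production matching usually adds fuzzy comparison, nicknames, address standardization and more.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase and strip everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def match_key(record):
    # Assumed rule: same normalized name + postal code means same customer.
    return (normalize(record["name"]), normalize(record["postal_code"]))


def resolve_entities(records):
    """Group record ids into clusters that share a match key."""
    clusters = defaultdict(list)
    for record in records:
        clusters[match_key(record)].append(record["id"])
    return list(clusters.values())


# Hypothetical mailing list with one duplicated customer.
mailing_list = [
    {"id": 1, "name": "Matt O'Neil", "postal_code": "27513"},
    {"id": 2, "name": "MATT ONEIL", "postal_code": "27513 "},
    {"id": 3, "name": "Ann Lee", "postal_code": "28403"},
]

clusters = resolve_entities(mailing_list)
print(clusters)
```

A deterministic key like this is cheap but brittle: "Matthew O'Neil" would land in a separate cluster, which is one reason a systematic MDM approach tends to pay off as the data grows.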

Data virtualization. As with our own SAS Federation Server, data virtualization is a fantastic technology that speeds corporate agility and helps users create a blended, secure and abstracted view of data across multiple sources. But it's dependent on the source with the greatest latency, it fails to correct data in the source systems and it might be read-only for very complicated queries that blend multiple data sources.

Hadoop data lakes. Apache Hadoop is the most recent open source technology with the promise to put all of your data in one place and deliver a single view of customer. Its adoption is driven partly by lower storage costs and partly by the fact that Hadoop allows huge volumes of data to be ingested into one repository without requiring specific schemas or structure to be defined beforehand. Can you get all your data into Hadoop? With the right size cluster (think dozens of cheap computers chained together), yes. But, at least today, Hadoop lacks the maturity required in the data quality, master data management and transactional integrity areas to provide a single view.

Many vendors are creating entity resolution, de-duplication, and big data match capabilities that run on Hadoop. We even have our own built into SAS Data Loader for Hadoop that runs in-memory using Apache Spark. And, while Apache Hadoop may have SQL interfaces like HAWQ or Impala, it was just not designed to optimize transactional and operational access to customer data the way that relational database management systems like Oracle, IBM DB2 or Microsoft SQL Server were. It can be a great source of structured and unstructured data, as one of the source systems whose data is blended together and run through survivorship rules to create a "best record" stored in the MDM hub. But it's just not where you want to put the MDM hub. At least not yet.

MDM: Still the way

So, while many technologies may claim to provide a single view of the customer or a 360-degree view of the customer, there really is only one that can deliver the golden record or single view of truth. And that technology is MDM. This is true whether it's homegrown using data quality and data integration solutions, or bought off the shelf and customized with lots of services dollars. The entity resolution, matching and survivorship rules provided with MDM solutions are essential to providing a single view of customer, product, location or "X." At least for the time being, they require a relational database engine underpinning as "the hub." This type of hub provides the transactional integrity and speed to integrate in real time with other operational systems, and to reduce the number of duplicates and the amount of dirty data across your landscape.

While we may never actually find the Holy Grail, with the right solutions in place we can at least enjoy the journey.

To learn more about MDM, download the TDWI Checklist Report: Seven Tips for Unified Master Data Management


About Author

Matthew Magne

Principal Product Marketing Manager

@bigdatamagnet - Matthew is a TEDx speaker, musician, and Catan player. He is currently the Global Product Marketing Manager for SAS Data Management focusing on Big Data, Master Data Management, Data Quality, Data Integration and Data Governance. Previously, Matthew was an Information Management Solutions Architect at SAS, worked as a Certified Data Management Consulting IT Professional at IBM, and is a recovering software engineer and entrepreneur. Mr. Magne received his BS, cum laude, in Computer Engineering at Boston University, has done graduate work in Object Oriented Development and completed his MBA at UNCW.
