Everything is "big data" these days – but where is the data management?


When I first began working in the data management industry in 2003, I interviewed a VP of IT for a Fortune 500 manufacturer about their data quality and data integration initiatives. The executive was excited to talk about their rather novel approach to a massive ERP system integration. She mentioned that instead of just dumping data into a new ERP system, they were taking 18 months to investigate the quality of their information, write some rules to guide the management of data over time and monitor those rules within the ERP system.

That doesn't seem very groundbreaking, does it? They simply wanted to make sure they didn't saddle end users and analysts with unreliable and inaccurate data. Yet, at the time, it was uncommon for IT and business to agree to delay an implementation to address data issues. Data management was too often an afterthought, but this executive – and other key stakeholders within the organization – realized that the success or failure of that ERP project depended on the data within the system.

Today, if you read IT publications and websites, the concept of a "simple" ERP implementation seems quaint. Everything is "big data" these days, with information flooding the enterprise from social channels, trading partners, radio tags, etc. As Jill Dyche pointed out in a recent Information Management article, "Big Data's Three-Legged Stool," the amount and complexity of data may have changed, but data is still data. I'll let Jill take it from here:

Acquiring specialized technology and maturing analytics behaviors aren’t easy. But what people don’t know — at least, not yet — is that the hard part of big data is managing it. The challenges of identifying and sourcing the data, applying data correction rules, circumscribing usage, access, and storage policies, and provisioning the data to other platforms and applications requires its own set of rigor. Regulatory requirements mandate that your bank mask Social Security numbers before availing half a billion credit card transactions to hungry data scientists hoping to fortify themselves on fraud indicators. Simply applying a new file system and some statistics to the problem without first applying business rules to the data can mean large fines and, maybe worse, additional regulatory scrutiny.

Sounds very similar to my conversation 10 years ago. This time, instead of looking at enterprise applications, we're talking about petabytes of data from a host of sources, held in more powerful appliances designed to store mind-bending amounts of data. But will a big data platform do any better than an ERP or CRM system filled with questionable data? Are we creating a bigger mess here? Jill?

The promise of big data analytics is as expansive as our imaginations. But I’ve also seen the garbage-in, garbage-out phenomenon writ large on the balance sheets of naïve executive teams. Solid data governance and data management processes can mean the difference between new legacy technologies and innovative business actions.

Yes, just like in 2003 (and in 1993 and 1983), data only has value when it means something. Big data is a great thing, but the phenomenon will also need all of the data quality, data integration and data governance skills that we can muster.
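To make the masking example from Jill's article concrete, here is a minimal Python sketch of applying one business rule before data is handed to analysts. The record layout, the field names (`ssn`, `memo`) and the helper names are all hypothetical, and the pattern only covers Social Security numbers written in the dashed 123-45-6789 form. A real governance process would involve far more than this.

```python
import re

# Assumption: SSNs appear in the dashed form 123-45-6789.
# The capture group keeps the last four digits for reference.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def mask_ssn(text: str) -> str:
    """Mask the first five digits of every dashed SSN found in text."""
    return SSN_PATTERN.sub(r"XXX-XX-\1", text)


def mask_records(records, fields=("ssn", "memo")):
    """Return masked copies of the records; the originals are left untouched."""
    masked = []
    for record in records:
        clean = dict(record)  # shallow copy so the source data is not modified
        for field in fields:
            if isinstance(clean.get(field), str):
                clean[field] = mask_ssn(clean[field])
        masked.append(clean)
    return masked


# Hypothetical sample transactions, invented for illustration only.
transactions = [
    {"id": 1, "ssn": "123-45-6789", "amount": 42.50, "memo": "card ending 1111"},
    {"id": 2, "ssn": "987-65-4321", "amount": 19.99, "memo": "SSN 987-65-4321 on file"},
]

safe = mask_records(transactions)
```

The point of the sketch is the ordering: the rule runs before the data reaches anyone doing analysis, rather than being bolted on afterward.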

Soon, SAS will release the findings from a big data survey. The survey shows that while big data awareness is high, preparedness and strategies lag behind. The same happened with enterprise applications a decade ago, and many organizations paid dearly for going too fast – or moving too slowly. Big data will be no different.


About Author

Daniel Teachey

Managing Editor, SAS Technologies

Daniel is a member of the SAS External Communications team, and in his current role, he works closely with global marketing groups to generate content about data management, analytics and cloud computing. Prior to this, he managed marketing efforts for DataFlux, helping the company go from a niche data quality software provider to a world leader in data management solutions.
