Data stewardship in a big data world

Earlier, I wrote about the Hadoop-centric view of big data that was evident in some recent research conducted by TDWI. If you haven't read the report yet, make sure you download "Managing Big Data," which gives a fantastic overview of "big data management" and how it's impacting organizations.

Like any good piece of research, there were many takeaways. After going back to the report, I found some other interesting nuggets, particularly as we count down to Data Stewards Day (don't forget to submit your nomination by Nov. 4!). It got me thinking: what is the role of data stewardship in the big data world? And how will the data steward role change and evolve?

Data stewardship came up several times in the TDWI report, as those involved with managing big data are toying with the idea of adding big data initiatives to their existing data governance ecosystem. In fact, when asked what problems are hindering big data management, a lack of stewardship or governance was cited by 33 percent of participants – second only to inadequate staffing or skills.

What struck me, as it has throughout the big data "craze," is how familiar much of this sounds. My first boss in the technology industry once told me: "There are no new trends. Just retreads of old trends." His point was that many of the things we're seeing now as "hot, new trends" are similar to what's happened in the past.

In fact, I think that the proliferation of big data can lead to an even messier data explosion than we saw with the boom of enterprise applications in the late 90s. Remember when every company had individual (and often multiple) CRM, ERP or other business systems? These efforts created silos of data, which oddly enough led to even bigger CRM or ERP implementations. In many ways, this phenomenon (at least partially) led to the need for data quality, data governance and data stewardship in the IT world. It was a time of enormous data growth, and we had to figure out a way to manage the information.

Fast forward to 2013. Organizations are starting a variety of big data projects, pulling in social media data, exploring machine learning, and generally aggregating more data than they ever have before. In the TDWI report, 30 percent of companies said they had no strategy for managing big data, but they needed one. Another 20 percent said that they are deploying new technologies to manage big data. Big data work is happening. People are buying tools. But it may not be happening in a strategic way.

Where does that leave a data steward? Best case: the data steward will get a new role in "big data management," fighting side-by-side with the data scientist to make sense of a newer, more complex and bigger set of data. That's a possibility, and for data stewards who like a challenge, I can think of no better fight.

While it may make sense to apply data governance principles to big data sets, there is an opposing view. In a recent SearchDataManagement article, some industry watchers suggested that "the nature of big data applications doesn't lend itself to heavy doses of governance and data stewardship in the first place." The rationale is that part of the power of big data is its raw "bigness." By introducing data quality or data management principles, you remove the purest form of the data, which might be what the data scientist really wants.

This may be true, especially in the initial phases of big data adoption, where the focus is on testing and exploring. The TDWI report echoed this. When asked what data management disciplines or teams are involved in big data, respondents gave top four answers that all focused on the aggregation and reporting of data – BI/data warehousing, data integration, database administration and enterprise data architecture. Data quality and data governance were just below that, with only 33 and 30 percent "strongly involved," respectively.

Jill Dyche, who heads up the SAS Best Practices team, said in the SearchDataManagement article that "a data steward might play more of a SWAT team role." Here, the thinking is that people undertaking big data projects may have a defined need for data management work, and a data steward would have the background to fit that purpose.

That leaves data stewards in a familiar place: with a quasi-defined role in the big data world. But when you consider that just 10 years ago data stewardship wasn't even a job role, a little uncertainty shouldn't faze you. The role will continue to evolve and mature. As big data becomes part of the traditional IT infrastructure, you will likely see a new phase of data stewardship.

And perhaps in 2018, we'll be celebrating "Big Data Stewards Day."

 

About Author

Daniel Teachey

Managing Editor, SAS Technologies

Daniel is a member of the SAS External Communications team, and in his current role, he works closely with global marketing groups to generate content about data management, analytics and cloud computing. Prior to this, he managed marketing efforts for DataFlux, helping the company go from a niche data quality software provider to a world leader in data management solutions.
