There almost seems to be a perception that the value of big data is only truly realized when we process all of it. The way we talk about big data, while generally optimistic, has an almost ominous feel behind it. As though, if we fail to tackle all of big data in the next 12 months, Rumpelstiltskin will come to take away our firstborn.
I would contend that anyone who says they can "tackle and resolve" their big data within 12 months does not have true big data. They may have a lot of data, and they may not have enough resources to consume it, but they don't face a challenge on the scale of big data. At the same time, I would say that even those with the biggest of big data can start deriving meaningful value from that data quickly.
Successful delivery on big data requires meeting two general objectives. The first (and usually the most focused upon) is the enhanced software and hardware infrastructure that can be used to churn through the entire universe of data. This is often an expensive and potentially time-consuming task. If it weren't, you probably wouldn't give the process a second thought; you'd just go along your merry way.
The second aspect required (and one that can be potentially more readily tackled) is a focus on learning and understanding the constituent data. Regardless of the source of your big data, it can all be thought of as the amalgamation of many pieces of “small data.” These individual sources may be nuanced and complex, but when taken on one at a time, the challenge they pose is greatly diminished.
My first job out of school involved the analysis of a large national survey. The volume of data would be almost laughably small compared to sources we look at now, and yet the team dedicated to analyzing the data was larger than the entire analytics group at many companies I've talked to in the health care industry. Even with this pool of dedicated, highly trained and focused individuals, I don't think any of us would say that we had mined the data to exhaustion. There was constantly more value we could extract, and likely there always will be.
Certainly we hit points of diminishing return, where truly insightful discoveries in the data became few and far between. And I know for a fact that my colleagues and I would have loved the opportunity to bring this data together with other sources and see what we could find out – but we didn't have that luxury. That said, I can't remember a week going by without someone offering a new idea, method or approach to the data that brought out a little something new.
The point is that there is still value remaining in the data you can access and evaluate now. I don't mean to downplay what a big data-enabled platform can deliver, but rather to remind us all that even small, focused, incremental growth in understanding and utilizing the smaller building blocks of data will not only prepare us for reaping the long-term value of the big data conglomeration, but will likely provide meaningful insight along the way.
In Part 3 of this series, I’ll narrow down how to define big data, which is critical to understanding how best to utilize it.