In his recent post Measuring is Intrinsically Fuzzy, my friend Jim Harris writes:
the quality of data is not fixed and is subject to numerous variations despite a high-quality information management process.
We often look for errors in the process in order to improve quality, but we rarely look for errors in the way we measure quality. A fundamental flaw in our methodology would be to not acknowledge that measuring is intrinsically fuzzy.
You'll get no argument from me. In fact, I'd argue that fuzziness isn't going away anytime soon.
Yes, we have entered the era of Big Data. In fact, it's in full swing and we'll soon be able to quantify just about anything. Wearable technology, the quantified self movement and the Internet of Things collectively mean that we'll generate more and more data - and, importantly, so will machines. But don't think for a moment that Big Data will end uncertainty as we know it. There are two fundamental reasons why it won't.
Two Big Limitations of Big Data
First, Big Data only reduces uncertainty; it does not eliminate it. We might be able to explain more of what's happening. That's hardly the same as explaining all of it, much less why.
Which brings me to the second big limitation of Big Data. Big Data means that there's even more potential for statistical abuse. But don't listen to me. Former NBA head coach Stan Van Gundy recently spoke at the MIT Sports Analytics Conference and "sounded off on being fed advanced stats without proper context." (He was no fan.)
Van Gundy is absolutely right. There's no question that analytics can move the needle in any field. (Billy Beane and Bill James might have been the first to introduce the game of baseball to different measures, but today they have plenty of company in the world of sports.) At the same time, though, metrics for the sake of metrics are of questionable value. And let's not forget the ability of some to abuse or honestly misinterpret statistics.
Simon Says
As Ronald Coase once said, "If you torture the data long enough, it will confess." Remember those words when presented with people who claim that data - even the big kind - is perfect, omniscient and clairvoyant.
Feedback
What say you?