A panel discussion at the recent International Data Quality Summit opened with a seemingly straightforward request from the moderator: that the panelists begin by defining data quality. The resulting debate was blogged about by Ronald Damhof, who was one of the panelists.
On one side of the debate was the ISO 8000 definition of data quality, which posits that “quality data is data that meets stated requirements.” Damhof disagreed and offered an alternative definition that I will slightly paraphrase as “quality data is data that has value to some person at some time.”
Damhof’s point was that not only is quality relative, but it varies over time. “Something that is perceived as high quality by someone now,” Damhof explained, “can be perceived as being of low quality a year later. So quality is never in a fixed state, it is always moving, fluently through time.” Furthermore, Damhof argued that it is possible to “meet stated requirements (voiced by person X at time Y) but still deliver a crappy quality.” On that point, I’ll use what I love as much as data—coffee (after all, data is the new coffee)—to explain why I agree with Damhof.
I require coffee as soon as I wake up in the morning. This has been a requirement of mine for over 25 years.
I can get quite cranky when this requirement is not met. Case in point, this past Sunday morning I awoke to discover that my coffee maker was not working. My ensuing tantrum may have led my neighbors to believe that I had discovered the dead body of a loved one on the kitchen floor. In my coffee-deprived delirium, I stumbled like an undead zombie to the nearest open store to purchase a new coffee maker. After getting it home and brewing some fresh coffee, I drank several cups while my reanimated corpse did a happy dance.
It is important to note that, so far, meeting my requirement for morning coffee has said nothing about the coffee’s quality.
To Damhof’s first point, my coffee quality standards have changed over time. I grew up drinking drip-brewed percolated coffee from pre-ground beans. In my early twenties, I became one of those coffee snobs who insisted on grinding the beans myself and steeping them in a coffee press. In my late twenties and early thirties, I extended my coffee snobbery to drinking mostly espressos and lattes. In my late thirties, I surprisingly started slumming it with instant coffee. Now in my forties, I prefer coffee makers that use those single-serving coffee pods.
It’s Damhof’s other point I find most intriguing—you can meet stated requirements but still deliver crappy quality.
One reason could be low quality standards, such as when I found instant coffee sufficient to meet my requirement for morning coffee. Another reason, quite common with big data, is that bad data is sometimes as good as you can get at the time.
I say a bad cup of coffee is better than no coffee at all. Charles Babbage, the grandfather of computer science, said, “Errors using inadequate data are much less than those using no data at all.”