For decades, data quality experts have been telling us that poor quality is bad for our data, bad for our decisions, bad for our business, and just plain all-around bad, bad, bad. Did I already mention it’s bad?
So why does poor data quality continue to exist and persist? Have the experts been all talk, with no plan for taking action? Have the technology vendors failed to evolve their data quality tools to become more powerful, easier to use, and better aligned with the business processes that create data and the technical architectures that manage it? Have the business schools been unleashing morons into the workforce who can’t design a business process correctly? Have employees been intentionally corrupting data in an attempt to undermine their employers’ success? After all, wouldn’t a perfectly rational organization never suffer from poor data quality?
One of my favorite nonfiction books is Predictably Irrational by Dan Ariely, which provides a good introduction to behavioral economics, a relatively new field combining aspects of both psychology and economics. The basic assumption underlying standard economics is that we will always make rational decisions in our best interest, often justified by a simple cost-benefit analysis. Behavioral economics more realistically acknowledges that we are not always rational – and, most important, our irrationality is neither random nor senseless, but quite predictable when the complex psychology of human behavior is considered.
The basic assumption underlying most theories of data quality is that, because the business benefits of high-quality data are obvious when compared to the detrimental effects of poor quality, any people, processes, or technology that allow poor data quality must either be acting irrationally or otherwise be somehow defective.
Therefore, preventative measures, once put into place, will correct the problem and alleviate any need for future corrective action, such as data cleansing. Everything, and everyone, will then be rational and wonderful in a world of perfect data quality.
However, people are far from perfect, and they are often one of the root causes of data quality problems, such as when people assume data quality is someone else’s responsibility. David Loshin has recently been blogging about behavior engineering and behavior modification. I like using the term behavioral data quality to describe the necessary inclusion of aspects of psychology within the data quality profession.
Ariely’s book explains the dangers of not testing our intuitions, of thinking we can always predict our behavior, and of assuming our behavior will always be rational. Better understanding these flawed perspectives can help us identify the true root causes of our predictably poor data quality. Most important, it can help us develop far more effective tactics and strategies for implementing successful and sustainable data quality improvements.