Junk drawers and data analytics


In the era of big data, we collect, prepare, manage, and analyze a lot of data that is supposed to provide us with a better picture of our customers, partners, products, and services. These vast data murals are impressive to behold, but a canvas painted this broadly can be an impressive work of art and still be a sketchy splotch of science, beset by statistical and systematic errors as well as sampling bias. And when taken out of context, the analysis of all this data may reveal only superficial correlations, not deep insights.

You might be able to learn something about me, for example, by analyzing the contents of the junk drawer in my kitchen. In it you would find not only a few dozen paper clips and rubber bands of various sizes, several dollars' worth of loose change, and coupons, most expired, for local restaurants and shops, but also a wide assortment of other items. Among them would be frequent flyer cards for just about every major airline and loyalty cards for just about every major hotel chain and car rental agency, player cards for half the casinos on the Las Vegas Strip, three wristwatches, battery chargers for half a dozen mobile phones, instruction manuals for numerous small electronics, boxes of business cards from my last five jobs, and maps of local hiking and biking trails.

Analyzing the contents of my junk drawer could lead you to draw a few conclusions about me. It would seem safe to conclude that I travel a lot, gamble frequently, always wear a wristwatch, regularly buy new mobile phones and gadgets, constantly change jobs, and enjoy spending lots of time in the great outdoors.

While some of these conclusions are true, others are misleading for a variety of reasons. For one thing, some of this information is outdated. I used to be a full-time business traveler, but I rarely travel nowadays. In fact, last year I didn’t travel at all. The last two times I went to Vegas, I didn’t gamble at all. I haven’t worn a watch in five years and haven’t bought a new mobile phone or gadget in two years. My business cards are misleading since one of the companies I worked for was acquired twice, so I have three sets of business cards for the same job that have different company names and job titles on them. And my knowledge of the extensive network of trails in close proximity to my house has not motivated me to go hiking or biking on a regular basis.

I am not saying big data is the equivalent of the enterprise’s junk drawer. Nor am I discounting the reality that sometimes noise is needed to strengthen signal. However, just as our junk drawers become a collection of things we keep just in case (otherwise we would have thrown them in the trash), sometimes we include more data in our analytics just in case it helps us discover more insights. Just beware that the results are occasionally more flotsam than findings—or at least the type of findings that find their way into a drawer never to be looked at again.


About Author

Jim Harris

Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ)

Jim Harris is a recognized data quality thought leader with 25 years of enterprise data management industry experience. Jim is an independent consultant, speaker, and freelance writer. Jim is the Blogger-in-Chief at Obsessive-Compulsive Data Quality, an independent blog offering a vendor-neutral perspective on data quality and its related disciplines, including data governance, master data management, and business intelligence.

1 Comment

  1. Ellen Williams on

    Clever comparison. Not all data is created equal! While "big data" is monopolizing the headlines it's great to be reminded that data, just for the sake of data, is not useful. It's about meaningful data, and much of that should be harvested from our customers, not on their behalf, or what we think we can conclude from a glimpse into their lives.
