One of the main reasons organizations like Netflix and Amazon can glean fascinating insights into consumer behavior is that their data is vast and eerily accurate. That is, employees don't have to spend a great deal of time scrubbing data for location, past purchases, preferences and the like. Put differently, these companies need not speculate on who their customers are, merely on what they want.
Big difference.
Lamentably, that's not the case for the vast majority of organizations today. Cleansing and verifying data for even basic reporting – never mind sophisticated analytics – often hamstrings their efforts. Analytics "projects" with ambitious charters and big budgets wind up barely offering anything remotely resembling value. Another one bites the dust.
With respect to data preparation, organizations are loath to truly empower nontechnical employees for many reasons, some of which are legitimate. (Can anyone say SOX?) Along these lines, David Loshin writes about conventional approaches to data preparation. In his words:
"The data warehouse approach balances two key goals: organized data inclusion (a large amount of data is integrated into a single data platform), and objective presentation (data is managed in an abstract data model specifically suited for querying and reporting)."
Let me paraphrase. By systematizing data preparation, organizations arguably do two things. First, they minimize the risk of a rogue employee going off script. Second, and on a related note, they curtail the risk of uninformed – if well-intentioned – employees making mistakes. In my consulting career, I've seen a few employees run massive purge programs in production only to find out that those programs do indeed work.
Ending the blame game
To be fair, these risks aren't insignificant – but let's not overlook the costs of cementing IT's central involvement in all data preparation efforts. For one, it reinforces the toxic IT-business divide. Beyond that, it frustrates employees and costs the organization valuable time that it needs to capitalize on time-sensitive promotions, opportunities and specials.
Employees with unfettered and direct access to sensitive production data? Insane, right? Perhaps, but make no mistake: This type of access promotes individual employee responsibility. That is, no longer can employees use IT as the organization's whipping boy when data issues manifest themselves. And, all else being equal, better data means better and more meaningful analytics.
Simon says: Embrace self-service.
Don't treat data self-service as a binary: all or nothing. Rather, why not experiment with limited self-service over the course of several months? Why not try an agile approach? Maybe the concerns mentioned here are well founded, not overblown. Maybe an organization's culture mitigates these risks.
Feedback
What say you?