I’ll admit I am particularly fond of a saying, “Begin at the beginning.” All too often we get ahead of ourselves when trying to tackle a problem. And without a clear understanding of the full scope of a problem, there’s always the risk of making it worse.
Something like this is happening in the area of business analytics. In the search for a single version of the truth, it is necessary to corral data from any number of data sources throughout an organization. Typically, the process also involves cleansing and prepping this data for relevant analysis – all with the overall goal of producing useful reports that provide business insight and aid fact-based decision making.
It’s never a problem getting people excited about the super-cool, fact-based decision making part. But often those same people are less concerned about how you get to that end result. Suffice it to say the “plumbing” of data integration and data quality (to borrow the fixture analogy of my esteemed colleague, Ken Hausman) is perceived as less sexy.
And yet, the data integration/data quality part is undeniably the beginning of a business analytics project.
In February 2009, Computerworld released a SAS-sponsored report entitled, Defining Business Analytics and Its Impact on Organizational Decision-Making. It is an insightful report that asks respondents to define the term “business analytics” and the technologies associated with it. Notably, 73 percent of respondents view business analytics as a function of both IT and business departments – the evolving relationship between these two groups is a favorite subject I’ve written about previously.
The results also indicate a consistent trend I’ve seen in a number of research reports from the past few years, including:
- The SAS report, Business Intelligence Maturity and the Quest for Better Performance found that “close to 80 percent of organizations have not fully implemented practices to ensure data quality, integrate their data across the business or create standard data definitions – fundamental elements of maximizing business information.”
- In another Computerworld project, Information Management Initiatives at Midsize and Large Organizations, a combined 55 percent of respondents listed either “integrating disparate systems, standardizing data management processes, data quality or data access” as the key barrier to their information management efforts.
- In the current Computerworld report, 59 percent of respondents named “data integration with multiple source systems” and 56 percent named “data quality” as the key technology or business challenge to business analytics implementations.
See a pattern?
This trend is fleshed out even more when I talk to folks about their various business analytics implementations. From the outset, they don’t necessarily have a strong sense of where the most important data lives within their organizations or how many versions of that data exist. They typically underestimate how much time and ingenuity it may take to access that data. And almost without fail, data quality issues threaten to consume the project. We all know the adage: garbage in, garbage out.
Data integration is the chronic oversight, yet it continues to surface as the proverbial thorn in the collective side. Data quality issues haunt the professionals responsible for building these systems and the executives who so desperately want to rely on the reports those systems generate.
So how does something so basic, so fundamental, get overlooked? Planning how data is extracted from multiple sources and how that data is deduped, standardized, profiled and so on – that would seem to be the beginning, wouldn’t it? Yet it is all too commonly skipped or rushed… and, sooner or later, revisited.
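To make those three steps concrete, here is a minimal sketch of deduping, standardizing and profiling a small extract using only the Python standard library. The field names, sample data and matching rule are hypothetical; real data quality tooling uses far fuzzier matching and richer standardization rules.

```python
import csv
import io
from collections import Counter

# Hypothetical customer extract; in practice this data would be pulled
# from multiple source systems across the organization.
raw = """name,state,email
Pat Lee , nc ,pat@example.com
Pat Lee,NC,pat@example.com
J. Kim,NY,
"""

def standardize(row):
    # Trim whitespace and normalize casing so duplicate records can match.
    return {
        "name": row["name"].strip(),
        "state": row["state"].strip().upper(),
        "email": row["email"].strip().lower(),
    }

rows = [standardize(r) for r in csv.DictReader(io.StringIO(raw))]

# Dedupe on a simple composite key (real matching logic is fuzzier).
seen, deduped = set(), []
for r in rows:
    key = (r["name"].lower(), r["email"])
    if key not in seen:
        seen.add(key)
        deduped.append(r)

# Profile: count missing values per column to surface quality gaps early.
missing = Counter(col for r in deduped for col in r if not r[col])
print(len(rows), len(deduped), dict(missing))  # 3 2 {'email': 1}
```

Even this toy version shows why the step is easy to underestimate: without the standardization pass, the two “Pat Lee” records would not match, and the missing email would only surface later, in the reports.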
Do you have a story about data integration/data quality gone awry? Thoughts on why some are so averse to investing the time up front to make sure the data is as solid as possible before analysis and reporting? I’d love to hear your opinions.