What's the risk in data that doesn't fit?

Can data management reduce your risk exposure? Many managers are quick to point out that they’re awash in data but need help making sense of it. While this may be true, it’s still worth asking: do you have the data you most need to make better decisions? Not all of those data come from operational systems. For example, decisions about which new product to fund, how much to produce, and in which markets to trial it could all require other data sources.

Data are measures. From vital signs when you go to the doctor to the number of times you use an ATM in a month, measurements of all sorts are taking place and being recorded all the time. Data collection, simply, is a process of measurement. When viewed in this way, it makes sense to ask if you are measuring efficiently the things that matter most in your organization to resolve the most important problems and pursue the best opportunities.

Consider the implications of making many sub-optimal decisions with poor data at your organization. Some data are of such poor quality that answering certain questions with any confidence is almost impossible. Yet managers often continue to direct analysts to spend time munging the data, even when the resulting analyses may support no better decision than a random guess.

Most analysts view such work as an exercise in futility and beg to be assigned to other things where they can add value. Their time could be spent on other problems that need their rare and valuable skills. Alternatively, their time could be spent on ways to improve the data collection process going forward—measuring the effectiveness of the process and optimizing it for the most important (identified) uses of the data.

The five widely accepted attributes of data quality (accuracy, integrity, consistency, validity, reliability) should be augmented with a sixth, more important uber-attribute, as indicated in the Wikipedia definitions for data quality and information quality: data need to be “fit for their intended use.” You will never anticipate every possible use (or misuse) of your data before collecting them. In fact, there is often residual value in using data for purposes beyond the original reason for collecting them (which was usually to send you a bill). New ways of leveraging data will evolve, and those scenarios, in turn, will call for more data as new questions arise that only more or different data can answer.

Viewing data collection as an ongoing process better positions organizations to maximize value from data as a strategic asset, resulting in superior input to all other decision-making processes.

From a risk management perspective, organizations need to understand that in some sense, bad data is like crime. You will never have 0 percent of it. To the extent that it pays to improve bad data to minimize the risk of poor decisions (while at the same time improving your chances of creating more value), it pays to have data quality metrics and trends in the first place. There is no single measure of data quality, but if you take the time to measure and monitor important attributes of data quality in your organization, you can gauge the potential for improvement. Is the percentage of missing values going up or down? Is the categorical value “other” accounting for an increasing percentage of a given field? And so on.
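To make those two questions concrete, here is a minimal Python sketch of how the percentage of missing values and the share of “other” in a field might be tracked over time. Everything in it is an assumption made for illustration: the function names, the `month` and `channel` field names, the choice to treat `None` and empty strings as missing, and the sample records, which are made up and not real data.

```python
from collections import defaultdict


def quality_metrics(records, field, missing=(None, ""), other_label="other"):
    """Percent of records where `field` is missing or equals the catch-all label.

    Assumption: a value counts as missing if it is None or an empty string.
    """
    total = len(records)
    if total == 0:
        return {"pct_missing": 0.0, "pct_other": 0.0}
    n_missing = sum(1 for r in records if r.get(field) in missing)
    n_other = sum(
        1
        for r in records
        if isinstance(r.get(field), str)
        and r[field].strip().lower() == other_label
    )
    return {
        "pct_missing": 100.0 * n_missing / total,
        "pct_other": 100.0 * n_other / total,
    }


def metrics_by_period(records, field, period_key="month"):
    """Group records by period and compute the metrics for each, in period order."""
    groups = defaultdict(list)
    for r in records:
        groups[r[period_key]].append(r)
    return {p: quality_metrics(rows, field) for p, rows in sorted(groups.items())}


# Hypothetical sample records, invented purely to show the calculation.
records = [
    {"month": "2024-01", "channel": "web"},
    {"month": "2024-01", "channel": "other"},
    {"month": "2024-01", "channel": None},
    {"month": "2024-01", "channel": "store"},
    {"month": "2024-02", "channel": "other"},
    {"month": "2024-02", "channel": "Other"},
    {"month": "2024-02", "channel": ""},
    {"month": "2024-02", "channel": None},
]

trend = metrics_by_period(records, "channel")
for month, m in trend.items():
    print(month, m)
```

In this made-up sample, both percentages rise from one month to the next, which is the kind of trend worth investigating at the point of collection. Which fields to monitor and what level of change should trigger action are judgment calls that depend on how the data are meant to be used.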

One solution: Information management or data integration competency centers (when aligned and supported properly) can go a long way toward improving data quality and reducing risk exposure. Analytic Centers of Excellence can complement these efforts and help close the loop on data quality more strategically.

If it's not managed correctly, more data can equal more risk. I hope you'll consider some of the tips here to reduce that risk.

tags: analytic center of excellence, Anne Milley, data management, risk
