Is your data quality strategy continually evolving?

One of the significant problems data quality leaders face is changing people's perception of data quality. For example, one common misconception is that data quality represents just another data processing activity.

If you have a data warehouse, you will almost certainly have some form of data processing in the form of 'data cleanup' activity. This cleanup work may involve transforming rogue values and preparing the data for upload into the live warehouse and reporting environment.

In a CRM migration, your data processing may include a stage to deduplicate customer records prior to go-live.

It's easy to see how people view these activities as data quality because they are improving the quality of the final information product.

Dynamic data and coalescing quality

In my last post, I pointed out that we data quality practitioners want to apply data quality assertions to data instances to validate data in process, but the dynamic nature of data must be contrasted with our assumptions about how quality measures are applied to static records. In practice, the data used in conjunction with a business process may not be “fully-formed” until the business process fully completes. This means that records may exist within the system that would be designated as invalid after the fact, but from a practical standpoint remain valid at different points in time until the process completes.
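As a rough illustration of that idea (a sketch that is not taken from the original post), the Python snippet below attaches each data quality assertion to the process stage from which it is expected to hold. The order workflow, the stage names, the `Order` fields and the `violations` helper are all hypothetical, chosen only to show how the same record can be valid mid-process and invalid if judged as a finished record.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stages of an illustrative order process, in order.
STAGES = ["created", "paid", "shipped"]


@dataclass
class Order:
    stage: str
    customer_id: Optional[str] = None
    payment_ref: Optional[str] = None
    tracking_no: Optional[str] = None


# Each assertion: (name, stage from which it must hold, check function).
ASSERTIONS = [
    ("customer_id present", "created", lambda o: o.customer_id is not None),
    ("payment_ref present", "paid", lambda o: o.payment_ref is not None),
    ("tracking_no present", "shipped", lambda o: o.tracking_no is not None),
]


def violations(order: Order) -> List[str]:
    """Return the names of assertions that apply at the order's stage and fail."""
    reached = STAGES.index(order.stage)
    return [
        name
        for name, since, check in ASSERTIONS
        if STAGES.index(since) <= reached and not check(order)
    ]


# A record that is still in process: only the "created" assertion applies.
in_flight = Order(stage="created", customer_id="C-1")

# The same field values judged as a completed record.
completed = Order(stage="shipped", customer_id="C-1")

print(violations(in_flight))
print(violations(completed))
```

Here the in-flight record passes because the payment and tracking assertions do not apply yet, while the same values fail two assertions once the record claims to be complete.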

Big data quality

Utilizing big data analytics is currently one of the most promising strategies for businesses to gain competitive advantage and ensure future growth. But as we saw with “small data analytics,” the success of “big data analytics” relies heavily on the quality of its source data. In fact, when combining “small” and “big” data for analysis, neither should lack quality. That raises this question: how can companies assure the quality of big data?

Five data quality archetypes, part 2

This is the second post in a two-part series. In the first post, I covered three types of employees with data quality issues: the Ignorant, the Aloof, and the Skeptical. Now it's time to address the other two.

Forming data quality habits in 2015

It’s common at the start of a new year to create a long list of resolutions that we hope to achieve. The reality, of course, is by February those resolutions will likely be a distant memory.

The key to making any resolution stick is to start small. Create one small habit at a time and only move on to the next when it has become a routine in your life.

The desire to bed down multiple "data quality habits" at once is one of the biggest mistakes I witness in organisations. Too many people try to impose sweeping data quality reforms because "it worked in their last organisation," and the result is often a long, drawn-out process of resistance and underachievement.

Static Models and Dynamic Data

After working in the data quality industry for a number of years, I have realized that most practitioners tend to have a rather rigid perception of the assertions about the quality of data. Either a data set conforms to the set of data quality criteria and is deemed to be acceptable – or the data set fails to observe the levels of acceptability and is deemed to be flawed.

I suspect that our attempt to designate a data set to be of “acceptable quality” (in relation to a discrete assessment) is an artifact of data warehousing, in which a data set is extracted, transformed and loaded as a single, static unit. Quality characteristics are measured en masse to provide an overall score for a static collection of records that are representative of the underlying data model.

Big data preparation, big data quality and big data governance, oh my!

This isn't Kansas anymore. Oz has become a sprawling, smart metropolis filled with sensor data. How do we make sense of, clean, govern and glean value from this big data so we can get Dorothy home? The answer is SAS Data Management. With the latest portfolio updates, customers will be able to navigate past the flying monkeys, make the right turns on the yellow brick road and collaborate with the technology-focused tin man of IT like never before.

The enhancements center on three main areas: big data, metadata and data quality. Basically, it’s now much easier for users to access, integrate, cleanse and govern big data and metadata across the enterprise.

Crowdsourcing data improvement: Part 1

James Surowiecki wrote a book about The Wisdom of Crowds. Jeff Howe, who co-coined the term crowdsourcing, wrote a book about Why the Power of the Crowd Is Driving the Future of Business. In this blog series, I explore if it’s wise to crowdsource data improvement, and if the power of the crowd can enable organizations to incorporate better enterprise data quality practices.

Let’s start with a definition. Crowdsourcing is obtaining services, ideas, or content by soliciting contributions from a large group of people, most often via the Internet, rather than from traditional employees or suppliers. Contributors to crowdsourcing projects may be unpaid volunteers (e.g., contributing to Wikipedia) or paid freelancers (often via websites such as Mechanical Turk or oDesk), and may have relevant experience or vetted expertise, but more often they have little experience and limited qualifications (one aspect that makes crowdsourcing cost-effective).

Five data quality archetypes: Part 1

This is the first in a two-part series.

In my enterprise consulting days, I frequently encountered folks who spanned the data quality spectrum. In this post, I'll describe the five main types of individuals with respect to DQ. See if you recognize any of these types.

A few New Year’s data resolutions

Since now is the time when we reflect on the past year and make resolutions for next year, in this post I reflect on my Data Roundtable posts from the past year and use them to offer a few New Year’s data resolutions for you and your organization to consider in 2015.
