Re-thinking the design choices of application data quality


If we look at how most data quality initiatives start, they tend to follow a fairly common pattern:

  • Data quality defects are observed by the business or technical community
  • Business case for improvement is established
  • Remedial improvements implemented
  • Long-term monitoring and prevention recommended
  • Move on to the next data landscape

OK, I know not all projects follow that path, but for most there is a definite sense of resolving the issue many years after the application was originally conceived.

To really make an impact on data quality I think the CDO and CIO office have to start mandating a requirement for data quality principles to be baked into sound application design.

The vast majority of data defects I used to see would stem from poor design choices. Here are some examples:

  • No validation of data feeding into or out of a system
  • Poor search facilities that would lead to users creating duplicate records
  • Lack of referential integrity checks both within a system and with external reference data sources
  • Creating multiple masters instead of defining centrally managed master sources of data
  • Complex forms and lack of user interface design making data entry errors a frequent occurrence
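To make the first three items a little more concrete, here is a minimal sketch in Python of what "baking in" those checks at the point of entry might look like. Everything in it is hypothetical and for illustration only: the `CustomerStore` class, the field names, the `VALID_COUNTRY_CODES` reference set and the very crude email test are assumptions, not a recommended implementation.

```python
from dataclasses import dataclass, field

# Hypothetical reference data. In a real system this would be looked up
# from a centrally managed master source, not hard-coded.
VALID_COUNTRY_CODES = {"GB", "US", "DE"}


@dataclass
class CustomerStore:
    # Records keyed by normalised email so duplicates are easy to spot.
    records: dict = field(default_factory=dict)

    @staticmethod
    def _normalise(email: str) -> str:
        # Trim whitespace and lower-case so trivial variants match.
        return email.strip().lower()

    def add(self, name: str, email: str, country: str) -> list:
        """Return a list of problems; store the record only if it is empty."""
        problems = []
        key = self._normalise(email)

        # Validation of data coming into the system.
        if not name.strip():
            problems.append("name is required")
        if "@" not in key:  # deliberately crude check, for illustration
            problems.append("email looks malformed")

        # Referential integrity against external reference data.
        if country not in VALID_COUNTRY_CODES:
            problems.append(f"unknown country code: {country}")

        # Duplicate detection before a new record is created.
        if key in self.records:
            problems.append("possible duplicate: email already exists")

        if not problems:
            self.records[key] = {
                "name": name.strip(),
                "email": key,
                "country": country,
            }
        return problems
```

Calling `add` a second time with the same email in different case or spacing is rejected as a possible duplicate instead of silently creating a second record, which is the sort of behaviour that is cheap to design in and expensive to retrofit.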

The list is endless and as you profile, assess and improve your own application data you’ll no doubt observe these design flaws (and many others).

But what do most data quality projects do with this information? They simply use it to steer ongoing improvements instead of getting to the real problem, which is often weak application design right from the outset.

I believe the solution is for the CDO or CIO office to step up and recognise that data quality management is not just a post-improvement activity but also a pre-design authority requirement.

Where do you stand on this? Is it something you've implemented in your organisation? What results did you have? I welcome your views below.


About Author

Dylan Jones

Founder, Data Quality Pro and Data Migration Pro

Dylan Jones is the founder of Data Quality Pro and Data Migration Pro, popular online communities that provide a range of practical resources and support to their respective professions. Dylan has an extensive information management background and is a prolific publisher of expert articles and tutorials on all manner of data related initiatives.

1 Comment

  1. Charles Harbour on

    Dylan,

    As always, very thought provoking. I think we're really on the cusp of turning the tables here - to help the management folks understand the cost of not doing these things.

    I wholeheartedly support the concept of baking the quality in, starting with the end in mind, thinking ahead to how your warehouse will grow in the future (and design flexible validation strategies into that design). But, out here in the cheap seats, such 'frivolities' are often cut to reduce costs and improve speed of delivery.

    But I think the tide is turning (at least a little bit) when you weigh up the costs: having to deal with dirty data for literally years after implementation, having to re-validate your data before any reports are sent up or out, and still driving strategy based on gut instincts because there aren't reliable numbers for fact-based decision making.

    The other thing that's helping out here in the field is that our warehouses are getting better - by now, some are in their 3rd or 4th generation, improving MDM and modeling techniques with each iteration, to become more agile (which helps take away that speed of delivery argument). But it's still a struggle - it seems to me that you have to justify every new strategy all over again (Why would we want to have to add code that auto-completes the address fields? We check it afterwards, yes?).

    I hope that by exploring discussion topics like this one, we help make a stronger argument for quality - so that the question becomes - How much time do you need to do this? instead of - Why would we want to spend the extra time and resources?

    Cheers!
    CH
