Tackling the architecture dimension


When tackling data quality, a lot of companies focus on the problems they can fix in the short term. If they can clean data, they will. If they can fix a broken process, they will. If they can improve the data validation logic, they will.

Quite often, people won’t stop to think about the broader issues surrounding these problems:

  • Why did the data become invalid in the first place?
  • Why was there a broken process?
  • Why didn’t the validation logic work as expected?

I find that most companies concentrate on the data quality dimensions that relate to values, but often the bigger problem is one of architectural failure.

Hand on heart, I truly believe data architecture is one area where we have failed to progress as an industry.

In the eighties I was given a thorough grounding in data modelling at university, and in my first role every data analyst in the organisation was adept at data modelling to a high standard. Whenever we received data from an external organisation, we created physical and logical data models and compared them to our enterprise model for gaps.
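As a toy illustration of that gap analysis, here is a minimal sketch in Python. Everything in it is a hypothetical simplification: the entity and attribute names are invented, and representing models as dictionaries of attribute sets is a stand-in for what a real modelling tool would do.

    # Hypothetical, heavily simplified models for illustration only.
    enterprise_model = {
        "customer": {"customer_id", "name", "email", "postcode"},
        "order": {"order_id", "customer_id", "order_date", "total"},
    }
    external_model = {
        "customer": {"customer_id", "name", "email"},  # no postcode supplied
        "invoice": {"invoice_id", "customer_id", "amount"},
    }

    # Report entities and attributes the external data fails to cover.
    for entity, attrs in enterprise_model.items():
        if entity not in external_model:
            print(f"entity missing from external data: {entity}")
        else:
            for gap in attrs - external_model[entity]:
                print(f"{entity}: attribute missing from external data: {gap}")

    # Report entities the enterprise model doesn't know about.
    for extra in set(external_model) - set(enterprise_model):
        print(f"entity not in enterprise model: {extra}")

Crude as it is, even this level of comparison surfaces the questions that matter: where will the missing postcodes come from, and what is an "invoice" in our world?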

What I discovered over the years, as I moved around various organisations, is that this same kind of rigour was far less visible. Less emphasis was placed on core architectural techniques: not only maintaining accurate data models, but using modelling to design the right data structures for the task in hand. I often meet data analysts today with no modelling skills whatsoever, and that is deeply worrying given how important those skills are.

Documentation is another grey area for many organisations. I was initially trained by one of the software engineers who designed flight control systems for commercial airliners. I still remember the "hairdryer treatment" I once got from him when he discovered that my software design documentation had a number of major gaps in it. He came from a background where poor documentation could lead to fatalities and it was a lesson I never forgot.

Far too many organisations pay lip service to good documentation, and there are always major gaps that only come to light when you really need to understand the functionality - perhaps during a data migration or merger and acquisition activity. I think the trend to outsource system development or buy off-the-shelf solutions only exacerbates the problem.

So, with modelling and documentation, we’ve barely scratched the surface of modern data architecture. But it’s clear that many organisations are failing in even these basic areas, and as a result are creating data quality defects.

Let’s imagine one of your web developers creates an excellent data validation routine to prevent poor-quality customer details being captured. Is that routine documented and shared across the organisation?

When you buy new Salesforce accounts and add cloud-based solutions to your enterprise architecture, are you reflecting these new systems in your enterprise and logical models?

When data quality defects are observed to stem from architectural failings, can you quickly discover copycat issues elsewhere in the organisation by looking at documented architecture patterns?
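To make the first of those questions concrete, here is a minimal sketch in Python of the kind of customer-details validation routine a web developer might write. The field names and rules are illustrative assumptions, not taken from any real system; the point is that unless rules like these are documented and shared, the same checks will never reach your other capture channels.

    import re

    # Hypothetical rules for illustration only - real validation would be
    # driven by your documented data standards, not hard-coded guesses.
    EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

    def validate_customer(record):
        """Return a list of data quality problems found in a customer record."""
        problems = []
        if not record.get("name", "").strip():
            problems.append("name is missing")
        if not EMAIL_PATTERN.match(record.get("email", "")):
            problems.append("email looks invalid")
        if not record.get("postcode", "").strip():
            problems.append("postcode is missing")
        return problems

    # A clean record passes; a poor-quality one is rejected with reasons.
    print(validate_customer({"name": "A. Smith", "email": "a.smith@example.com", "postcode": "AB1 2CD"}))  # []
    print(validate_customer({"name": "", "email": "not-an-email", "postcode": ""}))  # three problems

If logic like this lives only in one web form, every other entry point into your customer data is a back door for the very defects it was written to prevent.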

In summary, now more than ever, organisations need to invest in these traditional data management disciplines. Otherwise, you’ll never get off the data quality Ferris wheel. If you’re seeing the same types of issues repeated across the organisation, take a look at the broader picture or you’ll be firefighting for a long time to come.

What do you think? Are core data management disciplines like data architecture slipping behind or am I being too critical?


About Author

Dylan Jones

Founder, Data Quality Pro and Data Migration Pro

Dylan Jones is the founder of Data Quality Pro and Data Migration Pro, popular online communities that provide a range of practical resources and support to their respective professions. Dylan has an extensive information management background and is a prolific publisher of expert articles and tutorials on all manner of data-related initiatives.

2 Comments

  1. Karen

    Dylan: I believe that you are right on the mark with your blog. Far too many times I have found the root cause of a data quality issue to reside in the underlying architecture. Many times this is due to a lack of understanding regarding the data: its definition, intended usage and context. As I've worked to get data quality integrated into the core data management practices of some organizations, I've had some data architects tell me that data quality is something that happens downstream; they only have to design the structures to store the data, not ensure its quality. Great post, as always!

  2. Dylan Jones

    Hi Karen

    Great comment, thank you for sharing. Your experience clearly shows how much work we have to do to transform the belief system around data quality.

    I personally believe that poor architecture is one of the single biggest causes of bad data. It seems you've witnessed similar issues too, so it's rewarding to know the post resonates with other practitioners.

    Thanks again, Dylan
