Introducing a new data quality dimension: Irrelevance


Do you want to know what one of the single largest causes of bad data is?

Irrelevant data.

Irrelevant data has no value or place within your business, yet for many reasons it is still being maintained (badly).

It accounts for huge amounts of physical real estate within your data landscape and, if left ignored, it can skew your data quality assessments, confuse the heck out of knowledge workers and cause unnecessary bloat within your systems.

Where does irrelevant data come from?

There are numerous reasons for the creation of irrelevant data, but here are some of the most common:

  • COTS: Commercial off-the-shelf systems often cater to a broad industry. As a result, they include many screens and data structures that simply don’t apply to every organisation.

  • M&A inheritance: Legacy systems are integrated or migrated from newly acquired or merged companies.

  • Shifting business models and processes: Over time, your underlying business processes have changed, but these changes haven’t been reflected in the data.

  • Lack of archival strategy: Old data and data structures are not routinely archived or removed from live systems.

Do any of the above scenarios apply to your organisation? If they do, then welcome to the “Data Irrelevance Dimension.”

What are the benefits of removing irrelevant data?

By removing irrelevant data you get immediate gains such as increased query performance and reduced storage requirements. However, the main benefits come from “decluttering” some of the baggage that is slowing down your core user processes.

For example, imagine that you have inherited a COTS system that has field entries for international addresses, yet you only deal with domestic markets. Having to tab past redundant fields can slow up call centre staff or lead to information going in the wrong fields.

Perhaps you’ve inherited an asset management system as a result of a merger. The acquired organisation had a slightly different business model and stored site information for health and safety reasons that isn't applicable to your organisation. That data remains in the system, so when your own organisation's data is migrated into the acquired system, the irrelevant data persists alongside it.

Getting rid of irrelevant data simplifies user processes and improves operational performance. These are more than enough reasons to explore its removal, but where should you start?

How do you get rid of irrelevant data?

I typically approach this with the full support of the business. You need to get them bought into the process.

First, profile the data and look for trends and obvious events in the lifetime history of the data. You’ll often see fields that have not been updated for several years. Having business experts with you improves the chance of spotting occurrences of irrelevance.
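As a rough illustration of this kind of profiling, here is a minimal Python sketch. It assumes each record carries a record-level "last updated" date (the field name `updated`, the sample fields and the three-year threshold are all hypothetical). A field is flagged as a candidate for review when its most recent non-empty value is older than the threshold, or when it has never been populated at all. The output is a list for discussion with business experts, not a deletion list.

```python
from datetime import date, timedelta

def find_stale_fields(records, date_field, as_of, stale_years=3):
    """Flag fields whose most recent non-empty value is older than stale_years."""
    last_seen = {}  # field -> latest record date on which the field had a value
    filled = {}     # field -> number of records with a non-empty value
    for rec in records:
        for field, value in rec.items():
            if field == date_field:
                continue
            last_seen.setdefault(field, None)
            filled.setdefault(field, 0)
            if value not in (None, ""):
                filled[field] += 1
                when = rec[date_field]
                if last_seen[field] is None or when > last_seen[field]:
                    last_seen[field] = when

    # Approximate cutoff: 365 days per year is close enough for a first pass.
    cutoff = as_of - timedelta(days=365 * stale_years)
    report = {}
    for field, latest in last_seen.items():
        report[field] = {
            "fill_rate": filled[field] / len(records),
            "last_populated": latest,
            "candidate": latest is None or latest < cutoff,
        }
    return report

# Hypothetical sample: a domestic-only business with leftover fields.
records = [
    {"updated": date(2012, 5, 1), "postcode": "AB1 2CD", "country": "FR", "fax": ""},
    {"updated": date(2023, 3, 9), "postcode": "XY9 8ZW", "country": "", "fax": ""},
]
report = find_stale_fields(records, "updated", as_of=date(2024, 1, 1))
for field, stats in report.items():
    print(field, stats)
```

In this sample, `country` and `fax` are flagged while `postcode` is not. A low fill rate or an old "last populated" date is only a hint; the business still has to confirm whether the field is genuinely irrelevant.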

Defining what data you need is another critical activity. Getting the business community to agree on common terms and definitions can help in determining what data should be deleted.

Performing functional modelling exercises is another great way of performing a gap analysis of what data and functions you have compared to what you really need. This also helps validate your current business model and business processes.
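At its simplest, the gap analysis is a comparison of two lists: what the system holds and what the agreed business model needs. The Python sketch below assumes you have already captured both as plain lists of field names (the names used here are hypothetical).

```python
def gap_analysis(existing_fields, required_fields):
    """Compare what the system holds with what the business says it needs."""
    existing, required = set(existing_fields), set(required_fields)
    return {
        "unused": sorted(existing - required),   # candidates for irrelevance
        "missing": sorted(required - existing),  # needed but not held
        "in_use": sorted(existing & required),
    }

# Hypothetical inventories from the system schema and the business workshop.
gaps = gap_analysis(
    existing_fields=["name", "postcode", "country", "fax"],
    required_fields=["name", "postcode", "email"],
)
print(gaps)
```

The "unused" list is where to look for irrelevant data, while the "missing" list is a useful by-product for validating the current business model.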

Somewhat more challenging is removing screen design elements that map to redundant data structures. This can be problematic on COTS solutions where updates to screen and application design are often not supported. However, most modern systems allow you to customise a lot of the application infrastructure. (Hint: This ability to customise an app should also form part of your search for any new COTS solutions).

You will also need to perform an information chain analysis exercise to see which systems depend on the irrelevant data or data structures. Applications and ETL scripts can fail quite easily when the underlying schemas and information sources are changed in an uncontrolled manner. Create an amnesty for downstream users of redundant data structures but give them a deadline for when they need to come forward.
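One way to picture the information chain analysis is as a walk over a "feeds into" map. The Python sketch below assumes you have recorded, by hand or from lineage tooling, which systems read from which data structures. Every name in the map is hypothetical.

```python
from collections import deque

def downstream_consumers(feeds, start):
    """Return everything that directly or indirectly depends on `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for consumer in feeds.get(node, ()):
            if consumer not in seen and consumer != start:
                seen.add(consumer)
                queue.append(consumer)
    return seen

# Hypothetical dependency map: key feeds into each item in its list.
feeds = {
    "crm.customer.fax_number": ["etl.nightly_export"],
    "etl.nightly_export": ["warehouse.dim_customer"],
    "warehouse.dim_customer": ["report.marketing_contacts"],
}
affected = downstream_consumers(feeds, "crm.customer.fax_number")
print(sorted(affected))
```

Anything in the resulting set needs an owner to sign off before the structure is removed. The amnesty exists precisely because a map like this is rarely complete.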

How have you got rid of irrelevant data in the past? How did it impact your organisation?

I welcome your experiences in the comments below.


About Author

Dylan Jones

Founder, Data Quality Pro and Data Migration Pro

Dylan Jones is the founder of Data Quality Pro and Data Migration Pro, popular online communities that provide a range of practical resources and support to their respective professions. Dylan has an extensive information management background and is a prolific publisher of expert articles and tutorials on all manner of data related initiatives.
