Which comes first, data quality or data analytics?

While it’s obvious that chickens hatch from eggs that were laid by other chickens, what’s less obvious is which came first – the chicken or the egg? This classic conundrum has long puzzled non-scientists and scientists alike. There are almost as many people on Team Chicken as there are on Team Egg, meaning there are almost as many people who believe the chicken came first as there are people who believe the egg came first.

It turns out, however, the yolk’s on Team Chicken, since the answer is ... the egg came first.

This YouTube video does a great job of explaining why, but here are the basics. Chickens, like all species, came to be chickens through the long, slow process of evolution (i.e., gradual changes in DNA over long periods of time). In the reproductive process of animals like chickens, male and female DNA combine to form a zygote, the first cell of a new offspring that divides to create all the cells of a complete animal, with every cell containing exactly the same DNA, all of which came from the zygote. Chickens evolved from chicken-like birds over time through gradual changes (aka mutations) in DNA that created a new zygote capable of producing the first chicken. Since the zygote is the only place where DNA mutations could produce a new animal, and the first chicken zygote was housed inside the egg laid by a chicken-like bird, the egg must have come before the chicken (i.e., the egg came first).

How does this fowl dilemma relate to the pecking order of data management disciplines? Well, to me at least, it seems reminiscent of another classic conundrum:

Which comes first, data quality or data analytics?

Historically, there have been a lot more people on Team Quality than Team Analytics, meaning there were many more people who believed data quality comes first than there were people who believed data analytics comes first. This makes sense. After all, analytics based on poor-quality data can lead to bad business decisions. For example, geographical profiling of customers based on inaccurate postal address data provides a false impression of where the most valuable customers live and can drive bad business decisions, such as where to focus marketing efforts. Data scientists often cite data preparation – which includes data quality assessment and improvement – as the most time-consuming task they face before they can work their statistical, algorithmic and mathematical magic.

“The egg of course,” was Richard Dawkins’s answer to the chicken or the egg question, explaining “the chicken is only an egg’s way of making another egg.” The data management equivalent might be to say “data quality, of course,” since analytics is only a way of making more data, which to be valuable has to be of high quality.

It would therefore seem that non-data-scientists and data scientists alike should all be on Team Quality, believing that data quality comes first. Big data, however, is the 800-pound chicken/egg in the room. (Big data is a chicken when viewed as a source producing enormous quantities of data, but big data is an egg when you consider its real value is determined by the quality of the zygote of insight contained within.)

While I am still on Team Quality in general, there are times when big data puts me on Team Analytics. Sometimes analytics must be used to evaluate big data to determine its applicability to specific business problems. Analytics, in this context, acts as an advanced filter – it identifies the most valuable big data before significant resources (time, money, people) are invested in data quality assessment and improvement. Therefore, during this increasingly common scenario, data analytics actually comes first.
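For readers who think in code, the "analytics as a filter" idea might look something like the sketch below. Everything in it is an assumption made for illustration: the `Source` class, the keyword-based `relevance_score`, the `completeness` check, and the 0.5 threshold are hypothetical stand-ins, not a method described in this post. The point is only the ordering: a cheap analytic screen runs first, and the data quality assessment runs only on the sources that pass it.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A hypothetical big data source: a name plus a list of raw records."""
    name: str
    records: list


def relevance_score(source, keywords):
    """Cheap analytic screen: share of records whose text mentions any keyword."""
    if not source.records:
        return 0.0
    hits = sum(
        any(k in str(r.get("text", "")).lower() for k in keywords)
        for r in source.records
    )
    return hits / len(source.records)


def completeness(source, required_fields):
    """Simple quality check: share of records with every required field filled."""
    if not source.records:
        return 0.0
    ok = sum(
        all(r.get(f) not in (None, "") for f in required_fields)
        for r in source.records
    )
    return ok / len(source.records)


def triage(sources, keywords, required_fields, min_relevance=0.5):
    """Run analytics first; assess quality only for sources that look relevant."""
    report = {}
    for s in sources:
        score = relevance_score(s, keywords)
        if score >= min_relevance:  # the filter: skip quality work on irrelevant data
            report[s.name] = {
                "relevance": score,
                "completeness": completeness(s, required_fields),
            }
    return report


# Made-up example data, assuming a business problem about refunds and deliveries.
sources = [
    Source("support_tickets", [
        {"text": "Refund request for order 17", "customer_id": "c1"},
        {"text": "Late delivery complaint", "customer_id": ""},
    ]),
    Source("sensor_logs", [
        {"text": "temp=21.4", "customer_id": None},
        {"text": "temp=21.9", "customer_id": None},
    ]),
]

report = triage(sources, keywords=("refund", "delivery"), required_fields=["customer_id"])
print(report)
```

In this toy run, only `support_tickets` passes the relevance screen, so it is the only source that gets a quality score. A real screen would be far more sophisticated, but the sequencing is the same.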

What say you?

Are you on Team Quality or Team Analytics? Share your perspective on the relationship and prioritization between data quality and data analytics, and the impacts big data has had on this, by posting a comment below.

3 Comments

Bhaskar lakshmikanth on April 18, 2016 12:05 pm

You are absolutely right. Data Quality comes first and is very critical for either using the data for some analysis or to use purely for analytics.

Bojraj on March 15, 2017 11:37 pm

Analytics comes first. How do we know data is missing the quality without analyzing it first, regardless of the context?

Atul pandey on October 30, 2019 4:50 am

In my opinion, data quality comes first, because nowadays data quality matters more. Every business focuses on data quality first and then turns to data analytics. I may be wrong, but it should happen in this order.
