Bring the noise, boost the signal


Many people, myself included, occasionally complain about how noisy big data has made our world. While it is true that big data does broadcast more signal, not just more noise, we are not always able to tell the difference.

Sometimes what sounds like meaningless background static is actually a big insight. Other times noisy big data just seems to make it more difficult for us to know good information when we hear it.

However, we might gain a greater appreciation for the noise buzzing within big data if we paused for some quiet contemplation about the critical role that noise plays in helping us hear what we want to hear.

In their book A Perfect Mess: The Hidden Benefits of Disorder, Eric Abrahamson and David Freedman discussed a few lessons from the history of audio data quality. One was why background noise was intentionally added to cell phone calls to improve consumer satisfaction. Silencing background noise gave cell phone users the uneasy feeling of a lost connection or dropped call whenever pauses occurred during a conversation. The added background noise is called comfort noise because our brains are unsettled by its absence.

“All of humanity’s problems,” Blaise Pascal wrote in the 17th century, “stem from man’s inability to sit quietly in a room alone.” The 21st century equivalent might be our inability to sit quietly in a room without a wireless connection to all that big data promises to offer. It seems we will always need some form of comfort noise.

Another audio data quality lesson recounted by Abrahamson and Freedman was that Albert Einstein’s most-cited paper was not about the theory of relativity, but about the theory of Brownian motion and the concept of stochastic resonance, whereby weak signals can be detected by adding white noise. White noise is a random signal that contains a wide spectrum of frequencies. A weak signal can be boosted by adding white noise because the frequencies in the white noise that correspond to the weak signal’s frequencies resonate with each other, amplifying the signal and making it easier to detect. Essentially, randomness is used to detect order.
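For readers who like to see the idea in action, here is a minimal sketch of one simple form of stochastic resonance. Note that it uses a threshold detector rather than the frequency-resonance picture described above: a weak sine wave never reaches the detector's threshold on its own, but once random noise is added, the detector fires mostly while the wave is in its positive half. The names (`count_crossings`, `noise_sigma`) and all the numbers are my own assumptions for illustration, not anything taken from the book.

```python
import math
import random


def count_crossings(noise_sigma, n_samples=20000, seed=0):
    """Count detector firings during the positive and negative
    halves of a weak sine wave, for a given noise level."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    threshold = 1.0            # detector fires only above this level (assumed)
    amplitude = 0.5            # weak signal: never reaches the threshold alone
    period = 100               # samples per cycle (assumed)
    positive_half = 0
    negative_half = 0
    for t in range(n_samples):
        signal = amplitude * math.sin(2 * math.pi * t / period)
        noise = rng.gauss(0.0, noise_sigma) if noise_sigma > 0 else 0.0
        if signal + noise > threshold:
            if signal > 0:
                positive_half += 1
            else:
                negative_half += 1
    return positive_half, negative_half


quiet = count_crossings(noise_sigma=0.0)  # no noise: the detector stays silent
noisy = count_crossings(noise_sigma=0.5)  # moderate noise: firings track the signal
print("without noise:", quiet)
print("with noise:   ", noisy)
```

Without noise the detector never fires, so the weak signal is invisible. With noise, the firings cluster in the positive half of the wave, so the rhythm of the hidden signal shows up in the detector's output.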

Big data is also a great white noise generator. We could consider our search queries weak signals amplified by the search results returned from the wide spectrum of frequencies emanating from the World Wide Web. If you want to add some true randomness, try — it adds a random word to your search to improve the results.

Another example is noise-canceling headphones. They use active noise control, which reduces unwanted noise by adding a second noise specifically designed to cancel the first. Noise-canceling headphones cancel low-frequency noise, such as the constant hum of airplane engines, while allowing you to hear higher-frequency signals, such as safety announcements from the flight crew.
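The cancellation idea can be sketched in a few lines. This is an idealized illustration, not how any real headphone works: it assumes the engine hum is known exactly, so the "second noise" is simply the hum inverted. The sample rate, the frequencies, and the helper name `tone` are all assumptions chosen for the example.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed)


def tone(freq_hz, amplitude, n_samples):
    """Generate a pure sine tone as a list of samples."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n_samples)
    ]


n = 800
hum = tone(100, 1.0, n)            # low-frequency engine hum (unwanted)
announcement = tone(1000, 0.3, n)  # higher-frequency signal we want to hear

# What reaches the ear without cancellation: hum plus announcement.
heard_without = [h + a for h, a in zip(hum, announcement)]

# The second noise: the hum inverted, so the two sum to zero.
anti_noise = [-h for h in hum]

# What reaches the ear with cancellation switched on.
heard_with = [m + c for m, c in zip(heard_without, anti_noise)]

# Largest difference between what is heard and the announcement alone.
residual = max(abs(x - a) for x, a in zip(heard_with, announcement))
print("largest leftover hum:", residual)
```

In this idealized case the leftover hum is essentially zero and only the announcement remains. Real headphones have to estimate the hum from a microphone in real time, which is one reason they handle steady low-frequency noise better than sudden or high-pitched sounds.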

While many of us might believe big data needs more active noise control (after all, that is essentially what an email spam filter is), I have previously blogged about how innovative environments are deliberately noisy. So maybe we should just let big data bring the noise. It might be just what we need to boost the signal.


About Author

Jim Harris

Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ)

Jim Harris is a recognized data quality thought leader with 25 years of enterprise data management industry experience. Jim is an independent consultant, speaker, and freelance writer. Jim is the Blogger-in-Chief at Obsessive-Compulsive Data Quality, an independent blog offering a vendor-neutral perspective on data quality and its related disciplines, including data governance, master data management, and business intelligence.
