Big data lessons from Google Flu Trends


The Google Flu Trends application has received negative press since 2013 over its inability to accurately detect flu outbreaks. The latest critique, “The Parable of Google Flu: Traps in Big Data Analysis,” from Science magazine compares Google Flu Trends data to CDC data and dissects where the Google analysis went wrong.

As you might remember, Google Flu Trends was designed to pinpoint flu outbreaks by analyzing search data for flu-related keywords. The problem? At least 80 percent of people who conduct flu-related searches don’t actually have the flu.

Why does this story fascinate us? Partly because we can relate to it: Most of us have searched Google for medical information, and many of us, at one point or another, have thought we had the flu when we did not. But also because we like complex problems that are hard to solve.

The real lessons, though, are in the analysis. And this story reminds us of some important truths:

  1. Crowdsourced data is dirty data. It needs to be cleaned and managed before it is used for any type of official analysis.
  2. Social data is just one data point. Whether you’re working with Twitter, Facebook or Google data, it’s going to be more powerful when combined with other data sources, such as CDC data, than as a standalone source.
  3. Keep monitoring and evaluating. You can’t just build a model and walk away. You have to monitor results and rebuild your model, often many times, before it gives an accurate representation of reality.
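
To make the second and third lessons concrete, here is a minimal sketch in Python. It assumes you have weekly flu-rate estimates from a search-based model and a trusted reference series (CDC data, for instance) for the same weeks. The function names, the 25 percent threshold, the equal weighting and the numbers are all made up for illustration; none of this is Google's method or the approach from the Science article.

```python
from statistics import mean


def mean_abs_pct_error(estimates, reference):
    """Average of |estimate - reference| / reference over paired weeks."""
    return mean(abs(e - r) / r for e, r in zip(estimates, reference))


def needs_remodel(estimates, reference, window=4, threshold=0.25):
    """Return True if the error over the last `window` weeks exceeds `threshold`."""
    recent_error = mean_abs_pct_error(estimates[-window:], reference[-window:])
    return recent_error > threshold


def blend(search_estimate, last_reference_value, weight=0.5):
    """Combine the search-based estimate with the latest reference value.

    `weight` is the share given to the search-based estimate (an assumption;
    in practice you would tune it against historical data).
    """
    return weight * search_estimate + (1 - weight) * last_reference_value


# Hypothetical weekly values (percent of doctor visits for flu-like illness).
search_based = [2.0, 2.4, 3.1, 4.5, 6.0, 7.8]
reference = [1.9, 2.2, 2.6, 3.0, 3.4, 3.9]

# Lesson 3: keep checking the model against reality.
drifting = needs_remodel(search_based, reference)

# Lesson 2: lean on a second data source instead of search data alone.
# The reference series usually arrives late, so use last week's value.
blended = blend(search_based[-1], reference[-2])

print("Model drifting:", drifting)
print("Blended estimate:", round(blended, 2))
```

In this made-up example the search-based series runs well ahead of the reference series in the last four weeks, so the check flags the model for rebuilding. A real monitoring job would run every week and use a threshold chosen from historical error, not a guess.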

Be sure to read the Science magazine article for additional (and more scientific) lessons.


About Author

Alison Bolen

Editor of Blogs and Social Content

Alison Bolen is an editor at SAS, where she writes and edits content about analytics and emerging topics. Since starting at SAS in 1999, Alison has edited print publications, Web sites, e-newsletters, customer success stories and blogs. She has a bachelor’s degree in magazine journalism from Ohio University and a master’s degree in technical writing from North Carolina State University.
