Errors, lies, and big data

My previous post pondered the term disestimation, coined by Charles Seife in his book Proofiness: How You’re Being Fooled by the Numbers to warn us about understating or ignoring the uncertainties surrounding a number, mistaking it for a fact instead of the error-prone estimate that it really is. Sometimes this fact appears to […]

The Chicken Man versus the Data Scientist

In my previous post Sisyphus didn’t need a fitness tracker, I recommended that you only collect, measure and analyze big data if it helps you make a better decision or change your actions. Unfortunately, it’s difficult to know ahead of time which data will meet that criterion. We often, therefore, collect, measure and analyze […]

Sisyphus didn’t need a fitness tracker

In his pithy style, Seth Godin’s recent blog post Analytics without action said more in 32 words than most posts say in 320 words or most white papers say in 3200 words. (For those counting along, my opening sentence alone used 32 words.) Godin’s blog post, in its entirety, stated: “Don’t measure […]

Bring the noise, boost the signal

Many people, myself included, occasionally complain about how noisy big data has made our world. While it is true that big data does broadcast more signal, not just more noise, we are not always able to tell the difference. Sometimes what sounds like meaningless background static is actually a big insight. Other times […]

Being data-driven means being question-driven

At the Journalism Interactive 2014 conference, Derek Willis spoke about interviewing data, his advice for becoming a data-driven journalist. “The bulk of the skills involved in interviewing people and interviewing data are actually pretty similar,” Willis explained. “We want to get to know it a little bit. We want to figure […]

Building an analytics culture from the ground up

With all the industry emphasis and collateral available on high performance analytics, business intelligence and visual analytics, it can be difficult to know exactly where to begin, especially if you don’t have a team of statisticians standing by. Thankfully, analytics covers a huge range of opportunities to empower your business, and […]

A double take on sampling

My previous post made the point that it’s not a matter of whether it is good for you to use samples, but how good the sample you are using is. The comments on that post raised two different, and valid, perspectives about sampling. These viewpoints reflected two different use cases for data, […]

Survey says sampling still sensible

In my previous post, I discussed sampling error (i.e., when a randomly chosen sample doesn’t reflect the underlying population, aka margin of error) and sampling bias (i.e., when the sample isn’t randomly chosen at all), both of which big data advocates often claim can, and should, be overcome by using all the data. In this […]

What we find in found data

In his recent Financial Times article, Tim Harford explained that the big data that interests many companies is what we might call found data – the digital exhaust from our web searches, our status updates on social networks, our credit card purchases and our mobile devices pinging the nearest cellular or WiFi network. […]

The dark side of the mood

As an unabashed lover of data, I am thrilled to be living and working in our increasingly data-constructed world. One new type of data analysis eliciting strong emotional reactions these days is the sentiment analysis of the directly digitized feedback from customers provided via their online reviews, emails, voicemails, text messages and social networking […]
