With so much information available about high-performance analytics, business intelligence and visual analytics, it can be difficult to know exactly where to begin, especially if you don’t have a team of statisticians standing by. Customers who hope to take advantage of analytics frequently ask me how to get started. How do
Tag: data science
As a data scientist, I have the rare privilege of possessing the job title that Tom Davenport and others have dubbed the sexiest job of the 21st century. As this popular job title catches on, I’ve even noticed a trend where customers make direct requests for help specifically from “the data
“Correlation does not imply causation” is a saying commonly heard in science and statistics emphasizing that a correlation between two variables does not necessarily imply that one variable causes the other. One example of this is the relationship between rain and umbrellas. People buy more umbrellas when it rains. This
My previous post pondered the term disestimation, coined by Charles Seife in his book Proofiness: How You’re Being Fooled by the Numbers to warn us about understating or ignoring the uncertainties surrounding a number, mistaking it for a fact instead of the error-prone estimate that it really is. Sometimes this fact appears to
My previous post explained how confirmation bias can prevent you from behaving like the natural data scientist you like to imagine you are by driving your decision making toward data that confirms your existing beliefs. This post tells the story of another cognitive bias that works against data science. Consider the following scenario: Company-wide
Data science, as Deepinder Dhingra recently blogged, “is essentially an intersection of math and technology skills.” Individuals with these skills have been labeled data scientists and organizations are competing to hire them. “But what organizations need,” Dhingra explained, “are individuals who, in addition to math and technology, can bring in
At the Journalism Interactive 2014 conference, Derek Willis spoke about interviewing data, his advice for becoming a data-driven journalist. “The bulk of the skills involved in interviewing people and interviewing data are actually pretty similar,” Willis explained. “We want to get to know it a little bit. We want to figure
My previous post made the point that it’s not a matter of whether it is good for you to use samples, but how good the sample you are using is. The comments on that post raised two different, and valid, perspectives about sampling. These viewpoints reflected two different use cases for data,
In my previous post, I discussed sampling error (i.e., when a randomly chosen sample doesn’t reflect the underlying population, aka margin of error) and sampling bias (i.e., when the sample isn’t randomly chosen at all), both of which big data advocates often claim can, and should, be overcome by using all the data. In this
In his recent Financial Times article, Tim Harford explained that the big data that interests many companies is what we might call found data – the digital exhaust from our web searches, our status updates on social networks, our credit card purchases and our mobile devices pinging the nearest cellular or WiFi network.
As an unabashed lover of data, I am thrilled to be living and working in our increasingly data-constructed world. One new type of data analysis eliciting strong emotional reactions these days is the sentiment analysis of the directly digitized feedback from customers provided via their online reviews, emails, voicemails, text messages and social networking
We sometimes describe the potential of big data analytics as letting the data tell its story, casting the data scientist as storyteller. While the journalist has long been a newscaster, in recent years the term data-driven journalism has been adopted to describe the process of using big data analytics to
While big data is rife with potential, as Larry Greenemeier explained in his recent Scientific American blog post Why Big Data Isn’t Necessarily Better Data, context is often lacking when data is pulled from disparate sources, leading to questionable conclusions. His blog post examined the difficulties that Google Flu Trends
Teller, the normally silent half of the magician duo Penn & Teller, revealed some of magic’s secrets in a Smithsonian Magazine article about how magicians manipulate the human mind. Given the big data-fueled potential of data science to manipulate our decision-making, we should listen to what Teller has to tell
Were you a mother who listened to classical music during your pregnancy, or a parent who played classical music in your newborn baby’s nursery because you heard it stimulates creativity and improves intelligence? If so, do you know where this “classical music makes you smarter” idea came from? In 1993, a
In the era of big data, Kenneth Cukier and Viktor Mayer-Schonberger noted in their book Big Data: A Revolution That Will Transform How We Live, Work, and Think, “we are in the midst of a great infrastructure project that in some ways rivals those of the past, from the Roman aqueducts