The celebrity of data: Taking data to the mainstream

[ce·leb·ri·ty], noun: the state of being well known. Media exposure, good or bad, is the surest way to gain celebrity. Just ask any child actor gone bad in Hollywood. They know. Lately data has been getting more than its fifteen minutes of fame. And good or bad, I think it’s awesome. We’re […]

Errors, lies, and big data

My previous post pondered the term disestimation, coined by Charles Seife in his book Proofiness: How You’re Being Fooled by the Numbers to warn us about understating or ignoring the uncertainties surrounding a number, mistaking it for a fact instead of the error-prone estimate that it really is. Sometimes this fact appears to […]

In defense of the indefensible

.@philsimon on those who minimize the importance of data.

Facebook and the myth of big data perfection

@philsimon says that perfection is elusive.

The Chicken Man versus the Data Scientist

In my previous post Sisyphus didn’t need a fitness tracker, I recommended that you only collect, measure and analyze big data if it helps you make a better decision or change your actions. Unfortunately, it’s difficult to know ahead of time which data will meet that criterion. We often, therefore, collect, measure and analyze […]

Sisyphus didn’t need a fitness tracker

In his pithy style, Seth Godin’s recent blog post Analytics without action said more in 32 words than most posts say in 320 words or most white papers say in 3200 words. (For those counting along, my opening sentence alone used 32 words.) Godin’s blog post, in its entirety, stated: “Don’t measure […]

Bad data management in a two-letter word

Big data? What about the small stuff? In preparing for an upcoming business trip, I decided to rent a car on Enterprise.com. I could have sworn that I had registered on the site at some point, but I couldn't find my user name and password. Call it a senior moment. […]

SAS high-performance capabilities with Hadoop YARN

For Hadoop to be successful as part of the modern data architecture, it needs to integrate with existing tools. This integration allows you to reuse existing resources (licenses and personnel) and is typically 60% of the evaluation criteria for integration of Hadoop into the data center. One of the most […]

Share your cluster – How Apache Hadoop YARN helps SAS

Even though it sounds like something you hear on a Montessori school playground, the theme “Share your cluster” echoes across many modern Apache Hadoop deployments. Data architects are plotting to assemble all their big data in one system – something that is now achievable thanks to the economics of modern […]

Data science versus narrative psychology

My previous post explained how confirmation bias can prevent you from behaving like the natural data scientist you like to imagine you are by driving your decision making toward data that confirms your existing beliefs. This post tells the story of another cognitive bias that works against data science. Consider the following scenario: Company-wide […]
