I tell many of my analytics students that they ought to consider careers in data science. After all, supply and demand are wildly out of kilter. What's more, that imbalance shows no signs of abating.
It would be silly to contend, though, that data scientists are magicians. Even those who can magically obtain data, analyze it and make recommendations cannot overcome significant organizational impediments. Put differently, many organizations fail to use their data scientists properly or, at the very least, fail to set them up for success.
In this post, I'll list a few of the largest mistakes organizations make vis-à-vis data scientists. And I'll look at how a data scientist should or could use data management for analytics – challenges/pitfalls to avoid, etc.
Asking poor questions
Consider the following innocuous query: Should a company launch a new product or service?
How would a data scientist answer that question? It wouldn't be easy because it's so vague.
Organizations would be best served by asking better questions. I'm talking here about focused queries – ones that data scientists can attempt to answer. They might include:
- What are the odds that a company can reach X% market share in six months? (Yes, I'm talking here about a distribution.)
- What are the sales of a product comparable to the one that the company is considering? Here the data scientist can use cluster analysis.
These more specific questions allow data scientists to really get to work. It's best to avoid big, unanswerable questions. Break unwieldy questions up into sub-questions. For more on this, see A More Beautiful Question: The Power of Inquiry to Spark Breakthrough Ideas.
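To make the first of those focused questions concrete, here is a minimal Python sketch of how a data scientist might turn "what are the odds?" into a distribution. Everything in it is an assumption for illustration: the function name `prob_reaching_share`, the idea that monthly share growth is normally distributed, and all of the numbers. A real analysis would fit the growth model to comparable product launches.

```python
import random


def prob_reaching_share(target_share, start_share, monthly_growth_mean,
                        monthly_growth_sd, months=6, n_sims=10_000, seed=42):
    """Estimate P(share >= target_share after `months`) by Monte Carlo simulation.

    Shares and growth are expressed in percentage points. The normal growth
    model is a hypothetical simplification, not a recommendation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_sims):
        share = start_share
        for _ in range(months):
            # Draw one month's change in share from the assumed distribution.
            share += rng.gauss(monthly_growth_mean, monthly_growth_sd)
        # Market share cannot fall below 0% or exceed 100%.
        share = max(0.0, min(100.0, share))
        if share >= target_share:
            hits += 1
    return hits / n_sims


# Made-up inputs: start at 0%, grow about 0.8 points per month, aim for 5%.
p = prob_reaching_share(target_share=5.0, start_share=0.0,
                        monthly_growth_mean=0.8, monthly_growth_sd=0.5)
print(f"Estimated probability of reaching 5% share in six months: {p:.2f}")
```

The point is not the particular model. A question phrased this way yields a probability that decision makers can act on, whereas "should we launch?" yields only an argument.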
Know when to stop asking why
Why do Walmart customers consistently buy strawberry Pop-Tarts prior to hurricanes?
No one seems to know and, ultimately, does it really matter? Sure, data scientists can spend weeks or months spinning their wheels, but aren't there bigger fish to fry? J.P. Eggers of NYU made this point at a recent conference and I violently agree.
Manage enterprise data better
Data scientists often come to the table armed with powerful data analysis tools such as pandas. Why not save them the time of wrangling data from disparate sources and creating master records? Those activities are quintessential hygiene factors: doing them doesn't really solve the problem, but ignoring them prevents anyone from solving it.
Every minute that a data scientist spends pursuing, cleansing and combining data is a minute less spent on the good stuff: analysis and modeling.
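For a sense of what that wrangling involves, here is a small pandas sketch that combines two sources into a single master record per customer. The two tables, their column names and the `normalize_email` helper are hypothetical. Real master data work also has to deal with conflicting values, fuzzy matches and missing keys.

```python
import pandas as pd

# Hypothetical extracts from two systems that describe the same customers.
crm = pd.DataFrame({
    "email": ["Ann@Example.com ", "bob@example.com"],
    "name": ["Ann Lee", "Bob Ray"],
})
billing = pd.DataFrame({
    "email": ["ann@example.com", "cy@example.com"],
    "plan": ["pro", "basic"],
})


def normalize_email(df):
    """Return a copy with emails trimmed and lowercased so keys match across sources."""
    out = df.copy()
    out["email"] = out["email"].str.strip().str.lower()
    return out


# An outer join keeps customers that appear in only one system;
# indicator=True adds a _merge column recording where each row came from.
master = normalize_email(crm).merge(
    normalize_email(billing), on="email", how="outer", indicator=True
)
print(master)
```

Even in this toy case, a stray space and inconsistent capitalization would have split one customer into two records. Multiply that by dozens of sources and it becomes clear where the weeks go.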
Simon says: Help the data scientists help you.
We often forget the basics: Organizations exist to make decisions. Period. As I write in Analytics: The Agile Way, the practice of analytics merely represents a tool to make better ones. That is, analytics allows organizations to make better and more accurate decisions. Data – and data scientists, for that matter – are a means to an end. Foolish is the soul who believes that data scientists work in isolation. Why not make it easy for them to succeed?
Feedback
What say you?