I read an interesting article recently that suggested analyst and data scientist job positions may be on the way out. The author argued that analytics are being incorporated more and more heavily into operational systems, making “analytic capabilities” more readily accessible to business users without the involvement of a data scientist.
As a data scientist and a manager of an analytics team, I have to admit this suggestion gave me pause.
It is true that vendors of operational systems, in an effort to grow their business and stay competitive, continue to add built-in analytics to their solutions. Honestly, it’s been a while since I’ve come across an operational system that doesn’t offer some form of data visualization or dashboard capabilities.
For example, Workday announced in late 2014 that it would be introducing analytics tools into its ERP applications for HR and Finance, which will provide business users with built-in predictive analytics models. SAS has also developed a product called Rapid Predictive Modeler® that allows non-data scientists to quickly build analytics models.
What does this growth in automated analytics technology mean for today’s statisticians, analysts and data scientists? At first glance, it can feel intimidating; however, these advances can actually be a good thing for the data science field.
Holistically, one of the missions of an analytics team is to foster and develop a culture of analytics within the organization. With all the focus on analytics by operational vendors, executing on that mission is getting easier every day!
So, where does the data scientist fit in? Will you still have a need for data scientists? Yes, although over time the data scientist's job will continue to evolve.
Depending on the level of analytics used within a company, I've noticed business stakeholders will often request help from data scientists for data exploration (i.e. data visualization) as well as advanced analytics. With the increase in built-in data visualization and analytical model prototypes offered by operational systems, data scientists will likely continue to shift more heavily into advanced analytics.
Why you still need data scientists
If you're looking for more concrete reasons why data scientists will continue to be in high demand, here are three reasons I think companies still need them:
- Enhancing built-in analytics. Built-in analytical models are a great start; they are light years ahead of having nothing at all. However, as we data scientists know, we can always do better. Applying more advanced statistics can help enhance these models and increase their accuracy. I have seen quick, low-touch analytical methods provide decent levels of accuracy (meaning what you predicted came true a decent amount of the time), but really fine-tuning a model for more predictive accuracy requires stronger statistical skills and a deeper interaction with the data.
- Integrating and connecting data from disparate systems. Built-in analytics only work with the data in that operational system. Again, this is a good start and is better than nothing. However, predictive analytics often requires data that is spread throughout a company in various systems and even in various formats (some structured and others unstructured). For example, if we try to predict what causes workers to quit their jobs, looking at their worker data in the HR system is a great place to start. But as we all know, the reason a worker chooses to leave a company may not have anything to do with the data that is stored in their HR record. To truly assess how your company can increase its worker retention rates, a separate analytics engagement would be needed outside the HR system.
- Staying ahead of the competition. The drive to stay competitive will continue to push companies to move beyond the basics offered by these out-of-the-box analytical capabilities. Companies are always looking for ways to stay relevant and to get ahead of the competition. As nearly everyone takes advantage of the standard analytics capabilities in these operational systems, companies will need to look for something more advanced to stay ahead of their competitors.
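To make the second point a bit more concrete, here is a minimal Python sketch of what combining HR data with data from another system might look like. Everything in it is hypothetical: the field names, the commute survey, the 60-minute cutoff and all of the values are made up for illustration and do not come from any real HR product or study. It is a toy, not a retention model.

```python
# Hypothetical HR-system records (invented values, for illustration only).
hr_records = [
    {"worker_id": 1, "tenure_years": 2, "quit": True},
    {"worker_id": 2, "tenure_years": 6, "quit": False},
    {"worker_id": 3, "tenure_years": 4, "quit": True},
    {"worker_id": 4, "tenure_years": 3, "quit": False},
    {"worker_id": 5, "tenure_years": 1, "quit": False},
]

# Hypothetical data held in a separate system (e.g. a commute survey),
# keyed by worker_id. Worker 5 never answered the survey.
commute_minutes_by_worker = {1: 75, 2: 20, 3: 80, 4: 15}


def join_records(hr_rows, extra_by_id, field_name):
    """Left-join: copy each HR row and attach one field from the second
    system, using None when that system has no entry for the worker."""
    return [
        {**row, field_name: extra_by_id.get(row["worker_id"])}
        for row in hr_rows
    ]


def quit_rate(rows, predicate):
    """Share of workers who quit among the rows matching predicate,
    or None when no row matches."""
    selected = [row for row in rows if predicate(row)]
    if not selected:
        return None
    return sum(row["quit"] for row in selected) / len(selected)


def has_long_commute(row):
    minutes = row["commute_minutes"]
    return minutes is not None and minutes >= 60  # assumed cutoff


def has_short_commute(row):
    minutes = row["commute_minutes"]
    return minutes is not None and minutes < 60


combined = join_records(hr_records, commute_minutes_by_worker, "commute_minutes")
print("quit rate, long commute: ", quit_rate(combined, has_long_commute))
print("quit rate, short commute:", quit_rate(combined, has_short_commute))
```

The comparison at the end is only possible after the join, because the HR records alone contain no commute information. Note that the worker missing from the second system is kept with a None value rather than silently dropped, which is exactly the kind of decision a built-in tool cannot make for you.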
Overall, as data scientists and advocates for analytics, we should embrace and be thankful for the continued focus on analytics within the industry and also know that our jobs are here to stay!
Learn more about data scientists – and read profiles of other data scientists – in our new data scientist series.
Jennifer, I completely agree with you! Sadly, I hear the same argument as well, mainly from people who want to sell data mining tools. I would argue great tools just make Data Scientists all the more valuable instead of displacing them.
Truth is, analytics is an intellectually challenging knowledge discovery process. To solve real-world problems, we need to have domain expertise, frame the problem correctly, gather relevant data, use the right approach, do the data mining via powerful tools, interpret the answers, and then translate the numerical outcome into domain insights and actions. Running data through algorithms, while important, is just part of the whole process; real-world problem solving is so much more than "just crunching numbers".
Tools cannot decide what problems to solve or how to solve them. They have no domain knowledge, don't know what data is relevant, can't tell bad data from good data, can't discern whether the output is insightful, and don't know what to do to improve results. I find it quite incredible that people would suggest these algorithms can displace people when they should be saying the exact opposite: that they make people more powerful. They are portraying the wrong value proposition. Yes, in the most controlled environment, on problems previously well defined by expert Data Scientists, automated discovery can work in that sandbox; it is better than nothing and will likely deliver decent results, but it is nothing compared to experts at work. Instead, I see more powerful algorithms allowing me to test more problem formulations rapidly and to analyze deeper, wider, and faster; they make me more capable and more powerful. That is priceless.
The other bad message I hear is that analytics tools can simply be handed to people without training - the "let your lowest paid employee do analytics during coffee break" approach, because, you know, analytics is so easy, just click a few buttons and get your answers! People without training don't know about Simpson's Paradox; they don't know what the tools are signalling to them; they don't know what missing values are or what to do about them; they don't know the difference between correlation and causality. The list goes on and on. To these people, I say: be careful of a monkey with blades; it's going to hurt a lot. Analytics should not be carried out by people without the proper education or training.
World class problem solving requires the synergy of brilliant minds assisted by a vast array of great computational tools, clean and relevant data, and a solid analytical process, all shaped by domain knowledge and experience. Nothing less will do. To suggest anything less is just misguided and disingenuous.
Just thoughts from a still-practising Data Scientist of four decades.
Hey Daymond -
Thanks so much for your comments and response to this blog entry. It's always great to hear about the experiences of other data scientists! Your points about domain expertise are so spot on. I was talking with some of the students of the Institute for Advanced Analytics at NC State recently, and one of the points that was emphasized was the importance of developing that business domain expertise. I'd also like to echo your comment regarding the value of training users without an analytics background before they use an analytics tool. That training, in my experience, helps set expectations, eliminates common misconceptions, and sets them on a path to success.
Thanks again for your comment!