Around 1900, Gottlieb Daimler is said to have predicted: “The global demand for motor vehicles will not exceed one million, simply because of the lack of available chauffeurs”. Today, most people drive themselves and self-driving cars are rapidly becoming a reality. Are we likely to see the same situation for data scientists, with more and more self-service analytics and machine learning? In making any predictions, it is always a good idea to ask the “chauffeur question”.
Daimler could hardly have imagined in his wildest dreams that there would be more than one billion vehicles registered worldwide. It never occurred to him that we would overcome the technical issue of needing a chauffeur to act as driver, mechanic and navigator. He recognized that the number of chauffeurs was the limiting factor, but did not consider that there might be more than one way to overcome that problem.
Data scientist as limiting factor
More than 100 years later, we may be seeing Chauffeur 2.0: the data scientist. According to the Harvard Business Review, data scientist is now the “Sexiest Job of the 21st Century”. Because they are in short supply, data scientists are both expensive and in demand. Small and medium-sized enterprises, often in less attractive locations outside the big cities, are finding it hard to recruit good staff, because the stars of data analysis want to work for Google rather than Müller GmbH. Does this mean that data science and advanced analytics are only for big players and not for small and medium-sized companies? Not any more, with the rise of approachable analytics.
Approachable analytics: starting to drive yourself
Approachable analytics is the idea of using analytical methods in a simple, visual way, without programming skills. Users are given easy-to-use solutions for classic business intelligence problems (basic counting and weighing options) as well as for more sophisticated analysis (for example, regressions, decision trees, and cluster analyses), which lets them enter the data science arena.
An approachable analytics tool should have:
Analysis and reporting functions within one interface
It should be possible to interactively investigate data, derive new variables and measures, identify outliers, apply analytical methods, and create predictive models.
Ability to share information
All findings should be sharable with others, on any device, including mobile.
Data management
Users must be able to access data independently, from several different sources, and link those sources to carry out calculations. This means that the tool should connect to both internal systems (such as ERP and CRM) and external sources (such as the web, Twitter, and Google Analytics). The best tools include modern data wrangling functions.
Scalable infrastructure
The environment should be able to grow as demand increases, without generating large expenses. It would also be practical to have various cloud variants available, such as private cloud. It also needs suitable security and user administration.
This sounds like a lot to ask. But if you want to drive yourself, you need a certain level of comfort and do not want to stop every 500 meters because something in the engine has broken.
My forecast
Will data scientists be replaced, like chauffeurs? Looking at the current technical possibilities, the answer is a clear “maybe”. The available tools still differ widely in what they can do: not all analytics tools are created equal, and many rely on open source tools like R and Python. Analytical methods and predictive models require accurate interpretation and high-quality data, or modeling errors will creep in. But this new generation of analytical tools is an ideal entry point for more (big) data analytics. Savvy professionals will simply go one step further. Of course you need some training, but access to analytics is now within reach. You just have to want to start. The chauffeur question, whether a limiting factor is really fixed or not, simply adds another dimension.
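For readers curious about what an approachable analytics tool does behind its visual interface, here is a minimal Python sketch of three items from the checklist above: linking two sources, deriving a new measure, and identifying outliers. Everything in it is an assumption made for illustration. The sample rows, the field names, and the function names are hypothetical, and the outlier rule (a modified z-score based on the median, with the commonly used cutoff of 3.5) is just one reasonable choice, not something a particular product is known to use.

```python
from statistics import median

# Hypothetical sample data standing in for an ERP export and a CRM export.
erp_orders = [
    {"customer_id": 1, "revenue": 1200.0, "orders": 4},
    {"customer_id": 2, "revenue": 900.0, "orders": 3},
    {"customer_id": 3, "revenue": 15000.0, "orders": 5},
    {"customer_id": 4, "revenue": 1100.0, "orders": 4},
    {"customer_id": 5, "revenue": 1000.0, "orders": 4},
]
crm_customers = [
    {"customer_id": 1, "segment": "retail"},
    {"customer_id": 2, "segment": "retail"},
    {"customer_id": 3, "segment": "wholesale"},
    {"customer_id": 4, "segment": "retail"},
    {"customer_id": 5, "segment": "wholesale"},
]


def link_sources(orders, customers):
    """Link the two sources on customer_id; rows without a match are dropped."""
    by_id = {c["customer_id"]: c for c in customers}
    return [
        {**order, **by_id[order["customer_id"]]}
        for order in orders
        if order["customer_id"] in by_id
    ]


def add_revenue_per_order(rows):
    """Derive a new measure: average revenue per order."""
    return [{**r, "revenue_per_order": r["revenue"] / r["orders"]} for r in rows]


def flag_outliers(rows, key, threshold=3.5):
    """Add an is_outlier flag based on the modified z-score of rows[key]."""
    values = [r[key] for r in rows]
    med = median(values)
    # Median absolute deviation: a spread measure that outliers barely distort.
    mad = median([abs(v - med) for v in values])
    flagged = []
    for r in rows:
        score = 0.0 if mad == 0 else 0.6745 * abs(r[key] - med) / mad
        flagged.append({**r, "is_outlier": score > threshold})
    return flagged


linked = link_sources(erp_orders, crm_customers)
report = flag_outliers(add_revenue_per_order(linked), "revenue_per_order")

for row in report:
    print(row["customer_id"], row["segment"], row["revenue_per_order"], row["is_outlier"])
```

In a self-service tool, each of these functions would typically be a drag-and-drop step rather than code. The interpretation is still up to the user: a flagged customer may be a data entry error or simply the most valuable account.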