What can you learn from a chief data scientist who's worked in analytics for 25 years and has been involved in the development of many key SAS solutions, including SAS Enterprise Miner? As a veteran of the analytics industry, Wayne Thompson has witnessed the evolution of machine learning and now deep learning into artificial intelligence.
In a recent interview, Thompson touched on a wide variety of topics and dished out practical advice. You can hear his interview on The AI Podcast episode “The Long View on Big Data,” hosted by Noah Kravitz.
Tips for machine learning beginners
In the podcast, Thompson offers advice for companies that are trying to get started with machine learning and deep learning. Consider these three tips:
- Start with a good fundamental business problem, one for which there is data, and lots of it! How much data do you need? As Thompson puts it, “You need a bunch of data to do deep learning, you need even more data to do unsupervised learning and when you get to reinforcement learning, you better be working with major, major firms that have lots of transactional history about their clients.”
- Focus on supervised learning, in which the data is labeled. Supervised learning includes widely used machine learning algorithms such as decision trees and logistic regression. Starting here lets you demonstrate the value of your projects quickly.
- Get the technology stack right. Invest in tools that can grow with you. Once you start, you’re going to get more data and more users. Whatever platform you choose, it should be able to accommodate open source, extend as you grow and tie into operations.
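To make the second tip concrete, here is a minimal sketch of supervised learning: a tiny logistic regression trained on labeled examples, written in plain Python. It is not from the podcast. The churn scenario, the feature values, and the function names (`sigmoid`, `predict_proba`, `train`) are all made up for illustration, and a real project would more likely use an established analytics platform or library than hand-written code.

```python
import math

# Hypothetical labeled data (invented for illustration):
# each row is (monthly_visits, support_tickets); the label is 1 if the
# customer churned and 0 if the customer stayed.
features = [(1.0, 5.0), (2.0, 4.0), (1.5, 6.0), (8.0, 1.0), (9.0, 0.0), (7.5, 2.0)]
labels = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Estimated probability that example x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(features, labels, lr=0.1, epochs=500):
    """Fit logistic regression with simple stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            # The label y is what makes this "supervised": the error is
            # the gap between the prediction and the known answer.
            error = predict_proba(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


weights, bias = train(features, labels)
predictions = [1 if predict_proba(weights, bias, x) >= 0.5 else 0 for x in features]
print(predictions)
```

Because every training row carries a known outcome, you can compare predictions against labels directly, which is a large part of why supervised projects can show value quickly.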
Tips for data scientists
For those of you who are entering data science as a profession, Thompson recommends that you get out that Raspberry Pi and start building data products. You also have to learn the analytics. These two must be tied together.
There’s also a shout-out to Aristotle. Listen to the podcast and you’ll hear why Aristotle is mentioned. Here’s a hint: long live empiricism!
You can also check out Thompson’s blog post about big data, big models and big computation, which is referred to in this podcast, or you can check out his popular series: 10 machine learning best practices.