What’s the most important component of analytic analysis? The data? The model? The deployment? Getting the business problem right? All the above? Or does it simply depend on who you ask? While the model gets all the attention, and the data requires most of the effort, there is that step
We have updated our software for improved interpretability since this post was written. For the latest on this topic, read our new series on model-agnostic interpretability. While some machine learning models – like decision trees – are transparent, the majority of models used today – like deep neural networks, random forests, gradient boosting
There are four widely recognized styles of machine learning: supervised, unsupervised, semi-supervised and reinforcement learning. These styles have been discussed in great depth in the literature and are included in most introductory lectures on machine learning algorithms. As a recap, the table below summarizes these styles. For a comprehensive mapping
Deep learning has taken off because organizations of all sizes are capturing a greater variety of data and can mine bigger data, including unstructured data. It’s not just large companies like Amazon, SAS and Google that have access to big data. It’s everywhere. Deep learning needs big data, and now
In machine learning, a feature is another word for an attribute or input, or an independent variable. What is feature engineering? Feature engineering is a process of preparing inputs for machine learning models. The goal of feature engineering is to improve classification accuracy by considering the limitations of the
Several weeks ago, I wrote about practical advice from a Chief Data Scientist in my blog “From Aristotle to Pi: Practical advice from a chief data scientist.” Now I want to offer my advice as a newbie trying to navigate through machine learning concepts and how to code them. Over
Did you know that SAS has two on-site solar farms? At a combined 2.3 MW in capacity, SAS’ solar farms are located on 12 acres at world headquarters in Cary, NC. The photovoltaic (PV) solar arrays generate 3.8 million kilowatt-hours of clean, renewable energy each year, reducing carbon dioxide emissions
We have updated our software for improved interpretability since this post was written. For the latest on this topic, read our new series on model-agnostic interpretability. Assessing a model’s accuracy usually is not enough for a data scientist who wants to know more about how a model is working. Often
Sequence models, especially recurrent neural networks (RNNs) and similar variants, have gained tremendous popularity over the last few years because of their unparalleled ability to handle unstructured sequential data. The reason these models are called “recurrent” is that they work with data that occurs in a sequence, such as text
What can you learn from a chief data scientist who's worked in analytics for 25 years and has been involved in the development of many key SAS solutions, including SAS Enterprise Miner? As a veteran of the analytics industry, Wayne Thompson has witnessed the evolution of machine learning and