Monthly Archives: July, 2017

Wayne Thompson, July 25, 2017

Machine learning best practices: combining lots of models

This is the third post in my series of machine learning techniques and best practices. If you missed the earlier posts, read the first one now, or review the whole machine learning best practices series. Data scientists commonly use machine learning algorithms, such as gradient boosting and decision forests, that automatically build

Advanced Analytics | Machine Learning

Figure: two charts showing undersampling and oversampling.

Wayne Thompson, July 19, 2017

Machine learning best practices: detecting rare events

This is the second post in my series of machine learning best practices. If you missed it, read the first post, Machine learning best practices: the basics. As we go along, all ten tips will be archived at this machine learning best practices page. Machine learning commonly requires the use of

Advanced Analytics | Machine Learning

Wayne Thompson, July 12, 2017

Machine learning best practices: the basics

I started my training in machine learning at the University of Tennessee in the late 1980s. Of course, we didn’t call it machine learning then, and we didn’t call ourselves data scientists yet either. We used terms like statistics, analytics, data mining and data modeling. Regardless of what you call

Advanced Analytics

Hui Li, July 6, 2017

How to use regularization to prevent model overfitting

When building models, data scientists and statisticians often talk about penalty, regularization and shrinkage. What do these terms mean and why are they important? According to Wikipedia, regularization "refers to a process of introducing additional information in order to solve an ill-posed problem or to prevent overfitting. This information usually
