This is the fifth post in my series of machine learning best practices. Hyperparameters are the algorithm options one "turns and tunes" when building a learning model. Because the algorithm cannot learn them itself, they must be assigned before the model is trained. A lot of manual
This is the fourth post in my series of 10 machine learning best practices. It’s common to build models on historical training data and then apply the model to new data to make decisions. This process is called model deployment or scoring. I often hear data scientists say, “It took
This is the third post in my series of machine learning techniques and best practices. If you missed the earlier posts, read the first one now, or review the whole machine learning best practices series. Data scientists commonly use machine learning algorithms, such as gradient boosting and decision forests, that automatically build
This is the second post in my series of machine learning best practices. If you missed it, read the first post, Machine learning best practices: the basics. As we go along, all ten tips will be archived at this machine learning best practices page. Machine learning commonly requires the use of
I started my training in machine learning at the University of Tennessee in the late 1980s. Of course, we didn’t call it machine learning then, and we didn’t call ourselves data scientists yet either. We used terms like statistics, analytics, data mining and data modeling. Regardless of what you call
When building models, data scientists and statisticians often talk about penalty, regularization and shrinkage. What do these terms mean and why are they important? According to Wikipedia, regularization "refers to a process of introducing additional information in order to solve an ill-posed problem or to prevent overfitting. This information usually
Ensemble methods are commonly used to boost predictive accuracy by combining the predictions of multiple machine learning models. The traditional wisdom has been to combine so-called “weak” learners. However, a more modern approach is to create an ensemble of a well-chosen collection of strong yet diverse models. Building powerful ensemble models
A previous post, Spatial econometric modeling using PROC SPATIALREG, introduced the SAS/ETS® SPATIALREG procedure and demonstrated its usage to fit both linear and SAR models by using 2013 county-level home value data in North Carolina. In most analyses for spatial econometrics, you rarely know the true model from which your data
I am often asked to describe my career as a woman in analytics and provide some insights to guide women who wish to be part of this field and to succeed as leaders in the profession. I have divided my comments on women in analytics into sections, starting from the beginning,
I recently met Mrs. Claus at the INFORMS Annual Meeting, where we got to talking about the social network analysis session she’d just attended. It turns out Mrs. Claus and I are both fans of a book by Alex Pentland, Social Physics: How Social Networks Can Make Us Smarter. Apparently