The DO Loop
Statistical programming in SAS with an emphasis on SAS/IML programs
3 ways to obtain the Hessian at the MLE solution for a regression model
When you use maximum likelihood estimation (MLE) to find the parameter estimates in a generalized linear regression model, the Hessian matrix at the optimal solution is very important. The Hessian matrix indicates the local shape of the log-likelihood surface near the optimal value. You can use the Hessian to estimate
4 reasons to use PROC PLM for linear regression models in SAS
Have you ever run a regression model in SAS but later realized that you forgot to specify an important option or run some statistical test? Or maybe you intended to generate a graph that visualizes the model, but you forgot? Years ago, your only option was to modify your program
Feature generation and correlations among features in machine learning
Feature generation (also known as feature creation) is the process of creating new features to use for training machine learning models. This article focuses on regression models. The new features (which statisticians call variables) are typically nonlinear transformations of existing variables or combinations of two or more existing variables. This
Model selection with PROC GLMSELECT
I previously discussed how you can use validation data to choose between a set of competing regression models. In that article, I manually evaluated seven models for a continuous response on the training data and manually chose the model that gave the best predictions for the validation data. Fortunately, SAS
Model assessment and selection in machine learning
Machine learning differs from classical statistics in the way it assesses and compares competing models. In classical statistics, you use all the data to fit each model. You choose between models by using a statistic (such as AIC, AICC, SBC, ...) that measures both the goodness of fit and the
Simulate data for a regression model with categorical and continuous variables
This article shows how to use SAS to simulate data that fits a linear regression model that has categorical regressors (also called explanatory or CLASS variables). Simulating data is a useful skill for both researchers and statistical programmers. You can use simulation for answering research questions, but you can also