The DO Loop
Statistical programming in SAS with an emphasis on SAS/IML programs

![Visualize race times in SAS](https://blogs.sas.com/content/iml/files/2019/06/runners1-640x336.png)
Math and statistics are everywhere, and I always rejoice when I spot a rather sophisticated statistical idea "in the wild." For example, I am always pleased when I see a graph that shows the distribution of race times in a typical race (such as a 5K), as shown to the
![Jump-start PROC LOGISTIC by using parameter estimates from PROC HPLOGISTIC](https://blogs.sas.com/content/iml/files/2019/06/hplogistic1-377x336.png)
SAS/STAT software contains a number of so-called HP procedures for training and evaluating predictive models. ("HP" stands for "high performance.") A popular HP procedure is HPLOGISTIC, which enables you to fit logistic models on Big Data. A goal of the HP procedures is to fit models quickly. Inferential statistics such
![Add loess smoothers to residual plots](https://blogs.sas.com/content/iml/files/2019/06/residsmooth1-640x336.png)
When fitting a least squares regression model to data, it is often useful to create diagnostic plots of the residuals versus the explanatory variables. If the model fits the data well, the plots of the residuals should not display any patterns. Systematic patterns can indicate that you need to include
![Influential observations in a linear regression model: The DFFITS and Cook's D statistics](https://blogs.sas.com/content/iml/files/2019/06/influencecooksd1-640x336.png)
A previous article describes the DFBETAS statistics for detecting influential observations, where "influential" means that if you delete the observation and refit the model, the estimates for the regression coefficients change substantially. Of course, there are other statistics that you could use to measure influence. Two popular ones are the
![Influential observations in a linear regression model: The DFBETAS statistics](https://blogs.sas.com/content/iml/files/2019/06/influencedfbetas3-640x336.png)
My article about deletion diagnostics investigated how influential an observation is to a least squares regression model. In other words, if you delete the i-th observation and refit the model, what happens to the statistics for the model? SAS regression procedures provide many tables and graphs that enable you to
![Leave-one-out statistics and a formula to update a matrix inverse](https://blogs.sas.com/content/iml/files/2019/06/ShermanMorrison2.png)
For linear regression models, there is a class of statistics that I call deletion diagnostics or leave-one-out statistics. These observation-wise statistics address the question, "If I delete the i-th observation and refit the model, what happens to the statistics for the model?" For example: The PRESS statistic is similar to