The DO Loop

Analytics | Data Visualization | Learn SAS

Rick Wicklin | June 19, 2019 | 1

Influential observations in a linear regression model: The DFFITS and Cook's D statistics

A previous article describes the DFBETAS statistics for detecting influential observations, where "influential" means that if you delete the observation and refit the model, the estimates for the regression coefficients change substantially. Of course, there are other statistics that you could use to measure influence. Two popular ones are the

Analytics | Data Visualization | Learn SAS

Rick Wicklin | June 17, 2019 | 11

Influential observations in a linear regression model: The DFBETAS statistics

My article about deletion diagnostics investigated how influential an observation is to a least squares regression model. In other words, if you delete the i-th observation and refit the model, what happens to the statistics for the model? SAS regression procedures provide many tables and graphs that enable you to

Advanced Analytics | Programming Tips

Rick Wicklin | June 12, 2019 | 5

Leave-one-out statistics and a formula to update a matrix inverse

For linear regression models, there is a class of statistics that I call deletion diagnostics or leave-one-out statistics. These observation-wise statistics address the question, "If I delete the i-th observation and refit the model, what happens to the statistics for the model?" For example: The PRESS statistic is similar to

Learn SAS | Programming Tips

Rick Wicklin | June 10, 2019 | 16

5 reasons to use PROC FORMAT to recode variables in SAS

Recoding variables can be tedious, but it is often a necessary part of data analysis. Almost every SAS programmer has written a DATA step that uses IF-THEN/ELSE logic or the SELECT-WHEN statements to recode variables. Although creating a new variable is effective, it is also inefficient because you have to

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin | June 5, 2019 | 2

Plot a family of curves in SAS

A family of curves is generated by an equation that has one or more parameters. To visualize the family, you might want to display a graph that overlays four or five curves that have different parameter values, as shown to the right. The graph shows members of a family of

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin | June 3, 2019 | 4

Graph wide data and long data in SAS

Statistical programmers and analysts often use two kinds of rectangular data sets, popularly known as wide data and long data. Some analytical procedures require that the data be in wide form; others require long form. (The "long format" is sometimes called "narrow" or "tall" data.) Fortunately, the statistical graphics procedures

