## Tag: linear regression

0
The likelihood ratio test for linear regression in SAS

A recent article describes how to estimate coefficients in a simple linear regression model by using maximum likelihood estimation (MLE). One of the nice properties of an MLE formulation is that you can compare a large model with a nested submodel in a natural way. For example, if you can

0
Maximum likelihood estimates for linear regression

A statistical analyst used the GENMOD procedure in SAS to fit a linear regression model. He noticed that the table of parameter estimates has an extra row (labeled "Scale") that is not a regression coefficient. The "scale parameter" is not part of the parameter estimates table produced by PROC REG

0
Partial leverage plots

For a linear regression model, a useful but underutilized diagnostic tool is the partial regression leverage plot. Also called the partial regression plot, this plot visualizes the parameter estimates table for the regression. For each effect in the model, you can visualize the following statistics: The estimate for each regression

0
Weights for residuals in robust regression

An early method for robust regression was iteratively reweighted least-squares regression (Huber, 1964). This is an iterative procedure in which each observation is assigned a weight. Initially, all weights are 1. The method fits a least-squares model to the weighted data and uses the size of the residuals to determine

0
Compare computational methods for least squares regression

In a previous article, I discussed various ways to solve a least-square linear regression model. I discussed the SWEEP operator (used by many SAS regression routines), the LU-based methods (SOLVE and INV in SAS/IML), and the QR decomposition (CALL QR in SAS/IML). Each method computes the estimates for the regression

0
The QR algorithm for least-squares regression

In computational statistics, there are often several ways to solve the same problem. For example, there are many ways to solve for the least-squares solution of a linear regression model. A SAS programmer recently mentioned that some open-source software uses the QR algorithm to solve least-squares regression problems and asked

0
Identify influential observations in regression models

A previous article discusses how to interpret regression diagnostic plots that are produced by SAS regression procedures such as PROC REG. In that article, two of the plots indicate influential observations and outliers. Intuitively, an observation is influential if its presence changes the parameter estimates for the regression by "more

0
An overview of regression diagnostic plots in SAS

When you fit a regression model, it is useful to check diagnostic plots to assess the quality of the fit. SAS, like most statistical software, makes it easy to generate regression diagnostics plots. Most SAS regression procedures support the PLOTS= option, which you can use to generate a panel of

0
Generate all quadratic interactions in a regression model

I've previously written about how to generate all pairwise interactions for a regression model in SAS. For a model that contains continuous effects, the easiest way is to use the EFFECT statement in PROC GLMSELECT to generate second-degree "polynomial effects." However, a SAS programmer was running a simulation study and

0
Decile plots in SAS

I previously showed how to create a decile calibration plot for a logistic regression model in SAS. A decile calibration plot (or "decile plot," for short) is used in some fields to visualize agreement between the data and a regression model. It can be used to diagnose an incorrectly specified

0
Visualize uncertainty in regression predictions

You've probably seen many graphs that are similar to the one at the right. This plot shows a regression line overlaid on a scatter plot of some data. Given a value for the independent variable (x), the regression line gives the best prediction for the mean of the response variable

0
Regression with inequality constraints on parameters

A previous article discussed how to solve regression problems in which the parameters are constrained to be a specified constant (such as B1 = 1) or are restricted to obey a linear equation such as B4 = –2*B2. In SAS, you can use the RESTRICT statement in PROC REG to

0
Restricted least squares regression in SAS

A data analyst recently asked a question about restricted least square regression in SAS. Recall that a restricted regression puts linear constraints on the coefficients in the model. Examples include forcing a coefficient to be 1 or forcing two coefficients to equal each other. Each of these problems can be

0
Visualize collinearity diagnostics

A previous article shows how to interpret the collinearity diagnostics that are produced by PROC REG in SAS. The process involves scanning down numbers in a table in order to find extreme values. This can be a tedious and error-prone process. Friendly and Kwan (2009) compare this task to a

0
Visualize residual projections for linear regression

A SAS programmer wanted to create a graph that illustrates how Deming regression differs from ordinary least squares regression. The main idea is shown in the panel of graphs below. The first graph shows the geometry of least squares regression when we regress Y onto X. ("Regress Y onto X"

0
Collinearity diagnostics: Should the data be centered?

In a previous article, I showed how to perform collinearity diagnostics in SAS by using the COLLIN option in the MODEL statement in PROC REG. For models that contain an intercept term, I noted that there has been considerable debate about whether the data vectors should be mean-centered prior to

0
Collinearity in regression: The COLLIN option in PROC REG

I was recently asked about how to interpret the output from the COLLIN (or COLLINOINT) option on the MODEL statement in PROC REG in SAS. The example in the documentation for PROC REG is correct but is somewhat terse regarding how to use the output to diagnose collinearity and how

0
Add loess smoothers to residual plots

When fitting a least squares regression model to data, it is often useful to create diagnostic plots of the residuals versus the explanatory variables. If the model fits the data well, the plots of the residuals should not display any patterns. Systematic patterns can indicate that you need to include

0
Leave-one-out statistics and a formula to update a matrix inverse

For linear regression models, there is a class of statistics that I call deletion diagnostics or leave-one-out statistics. These observation-wise statistics address the question, "If I delete the i_th observation and refit the model, what happens to the statistics for the model?" For example: The PRESS statistic is similar to

0
The Theil-Sen robust estimator for simple linear regression

Modern statistical software provides many options for computing robust statistics. For example, SAS can compute robust univariate statistics by using PROC UNIVARIATE, robust linear regression by using PROC ROBUSTREG, and robust multivariate statistics such as robust principal component analysis. Much of the research on robust regression was conducted in the

0
4 reasons to use PROC PLM for linear regression models in SAS

Have you ever run a regression model in SAS but later realize that you forgot to specify an important option or run some statistical test? Or maybe you intended to generate a graph that visualizes the model, but you forgot? Years ago, your only option was to modify your program

0
Deming regression for comparing different measurement methods

Deming regression (also called errors-in-variables regression) is a total regression method that fits a regression line when the measurements of both the explanatory variable (X) and the response variable (Y) are assumed to be subject to normally distributed errors. Recall that in ordinary least squares regression, the explanatory variable (X)