How to use PROC SGPLOT to display the slope and intercept of a regression line


A SAS user asked an interesting question on the SAS/GRAPH and ODS Graphics Support Forum. The question is: Does PROC SGPLOT support a way to display the slope of the regression line that is computed by the REG statement? Recall that the REG statement in PROC SGPLOT fits and displays a line through points in a scatter plot.

In SAS 9.3, you cannot obtain this information directly from PROC SGPLOT. Instead, you need to use PROC REG to compute this information. You can use the following steps to create a plot that displays the parameter estimates:

  1. Use PROC REG to compute the parameter estimates (slope and intercept). Save this information to a SAS data set.
  2. Use a DATA step to create macro variables that contain the parameter estimates.
  3. Use the INSET statement in PROC SGPLOT to add this information to the fitted scatter plot.

Step 1: Save the parameter estimates

You can use the OUTEST= option or the ODS OUTPUT statement to save the parameter estimates to a SAS data set. In the following example, the ODS OUTPUT statement saves the ParameterEstimates table to the PE data set:

ods graphics off;
proc reg data=sashelp.class;
   model weight = height;
   ods output ParameterEstimates=PE;
run;
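
For comparison, the OUTEST= alternative mentioned above might look like the following minimal sketch. The data set name EST is an arbitrary choice for this illustration. Unlike the ParameterEstimates table, the OUTEST= data set stores the estimates in a single row: the Intercept variable holds the intercept, and the variable named after the regressor (here, Height) holds the slope.

```sas
/* Sketch: save the estimates with OUTEST= instead of ODS OUTPUT.
   EST is an arbitrary data set name chosen for this example. */
proc reg data=sashelp.class outest=EST noprint;
   model weight = height;
run;
quit;

/* One row: the intercept and the slope for Height */
proc print data=EST noobs;
   var Intercept Height;
run;
```

The remaining steps in this article use the PE data set that is created by the ODS OUTPUT statement.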

Step 2: Create macro variables

In the PE data set, the ESTIMATE variable contains the parameter estimates. The first row contains the estimate for the intercept term; the second row contains the estimate for the slope. The following DATA step saves these into macro variables:

data _null_;
   set PE;
   /* row 1 = intercept, row 2 = slope; store each as a macro variable */
   if _n_ = 1 then call symput('Int', put(estimate, BEST6.));
   else            call symput('Slope', put(estimate, BEST6.));
run;
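
If you prefer not to rely on the row order, a variant of this step can test the Variable column of the ParameterEstimates table instead. The following is a sketch, not the only way to do it. It assumes that the PE data set from Step 1 exists and that the regressor is named Height. It uses CALL SYMPUTX, which removes leading and trailing blanks from the stored value, and it writes the resulting macro variables to the log so that you can check them.

```sas
/* Sketch: key on the Variable column instead of the row number.
   Assumes the PE data set created by ODS OUTPUT in Step 1. */
data _null_;
   set PE;
   if upcase(Variable) = "INTERCEPT" then
      call symputx('Int', put(estimate, BEST6.));
   else if upcase(Variable) = "HEIGHT" then
      call symputx('Slope', put(estimate, BEST6.));
run;

/* Write the values to the log to confirm that both macro variables exist */
%put NOTE: Int=&Int Slope=&Slope;
```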

Step 3: Use the INSET Statement to display the parameter estimates

You can now create the plot by using PROC SGPLOT. Use the INSET statement to display the parameter estimates in a text box:

proc sgplot data=sashelp.class noautolegend;
   title "Regression Line with Slope and Intercept";
   reg y=weight x=height;
   inset "Intercept = &Int" "Slope = &Slope" / 
         border title="Parameter Estimates" position=topleft;
run;

Of course, you can use a similar strategy to display any other relevant statistics on the scatter plot. There is an example in the SAS/STAT User's Guide that shows other fit statistics, as well as how to use Greek letters and superscripts in the inset text. You can also display a formula that shows the equation of a displayed line.
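
As one illustration of displaying an equation, the following sketch repeats the three steps and puts the fitted line into a single inset string. The 8.2 and 8.3 formats and the title text are choices made for this example only. If the intercept is negative, the text reads "Weight = -a + b * Height", which you might want to tidy up for presentation.

```sas
/* Sketch: display the fitted equation in the inset.
   The steps are repeated so that the example runs on its own. */
ods graphics off;
proc reg data=sashelp.class;
   model weight = height;
   ods output ParameterEstimates=PE;
run;
quit;

data _null_;
   set PE;
   /* row 1 = intercept, row 2 = slope; formats are illustrative */
   if _n_ = 1 then call symputx('Int', put(estimate, 8.2));
   else            call symputx('Slope', put(estimate, 8.3));
run;

proc sgplot data=sashelp.class noautolegend;
   title "Regression Line with Fitted Equation";
   reg y=weight x=height;
   inset "Weight = &Int + &Slope * Height" / border position=topleft;
run;
```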

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

38 Comments

  1. Richard Montes on

    I use PROC SGPLOT with the REG statement and the GROUP= option to perform an ANCOVA, and an ANCOVA plot is generated. Is there a way to plot additional data series on the ANCOVA plot (just as in a GPLOT overlay)? The additional data series is not part of the regression analysis from which the confidence bands are generated. Thanks in advance.

    • Rick Wicklin

      I think you are saying that you use the SGPLOT procedure and use the REG statement with the GROUP= option to get several regression fits overlaid on the same plot. If that is so, the answer is yes. In the same procedure you can add one or more SERIES statements that display (precomputed) curves. The only "trick" is that the additional series need to use variables in the same data set as the regression data. For example, if the MYDATA data set contains the 500 observations for the regression, you might need to concatenate that data with the precomputed curves in the MYCURVES data set, which might have fewer points, like this:
      data A;
         set MYDATA MYCURVES;   /* concatenate the regression data and the curves */
      run;
      proc sgplot data=A;
         reg x=x y=y / group=group;
         series x=curveX y=curveY;
      run;

      • I posted this question in the SAS community but haven't gotten any response, so I thought I'd try it here:

        The ANCOVAPLOT generated using PROC GLM is very useful. I need to show the intersection of a reference line on the Y axis with the CLM (confidence interval). Is there a way to draw such a reference line on the Y axis? Additionally, can the axis be modified (scale, min-max, etc.)?

        The PROC REG method Rick outlined in the prior thread would take care of generating the reference line. However, if I have a categorical variable that I am grouping by in the model, the confidence interval would be constructed from the individual RMSEs of the per-group regressions. PROC GLM, by contrast, uses the pooled RMSE (depending on the model), so I prefer to use PROC GLM for generating the ANCOVA plot. But from what I read in the SAS literature, it does not offer any option for adding reference lines, scaling, etc.

        Thanks in advance for the help.
        -Richard

      • I am trying to plot just one line that includes a linear regression with its covariates (up to three). I see how you can make multiple lines, but can you make just one of the entire adjusted model?

  2. Richard Montes on

    Rick, can I ask another plotting question? I have a regression model from a DOE study, for example:
    Y = A B A*B
    I need to display a plot of Y as it varies over the explored range of factor A (along with the prediction interval). However, since factor B also influences Y and the graph is limited to two dimensions, I would have to fix the value of B, say at the upper level of the explored range (+1). Is there a way to generate this kind of plot by using either ODS graphics in PROC GLM or PROC SGPLOT? I am not sure whether the REG statement in PROC SGPLOT can handle a model with multiple regressors, possibly with interactions. Thank you very much.

  3. Hi,

    I want to graph rate of change (slope) of x1 but I am working with multiple linear regression and data is weighted so I am using Proc Surveyreg. Is there any way to use this procedure or something similar?

    Regards,
    Russell

  4. Pingback: 13 popular articles from 2013 - The DO Loop

  5. I am not able to run this code. I get a warning at step 2 that variable PE is uninitialized. I have checked my spelling and removed all null values from the dataset. Any suggestions? Thanks.

    • Rick Wicklin

      Here is the program in one block so that you don't need to copy/paste three separate segments:

      ods graphics off;
      proc reg data=sashelp.class;
         model weight = height;
         ods output ParameterEstimates=PE;
      run;
      data _null_;
         set PE;
         if _n_ = 1 then call symput('Int', put(estimate, BEST6.));    
         else            call symput('Slope', put(estimate, BEST6.));  
      run;
      proc sgplot data=sashelp.class noautolegend;
         title "Regression Line with Slope and Intercept";
         reg y=weight x=height;
         inset "Intercept = &Int" "Slope = &Slope" / 
               border title="Parameter Estimates" position=topleft;
      run;

  6. Hi Rick,
    Thanks for the very helpful article. Do you know if SAS 9.4 can display the regression line created by the REG statement in SGPLOT?
    Thank you!

    • Rick Wicklin

      Yes, the REG statement in PROC SGPLOT displays a scatter plot with a regression line. If you just want the line, use the NOMARKERS option. You can even use DEGREE=2 or DEGREE=3 to fit quadratic or cubic polynomial curves.
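
      A minimal sketch of those two options, assuming the Sashelp.Class data that is used in the article:

      ```sas
      /* Sketch: suppress the markers and fit a quadratic curve instead of a line */
      proc sgplot data=sashelp.class noautolegend;
         reg y=weight x=height / nomarkers degree=2;
      run;
      ```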

      • Thanks so much for your reply! I didn't word my question very well -- what I meant to say is, is it possible to see the actual parameter estimates (slope and intercept) that SAS computes and that correspond to regression line in PROC SGPLOT?
        Thanks again!

  7. Hi Rick,

    How can I plot multiple regression lines? What I mean is, I ran several PROC REG steps and saved the intercept and slope from each. Then I merged them into a single data set. I want to plot, let's say, 3 lines in one scatter plot. Is it possible?

  8. Hi Rick,
    Thanks for the informative article. How can we plot regression line during the linear phase of the concentration vs. time data to estimate the terminal disposition rate constant (lambda) using best fit method?

  9. Rick,

    Thank you for the demo! This is so helpful. I needed to quickly test correlation between a bunch of variables yesterday. So I saved the statistics from PROC CORR and used the SGPLOT code you provided to display the correlation coefficient and p-value on the scatter plot. It worked like a charm!

  10. Is there a way to get the slope coefficient for each group separately?
    For example I have the current code:

    PROC sGPLOT DATA = &data (where =( &group ne .)) noautolegend dattrmap = attrmap ;
    reg x = &x y = &y /group = &group clm clmtransparency = 0.4 attrid = &group;
    title "&title";
    inset "Intercept = &Int" "Slope = &Slope" /
    border title="Parameter Estimates" position=topleft;
    run;

    I used the code you mentioned above, this only generates the intercept and slope for the first group listed. Is there any way to generate the slope for each group, (in my case there are 3 groups)?

  11. I was trying to use your code to label the regression lines but I keep on getting the error that &Int and &Slope cannot be resolved.

    %DO I=1 %TO 2;
    PROC SGPLOT DATA=MODDATA.ZIP_WO_&STRT_MTH._&END_MTH.;
    SCATTER X = %SCAN(&VAR1,&I) Y = S_POST / MARKERATTRS=(COLOR=ORANGE) TRANSPARENCY=0.5;
    REG X = %SCAN(&VAR1,&I) Y = S_POST / NOMARKERS;
    TITLE 'S_POST vs '%SCAN(&VAR1,&I);
    inset "Intercept = &INT" "Slope = &Slope" / border title="Parameter Estimates" position=topleft;
    RUN;
    %END;

    • Rick Wicklin

      It sounds like you didn't define the macro variables. They are defined by the CALL SYMPUT routine in the preceding DATA step. For your example, you might want to define Int1, Int2, Slope1, and Slope2. If you have questions, post to communities.sas.com.
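
      One way to define numbered macro variables inside a loop might look like the following sketch. It uses the Sashelp.Class data rather than the data in the question, and the macro name FitAndPlot and the data set name _est are hypothetical. Each pass fits one model, stores the estimates in Int&i and Slope&i, and then resolves them in the INSET statement as &&Int&i and &&Slope&i.

      ```sas
      /* Sketch: one pair of macro variables per regressor (Int1/Slope1, Int2/Slope2).
         The macro name FitAndPlot and the data set _est are hypothetical. */
      %macro FitAndPlot(xvars);
         %local i x;
         %do i = 1 %to %sysfunc(countw(&xvars));
            %let x = %scan(&xvars, &i);

            /* Fit weight = x and save the estimates in a one-row data set */
            proc reg data=sashelp.class outest=_est noprint;
               model weight = &x;
            run;
            quit;

            /* Define Int&i and Slope&i before the plot is generated */
            data _null_;
               set _est;
               call symputx("Int&i",   put(Intercept, BEST6.));
               call symputx("Slope&i", put(&x, BEST6.));
            run;

            proc sgplot data=sashelp.class noautolegend;
               title "Weight vs &x";
               reg y=weight x=&x;
               inset "Intercept = &&Int&i" "Slope = &&Slope&i" /
                     border title="Parameter Estimates" position=topleft;
            run;
         %end;
      %mend FitAndPlot;

      %FitAndPlot(height age)
      ```

      Because the macro variables are created while the macro is running, they are local to the macro in this sketch and are not available after it ends.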

  12. Hi Rick, I am a regular follower of your SAS blog, and I think your blog helps us a lot, especially in how to make nice graphs.

    I found another easier way to display the slope and intercept of a regression line in sgplot procedure. That is, use the combination of scatter and reg statement in sgplot procedure.
    Here is the blog link:
    https://blogs.sas.com/content/graphicallyspeaking/2018/02/21/getting-started-sgplot-part-10-regression-plot/

    proc sgplot data=sashelp.class;
    scatter y=weight x=height / group=sex ;
    reg y=weight x=height ;
    run;

    I adapted his example to my own data set and successfully produced 4 symbols for the groups and only 1 regression line in my plot, which cannot be achieved by using only the REG statement (with a GROUP= option on the REG statement alone, it produces 4 regression lines for my data set). Here is my code:

    Proc sgplot data=;
    by week;
    styleattrs datacontrastcolors=(black)
    datasymbols=(circle circlefilled square squarefilled);
    scatter x=Fdintsowandpiglet y=resid/group=piggroup;
    reg x=Fdintsowandpiglet y=resid;
    run;

    Regards,
    Tianyue

  13. Hi
    I would like to fit my data with a linear-plus-exponential curve. The current regression curve that SAS gives me is quadratic. For agricultural purposes, a linear-plus-exponential model is suggested as the best option.
    Which statements can I use for this purpose?
