Let PROC FREQ create graphs of your two-way tables

The recent releases of SAS 9.4 have featured major enhancements to the ODS statistical graphics procedures such as PROC SGPLOT. In fact, PROC SGPLOT (and the underlying Graph Template Language (GTL)) are so versatile and powerful that you might forget to consider whether you can create a graph automatically by using a SAS statistical procedure. For example, when you turn on ODS graphics (ODS GRAPHICS ON), SAS procedures create the following graphs automatically:

  1. Many SAS regression procedures (and PROC PLM) create effect plots.
  2. PROC SURVEYREG creates a hexagonal bin plot.
  3. PROC REG creates heat maps when a scatter plot would suffer from overplotting.
  4. PROC LOGISTIC creates odds-ratio plots.

Let PROC FREQ visualize your two-way tables

Recently a SAS customer asked how to use PROC SGPLOT to produce a stacked bar chart. I showed him how to do it, but I also mentioned that the same graph could be produced with less effort by using PROC FREQ.

PROC FREQ is a workhorse procedure that can create dozens of graphs. For example, PROC FREQ can create a mosaic plot and a clustered bar chart to visualize frequencies and relative frequencies in a two-way table. Next time you use PROC FREQ, add the PLOTS=ALL option to the TABLES statement and see what you get!
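
As a minimal sketch of that suggestion, the following statements request every plot that PROC FREQ can produce for a two-way table. The data set (Sashelp.Heart) and the two variables are assumptions chosen for illustration; substitute your own. With no statistics requested, the output should include frequency plots and a mosaic plot.

```sas
/* Sketch only: Sashelp.Heart and these variables are illustrative choices */
ods graphics on;
proc freq data=sashelp.Heart;
   tables weight_status*smoking_status / plots=all;
run;
```

If you add analysis options (for example, CHISQ) to the TABLES statement, PLOTS=ALL also requests the plots that are associated with those analyses.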

One of my favorite plots for two-way categorical data is the stacked bar chart. As I told the SAS customer, it is simple to create a stacked bar chart in PROC FREQ. For example, the following statements create a horizontal stacked bar chart in which the categories are ordered by frequency:

proc freq data=sashelp.Heart order=freq;
   tables weight_status*smoking_status / 
       plots=freqplot(twoway=stacked orient=horizontal);
run;
Stacked bar chart created by PROC FREQ

See the documentation for the PLOTS= option in the TABLES statement for a description of all the plots that PROC FREQ can create. PROC FREQ creates many plots that are associated with a particular analysis, such as the "deviation plot," which shows the relative deviations between the observed and expected counts when you request a chi-square analysis of a one-way table.
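
For instance, a deviation plot for a one-way table might be requested as follows. This is a sketch that assumes the Sashelp.Heart data and the weight_status variable; the CHISQ option is what requests the chi-square analysis that the plot accompanies.

```sas
/* Sketch only: the data set and variable are illustrative choices */
ods graphics on;
proc freq data=sashelp.Heart;
   /* CHISQ requests the one-way chi-square test;
      DEVIATIONPLOT shows relative deviations from the expected counts */
   tables weight_status / chisq plots=deviationplot;
run;
```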

Do you need a highly customized graph?

The ODS graphics in SAS are designed to create many—but not all—of the visualizations that are relevant to an analysis. Some attributes of a graph (for example, the title and the legend placement) are determined by a stored template and can't be modified by using the procedure syntax. Advanced GTL gurus might want to learn how to edit the ODS templates. Less ambitious users might choose to use a statistical procedure to automatically create graphs during data exploration and modeling, but then use PROC SGPLOT to create the final graph for a report.

For example, the following call to PROC FREQ writes the two-way frequency counts to a SAS data set. From the data you can create graphs that are similar to the one that PROC FREQ creates, but you can change the order and colors of the bars, alter the placement of the legend, add text, and more. The following call to PROC SGPLOT shows one possibility.

proc freq data=sashelp.heart order=freq noprint;
   tables smoking_status*weight_status / out=FreqOut(where=(percent^=.));
run;
 
ods graphics /height=500px width=800px;
title "Counts of Weight Categories by Smoking Status";
proc sgplot data=FreqOut;
  hbarparm category=smoking_status response=count / group=weight_status  
      seglabel seglabelfitpolicy=none seglabelattrs=(weight=bold);
  keylegend / opaque across=1 position=bottomright location=inside;
  xaxis grid;
  yaxis labelpos=top;
run;
Customized stacked bar chart created by PROC SGPLOT, using the output from PROC FREQ
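
As a further sketch of the kind of customization that PROC SGPLOT allows, the following variation changes the bar colors, moves the legend below the plot, and gives the legend a title. The colors, legend title, and axis label are arbitrary choices for illustration, not part of the original example. The PROC FREQ step is repeated so that the block runs on its own.

```sas
/* Write the two-way counts to a data set; drop rows for missing values */
proc freq data=sashelp.heart order=freq noprint;
   tables smoking_status*weight_status / out=FreqOut(where=(percent^=.));
run;

title "Counts of Weight Categories by Smoking Status";
proc sgplot data=FreqOut;
   /* Illustrative fill colors for the three weight categories */
   styleattrs datacolors=(cx4682B4 cxD3D3D3 cx8B0000);
   hbarparm category=smoking_status response=count / group=weight_status;
   keylegend / position=bottom title="Weight Status";
   xaxis grid label="Count";
run;
title;   /* clear the title */
```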

Conclusions

If you want to create a stacked bar chart or some other visualization of a two-way table, you might be tempted to immediately start using PROC SGPLOT. The purpose of this article is to remind you that SAS statistical procedures, including PROC FREQ, often create graphs as part of their output. Even if a statistical procedure cannot provide all the bells and whistles of PROC SGPLOT, it often is a convenient way to visualize your data during the preliminary stages of an analysis. If you need a highly customized graph for a final report, the SAS procedure can output the data for the graph. You can then use PROC SGPLOT or the GTL to create a customized graph.

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is the author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

13 Comments

  1. Arlen Harmoning

    PROC FREQ has been a favorite of mine for a long time, though I've never used the PLOTS= option. Thanks for bringing that to our attention.

  2. Nice presentation. I want to know if it is possible to change the orientation to vertical. I mean is it possible to include an 'orient=' option in the sgplot approach?

    • Rick Wicklin

      Sure. Use the VBARPARM statement:

      proc sgplot data=FreqOut;
        vbarparm category=smoking_status response=count / group=weight_status  
            seglabel seglabelfitpolicy=none seglabelattrs=(weight=bold);
        yaxis grid;
      run;
  3. Hi Rick, my table is as follows. The soil types are serpentine and not serpentine, and the leaves in these soils are pubescent or smooth. So it's two variables with two categories each.

    Soil            Pubescent  Smooth
    Serpentine      12         22
    Not Serpentine  16         50

    I want to read the data with an INPUT statement and use PROC SGPLOT to get the relative frequency of the leaf types in the two soils.
    I am using the following code:

    data ecology;
       input soil $ leaf $ count;
       datalines;
    serpentine pubescent 12
    serpentine smooth 22
    notserpentine pubescent 16
    notserpentine smooth 50
    ;
    run;

    proc sgplot data=ecology;
       vbar soil / group=leaf stat=percent groupdisplay=cluster;
    run;

    But the bar graph shows the same percentage (24%) for every bar. Can you please tell me what mistake I am making? Thanks.
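
A possible explanation, offered as a sketch and not as the author's reply: the VBAR statement counts rows of the data set, not the values of the count variable, so each of the four rows contributes equally to the percentages. One way around this, in the spirit of the article, is to let PROC FREQ compute the percentages and then plot them with VBARPARM. The LENGTH statement, the output data set name EcoOut, and the axis label are assumptions added for illustration.

```sas
/* LENGTH is an addition: without it, character values are truncated to 8 characters */
data ecology;
   length soil $ 14 leaf $ 10;
   input soil $ leaf $ count;
   datalines;
serpentine pubescent 12
serpentine smooth 22
notserpentine pubescent 16
notserpentine smooth 50
;

/* WEIGHT tells PROC FREQ that each row represents COUNT observations.
   OUTPCT adds the row and column percentages (PCT_ROW, PCT_COL) to the output. */
proc freq data=ecology noprint;
   weight count;
   tables soil*leaf / out=EcoOut outpct;
run;

/* PCT_ROW is the percentage of each leaf type within a soil type */
proc sgplot data=EcoOut;
   vbarparm category=soil response=pct_row / group=leaf groupdisplay=cluster;
   yaxis grid label="Percent of Leaves within Soil Type";
run;
```

With these data, for example, 12 of the 34 serpentine leaves (about 35%) are pubescent, and 16 of the 66 non-serpentine leaves (about 24%) are pubescent.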

  4. Quick question, how were you able to not plot the "total" column of the Freq table?

    I'd assume your table has each weight status as a separate column, followed by a final total column.

  5. I would like the bars to be ordered in descending order, highest frequency on top followed by lower frequencies. What option can be added to accomplish this in hbarparm statement?

    • Rick Wicklin

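      The last example in the article shows an answer to your question. Use the ORDER=FREQ option in PROC FREQ so that the output data set is sorted by descending frequency, then plot that data set: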

      proc freq data=sashelp.heart order=freq noprint;
         tables smoking_status / out=FreqOut(where=(percent^=.));
      run;
      proc sgplot data=FreqOut;
        hbarparm category=smoking_status response=count ;
      run;