Side-by-side bar plots in SAS 9.3

When I was at the Joint Statistical Meetings (JSM) last week, a SAS customer asked me whether it was possible to use the SGPLOT procedure to produce side-by-side bar charts. The answer is "yes" in SAS 9.3, thanks to the new GROUPDISPLAY= option on the VBAR and HBAR statements. For example, the following SAS code creates side-by-side bar charts that enable you to compare the relative frequencies for Asian, European, and American vehicles across several different types of vehicles:

/* SAS 9.3 supports side-by-side plots */
proc sgplot data=sashelp.cars;
vbar type /group=Origin groupdisplay=cluster;
run;
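
Because the GROUPDISPLAY= option is also available on the HBAR statement, the same chart can be drawn with horizontal bars. The following is a minimal sketch that assumes the same Sashelp.Cars data and only swaps the statement name:

```sas
/* Horizontal clustered bars: same options, HBAR instead of VBAR */
proc sgplot data=sashelp.cars;
hbar type / group=Origin groupdisplay=cluster;
run;
```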

You can't get quite the same look from SAS 9.2, although you can panel the data to get the general appearance:

/* SAS 9.2 doesn't support side-by-side charts, 
   but you can panel the data */
proc sgpanel data=sashelp.cars;
panelby type;
vbar Origin / group=Origin;
run;

The GROUP= option is optional, but it causes the bars to be colored according to the levels of the Origin variable.

If you have SAS 9.2M3, there is an option that almost enables you to get side-by-side bar charts; my colleague, Sanjay Matange, told me about it. You can use the LAYOUT= option to tell the SGPANEL procedure to use only one row for the panel. If you have only a few categories (in this case, types of vehicles), this trick works, but if you have too many categories, the procedure "helpfully" makes additional rows of plots:

/* If you have SAS 9.22 (=9.2M3) try this, which only works 
for about 3 groups (after that the panel makes a new row) */
proc sgpanel data=sashelp.cars;
  where type="SUV" | type="Sports" | type="Wagon";
  panelby type / layout=columnlattice 
                 colheaderpos=bottom rows=1 novarname;
  vbar Origin / group=Origin;
  colaxis display=none;
  rowaxis grid;
run;

Because of helpful comments like that, I'm sure to pre-order Sanjay's forthcoming book, Statistical Graphics Procedures by Example: Effective Graphs Using SAS.

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

26 Comments

  1. Sanjay Matange on

    The GROUPDISPLAY=CLUSTER option was quickly added for SAS 9.3, but since the release was a long way off, we needed an interim solution. So, we provided a few special options at SAS 9.2 (TS2M3) to get a close facsimile.

    Three options were added: LAYOUT=COLUMNLATTICE, ONEPANEL, and NOBORDER. By using these options on the PANELBY statement, you can get a reasonably close clustered bar chart. The added benefit is that if you do have a large number of groups, you can get a grid if you wish. If you look closely, you will see small breaks in the horizontal gridlines aligned with the cell header (bottom) borders.

    Here is Rick's code with these options added to achieve the results. The resulting graph can be viewed at: https://sites.google.com/site/smatange/ods-graphics/cluster-group-bar-chart-sas-9-2m3

    SAS 9.2 (TS2M3) code:
    proc sgpanel data=sashelp.cars;
    panelby type / layout=columnlattice onepanel
    colheaderpos=bottom rows=1 novarname noborder;
    vbar Origin / group=Origin;
    colaxis display=none;
    rowaxis grid;
    run;

  2. I remember reading somewhere that SGPLOT will include the ability to do a side-by-side barchart like the ones you have here, but with an overlaid line (or lines?). We use these charts for showing rates of change on a secondary vertical axis. Up till now we've had to utilize Excel for these types of graphs, but I'm hopeful that we'll be able to avoid Excel and stick to SAS 9.3 and still be able to create our grouped side-by-side barcharts with lines on a secondary vertical axis. For example, the horizontal axis is often dates for us, while the left vertical axis might be in dollar net losses and the right vertical axis might be the percentage rate of losses.

    Is this possible using SGPLOT in 9.3, and what does the syntax and results look like?

    Thanks in advance!

    • Rick Wicklin

      Yes, this is possible. You can add a VLINE statement and use the Y2AXIS option. For example:
      vline type / y2axis response=mpg_city stat=mean group=Origin groupdisplay=cluster;

      It looks just like a line plot overlaid on a bar chart.
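
      Put together with the clustered bar chart from the article, a complete step might look like the following sketch. It assumes that the VBAR and VLINE statements share the same category variable (Type) and group variable (Origin); the bars show frequencies on the left axis and the lines show mean MPG_City on the right axis:

      ```sas
      /* Clustered bars (frequency) with an overlaid line (mean) on the Y2 axis */
      proc sgplot data=sashelp.cars;
      vbar type / group=Origin groupdisplay=cluster;
      vline type / y2axis response=mpg_city stat=mean group=Origin groupdisplay=cluster;
      run;
      ```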

  3. New question: I'm running into some issues with my unique problem--which I don't anticipate will be that unusual for yourself and others. Basically, what I'm trying to do with the side-by-side barchart with overlaid line is graph the portfolio size of the bank I work for versus the portfolio size of our peer group and both versus unemployment. I want these side-by-side bars by quarter, with an overlaid line representing unemployment.

    The difficulty I'm running into is that no matter how I shape my data I'm getting goofy results. The use of "grouping" with a grouping variable results in two lines for unemployment, and two unemployment entries in the legend: one for the peer group and one for my bank. This is clearly because when I add on the unemployment data I do so by merging by date (in quarterly format), which causes values to be duplicated because there are two portfolio balance entries for each date (one for my bank, one for our peer group). However, my other solution doesn't work any better. Essentially, I tried just appending the unemployment data (using the SET statement method), and creating a grouping variable of unemployment (to differentiate from Peer and Bancorp). This led to empty unemployment columns for Peer and Bancorp, and empty portfolio balance columns for Unemployment...which looks funky, but should be fine...except when I graph it I see in the legend four occurrences of Unemployment. One occurrence is apparently a bar (of zero height); one occurrence is a dashed line; the other two occurrences are straight red lines which don't show up in the graph (I'm guessing because they represent the "missing" unemployment data associated with the two groups: Peer and Bancorp).

    Oh, and so on the XAXIS I put date (in quarterly format); on the primary vertical axis I put portfolio balance, and on the secondary vertical axis I put unemployment rate.

    I'm at a loss; I'm not sure how to overlay my line which represents an essentially ungrouped series (unemployment) over a grouped barchart. Any advice you can provide is much appreciated.

  4. The following doesn't seem to run in sas 9 for me

    /* SAS 9.3 supports side-by-side plots */
    proc sgplot data=sashelp.cars;
    vbar type /group=Origin groupdisplay=cluster;
    run;

  5. I would like to show percentage for y-axis. I only find options for sum, mean or frequency. Is there any way to do it by using sgplot? Thanks,

    • @Eric,

      I found that I wanted the x-axis to provide a percentage instead of a frequency as well. I've found that you have to create an alternate data set with a percentage variable (I did this by outputting the results of PROC FREQ) and then specify response=percent in the options.

      An example is given below:

      proc sort data = sashelp.cars out = cars;
      by type;
      run;
      proc freq data = cars noprint;
      by type;
      tables Origin / out = carsfreq;
      run;
      proc sgplot data=carsfreq;
      vbar type /group=Origin groupdisplay=cluster response = percent;
      run;

  6. Anyone know how I can have multiple charts side by side with different variables? ie I want the average for var1 next to the average for var2, for var3, etc. Any way to show this on one plot?
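
    One possible approach in SAS 9.3, offered as a sketch under assumptions rather than a confirmed answer from this thread, is to overlay one VBAR statement per variable and shift each one sideways with the BARWIDTH= and DISCRETEOFFSET= options. The variables MPG_City and MPG_Highway and the offsets are chosen only for illustration:

    ```sas
    /* Mean of two different variables, drawn next to each other per category */
    proc sgplot data=sashelp.cars;
    vbar type / response=mpg_city    stat=mean barwidth=0.4 discreteoffset=-0.2;
    vbar type / response=mpg_highway stat=mean barwidth=0.4 discreteoffset=0.2;
    run;
    ```

    Each statement draws its own set of bars at the same category positions; the negative and positive offsets keep them from overlapping.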

  7. I'm working with SAS 9.2 and trying to do side-by-side barplots. That's the reason why I found this post. Basically I'm happy with the solution with proc sgpanel as mentioned above. But I still have one big problem: How can I change the colors of the bars? (To be more specific for the example with the cars: The colors should be still grouped by the Origin of the cars.) I tried a lot of things but nothing worked...

    Thanks in advance.

  8. Hi, thanks for these tools I didn't know about...
    But in my case, I would like to represent the column percents (not the row ones), that is to say, visualize a proc freq with /nocol nopercent options in a VBAR, you see?
    Because I want to show specifically the difference in proportions between different groups, not globally.

    As you don't mention it in your article, I wonder if this is possible with sgplot. Or is there a way to exploit the results of a previous proc freq (with /nocol nopercent options) in a VBAR plot?
    I tried to do that, but couldn't figure it out...
    Thanks in advance for your help!

  9. Thanks for your quick answer! The problem is that the VBAR statement computes statistics automatically. It seems impossible to make it graph stats that have already been produced.
    I managed to do this in R, as in fact the barplot function doesn't compute the stats itself and accepts whatever table you put in... as for example my expected "column percent" table! It worked just fine this way.
    But I'll check this TABLE statement you proposed! Thanks again for the support!

  10. Worked fine, thanks a lot! I could use it with both category and response arguments and group option.
    Thanks for your quick and efficient support!
