I enjoy reading the Graphically Speaking blog because it teaches me a lot about ODS statistical graphics, especially features of the SGPLOT procedure and the Graph Template Language (GTL). Yesterday Sanjay blogged about how to construct a stacked bar chart of percentages so that each bar represents 100%. His chart had the additional feature of displaying the percentages for each category. His post showed how to use the SGPLOT procedure to duplicate the functionality of the G100 option in the GCHART procedure.
In this blog post I present an alternative to Sanjay's construction. When I construct a stacked bar chart, I use PROC FREQ to compute the percentages of each category within a group. I then use the VBAR or HBAR statements in the SGPLOT procedure to construct the stacked bar chart. For example, the bar chart at the top of this post is constructed by using the following statements:
proc sort data=sashelp.cars(where=(Type^="Hybrid")) out=cars;
   by Origin;                        /* sort X categories */
run;

proc freq data=cars noprint;
   by Origin;                        /* X categories on BY statement */
   tables Type / out=FreqOut;        /* Y (stacked groups) on TABLES statement */
run;

title "100% Stacked Bar Chart";
proc sgplot data=FreqOut;
   vbar Origin / response=Percent group=Type groupdisplay=stack;
   xaxis discreteorder=data;
   yaxis grid values=(0 to 100 by 10) label="Percentage of Total within Group";
run;
The VBAR statement in the SGPLOT procedure creates the stacked bar chart. Each bar totals 100% because the Percent variable from PROC FREQ is used as the argument to the RESPONSE= option. The GROUPDISPLAY=STACK option stacks the groups so that there is one bar per category.
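A quick way to convince yourself that each bar totals 100% is to sum the Percent variable within each level of the X variable. The following is a minimal sketch that assumes the FreqOut data set from the previous statements exists and is still sorted by Origin. The title text is only illustrative.

```sas
/* Sanity check (illustrative): sum the percentages within each X category. */
/* Assumes FreqOut was created by PROC FREQ with BY Origin, as shown above. */
title "Sum of Percent within Each Origin";
proc means data=FreqOut sum maxdec=1;
   by Origin;        /* one sum per bar */
   var Percent;      /* within-group percentages from PROC FREQ */
run;
```

Because PROC FREQ computes the percentages separately for each BY group, the sum for each value of Origin should be 100, up to rounding.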
Creating stacked bars ordered by percentages
A variation of the 100% stacked bar chart is to order the "slices" of the bars by their relative sizes. This is shown at the left. I used the ORDER= option on the PROC FREQ statement to output the counts in descending order for each bar. If I use the GROUPORDER=DATA option on the VBAR statement in PROC SGPLOT, the groups are arranged and colored according to the order in which they first appear in the data. Because that order is determined by the first BY group, the first bar is ordered by relative percentages, as shown by the "Asia" bar. The same group order is then reused for every bar, so there is no guarantee that the other bars are ordered by size, as seen by the "Europe" bar.
Still, I like to order the groups by size because it makes it easier to find the relative percentages of the most important groups. In general, I prefer to order charts by some quantity, rather than to use the default alphabetical ordering.
The following statements produce the 100% stacked bar chart with ordered groups, assuming that the data set is already sorted according to the X variable:
proc freq data=cars order=freq noprint;   /* ORDER= sorts by counts within X */
   by Origin;                             /* X var */
   tables Type / out=FreqOutSorted;       /* Y var */
run;

proc print data=FreqOutSorted;
run;

title "100% Stacked Bar Chart Ordered by Percentages";
proc sgplot data=FreqOutSorted;
   vbar Origin / response=Percent group=Type
                 grouporder=data groupdisplay=stack;   /* order by counts of 1st bar */
   xaxis discreteorder=data;
   yaxis grid values=(0 to 100 by 10) label="Percentage of Total within Group";
run;
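The same data can be drawn with horizontal bars by swapping VBAR for HBAR, which can be easier to read when the category labels are long. This is a sketch rather than part of the original example. It assumes the FreqOutSorted data set created above, and the title is only illustrative. Note that the roles of the axes switch: the categories are now on the Y axis and the percentages are on the X axis.

```sas
/* Horizontal variant (illustrative): same data, HBAR instead of VBAR. */
/* Assumes FreqOutSorted was created by the PROC FREQ step above.       */
title "100% Stacked Horizontal Bar Chart";
proc sgplot data=FreqOutSorted;
   hbar Origin / response=Percent group=Type
                 grouporder=data groupdisplay=stack;
   yaxis discreteorder=data;                  /* categories now on Y axis */
   xaxis grid values=(0 to 100 by 10)
         label="Percentage of Total within Group";
run;
```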
PROC FREQ does it, too! (SAS 9.4m1)
Over the last few releases there have been quite a few new ODS graphs that come out of PROC FREQ "for free." If you have SAS 9.4m1, you can use the SCALE=GROUPPERCENT option to create a stacked bar chart similar to the one in the previous section:
proc freq data=cars order=freq;
   tables Type*Origin / plots=freqplot(twoway=stacked scale=grouppercent);
run;
The lesson to learn is that you can use relatively simple statements (my post) to produce basic stacked bar charts or more sophisticated SAS code (Sanjay's post) to produce more complicated bar charts. Choose the method that you prefer, and let SAS do the rest.