Creating bar charts with group classification is very easy using the SG procedures. When using a group variable, the group values for each category are stacked by default. Using the sashelp.prdsale data set and default STAT of SUM, here is the graph and the code.
proc sgplot data=sashelp.prdsale;
  title 'Actual Sales by Product and Quarter';
  vbar product / response=actual group=quarter dataskin=gloss;
  xaxis display=(nolabel);
  yaxis grid;
run;
With SAS 9.3, the SGPLOT procedure introduced the option to place the group values side by side. This is a very popular option, where we want to compare category and group values from a common baseline. This is also very commonly used for clinical graphs to view the response by visit and treatment. Here is the graph and the code:
proc sgplot data=sashelp.prdsale;
  title 'Actual Sales by Product and Quarter';
  vbar product / response=actual group=quarter groupdisplay=cluster dataskin=gloss;
  xaxis display=(nolabel);
  yaxis grid;
run;
But what if you want a graph with both Cluster and Stacked groups? In that case we have three classification variables (one for the category, one for the clusters and one for the stacks) plus one response variable. When we introduced the GROUPDISPLAY option, we had an extensive discussion on whether it was important to support this feature directly, as the GCHART procedure does. Adding this feature would have introduced many more complications in the code. Looking at the use cases, we decided to keep it simple and support either Stacked OR Clustered groups, but not both at the same time.
The reason to do this was that the frequency of use of a bar chart with Stacked AND Cluster groups was low, and there actually exists an easy way to do this using the SGPANEL procedure. Here is the graph and the code.
proc sgpanel data=sashelp.prdsale;
  title 'Actual Sales by Product, Year and Quarter';
  panelby year / layout=columnlattice novarname noborder colheaderpos=bottom;
  vbar product / response=actual group=quarter dataskin=gloss;
  colaxis display=(nolabel);
  rowaxis grid;
run;
An additional benefit of using SGPANEL shows up when the graph gets too wide. In that case, the SGPANEL procedure knows how to break up the adjacent groups across multiple rows and multiple pages.
Here we want Actual revenues placed side by side by Product within each Year, with each bar stacked by Quarter, so that Year visually plays the role of the outer category.
- We use the SGPANEL procedure and use Year as the Class variable
- We have used layout of ColumnLattice.
- We have used Product as the Category variable.
- We have used Quarter as the Group variable.
- Column headers are moved to the bottom, and the cell borders are suppressed.
- Default group display for the VBAR is Stack.
- Default statistic is Sum.
Adding a bar label for each bar in the graph is easily done by using the DATALABEL option. A single summarized value is shown at the top of each bar as shown below.
proc sgpanel data=sashelp.prdsale;
  format actual dollar7.0;
  title 'Actual Sales by Product, Year and Quarter';
  panelby year / layout=columnlattice novarname noborder colheaderpos=bottom;
  vbar product / response=actual group=quarter stat=sum dataskin=gloss datalabel;
  colaxis display=(nolabel);
  rowaxis grid;
run;
All this is relatively straightforward. One last (infrequent) use case is where you want to display the response value for each of the stacked segments separately. In the interest of reducing clutter, this is not supported.
However, as can be expected, there is always that one user who absolutely needs such a graph. Such a request recently came in to our Tech Support group, and Lelia contacted me to see how we can create such a graph. This is where SG procedures hit their limits, but GTL can come to the rescue. Here is the graph. GTL code follows.
This graph needs some special coding using the LAYOUT DATALATTICE with the BARCHART statement. To draw the stack of value labels below the bars, we have to use the SCATTERPLOT statement with the MarkerCharacter option. This would all work just fine if the data set contained only one value per combination of Year + Quarter + Product. But the sashelp.prdsale data set contains more classifiers, and so has multiple observations for each combination. If used directly, we will get a graph all right, but each label position will have multiple values overprinted on it, making a mess.
So, what we have to do here is summarize the data beforehand using the MEANS procedure with three class variables. Now we will have only one value per combination of Year + Quarter + Product, and the labels will display cleanly. Here is the proc MEANS and GTL code:
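To see why the raw data cannot be used directly, a quick count per combination helps. The query below is a minimal sketch; the column alias n_obs is just an illustrative name and is not part of the original program. Any combination with a count greater than one would have several labels drawn on top of each other.

```sas
/* Count the observations for each Year + Quarter + Product combination. */
/* n_obs is a hypothetical alias used only for this check.               */
proc sql;
  select year, quarter, product, count(*) as n_obs
    from sashelp.prdsale
    group by year, quarter, product;
quit;
```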
/*--Summarize Actual by product, year and quarter--*/
proc means data=sashelp.prdsale;
  class year product quarter;
  var actual;
  /* _type_ = 7 keeps only the full three-way combinations */
  output out=prdsale(where=(_type_ = 7) keep=year product quarter ActualSum _type_)
         sum=ActualSum;
run;

/*--Template for stacked grouped plot with data labels for all stacked values--*/
proc template;
  define statgraph StackedClusterBarStat;
    begingraph;
      entrytitle 'Actual Sales by Product, Year and Quarter (SAS 9.3)';
      layout gridded;
        layout datalattice columnvar=year / headerlabeldisplay=value
               columnheaders=bottom border=false
               rowaxisopts=(offsetmin=0.25 display=(ticks tickvalues label))
               row2axisopts=(offsetmax=0.8 display=none)
               columnaxisopts=(display=(ticks tickvalues));
          layout prototype / ;
            /* Stacked bars on the Y axis */
            barchart x=product y=actualsum / group=quarter stat=sum
                     dataskin=gloss name='a';
            /* Value labels, one row per quarter, on the Y2 axis */
            scatterplot x=product y=quarter / markercharacter=actualsum
                        group=quarter yaxis=y2;
          endlayout;
        endlayout;
        discretelegend 'a' / title='Quarter:';
      endlayout;
    endgraph;
  end;
run;

/*--Render the graph--*/
proc sgrender data=prdsale template=StackedClusterBarStat;
  format actualsum dollar7.0;
run;
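As a side note, the summarization step can also be written with the NWAY option, which asks PROC MEANS to output only the full Year + Product + Quarter combinations, so the _type_ filter is no longer needed. This is an alternative sketch, not the code used for the graph above; NOPRINT is added here only to suppress the printed report.

```sas
/* Alternative summarization: NWAY keeps only the highest-level combinations, */
/* so no WHERE clause on _type_ is required.                                  */
proc means data=sashelp.prdsale nway noprint;
  class year product quarter;
  var actual;
  output out=prdsale(drop=_type_ _freq_) sum=ActualSum;
run;
```

The resulting data set should have the same Year, Product, Quarter and ActualSum columns and can be passed to the same SGRENDER step.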
SAS 9.4 adds support for a new statement called AXISTABLE. This statement is able to display stacked data values by class and can summarize the values before display. So, with SAS 9.4, the summarization step can be skipped.
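For readers on SAS 9.4 who prefer to stay with the SG procedures, a similar result may be possible with the COLAXISTABLE statement of the SGPANEL procedure, which is the SG counterpart of the GTL AXISTABLE. The following is only a sketch under that assumption and has not been tuned for appearance; the placement and formatting of the values may need further options.

```sas
/* SAS 9.4 sketch: bars are summarized by VBAR, and the axis table       */
/* summarizes ACTUAL by quarter for each product, so no PROC MEANS step. */
proc sgpanel data=sashelp.prdsale;
  format actual dollar7.0;
  title 'Actual Sales by Product, Year and Quarter (SAS 9.4)';
  panelby year / layout=columnlattice novarname noborder colheaderpos=bottom;
  vbar product / response=actual group=quarter stat=sum dataskin=gloss;
  colaxistable actual / class=quarter stat=sum;
  colaxis display=(nolabel);
  rowaxis grid;
run;
```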
Full SAS 9.3 Program: StackedClusterGroupedVBar