Infographics: Coin Stack Bar Chart

Often we see bar charts showing revenues or other related measures by a classifier using a visual of a stack of coins.  Such visuals are not strictly for the purposes of accurate magnitude comparisons, but more for providing an interesting visual to attract the attention of the reader.  In other words - Infographics.

I thought this would be a good exercise to see how we can do this using the SGPLOT procedure.  One such result is shown on the right.  Click on the graph for a higher resolution image.

I searched the web for some appropriate images of coins, anything with a perspective image of a coin that can be used to create a stack.  Then, I found a beautiful image of an antique "2-Annas" coin from British India.  The image of the coin has a beautiful shine, good resolution, an unusual shape and clear details that make for nice stacks of coins as shown above.

The default Bar Chart would look like the graph on the right.  While it conveys the information accurately, in some instances it is a bit boring compared to the graph above.

The data for the graph is very simple, as shown on the right below the graph, and the program is shown below.

SGPLOT code for Bar Chart:

title 'Revenues (Millions) by Year';
proc sgplot data=Bar noborder noautolegend;
  vbar cat / response=resp fillattrs=graphdata1 dataskin=pressed
                    datalabel datalabelattrs=(size=12 weight=bold);
  xaxis display=(noticks noline nolabel) integer ;
  yaxis display=(nolabel noticks noline) min=0 integer grid;
run;

Now, to create the graph of the pile of coins, we need to render each coin in the stack individually, using a SCATTER plot where the marker symbol is built from the image of the coin.  We process the original data set and generate an observation for each coin, with an increasing y value in the data.  Then, with the default rendering order (which is data order), the later (higher) coins are drawn over the earlier coins, thus creating a stack.

The data generated for the coin stack is shown on the right.  Note, the response value is kept only once for each category value.
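
The expansion step described above can be sketched as a short DATA step.  This is a minimal sketch, not the program linked at the end of this post: the input values, the spacing of 0.5 per coin and the helper variables total, ncoins and i are assumptions made for illustration.

```sas
/* Hypothetical input: one row per year with its revenue (made-up values) */
data Bar;
  input cat resp;
  datalines;
2013 4
2014 6.5
2015 5
;
run;

/* One observation per coin; 0.5 is an assumed spacing between coins */
data coins;
  set Bar;
  total  = resp;
  ncoins = ceil(total / 0.5);
  do i = 1 to ncoins;
    val  = i * 0.5;                    /* y position of this coin          */
    resp = ifn(i = ncoins, total, .);  /* keep the value on the top coin only */
    output;
  end;
  keep cat val resp;
run;
```

A thinner coin image would call for a smaller spacing (more coins per stack), and a thicker one for a larger spacing.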

We use the SYMBOLIMAGE statement to define the symbol.  One must use a "transparent" image, where the pixels outside the coin part are transparent.  We also use the JITTER option so each coin is shifted along the x-axis a bit to simulate a "real" stack.  Else, the stack will be too straight.  This jitter option works best when the x-axis values are numeric.

SGPLOT code for Coin Stack Graph:

title 'Revenues (Millions) by Year';
proc sgplot data=coins noborder noautolegend;
  symbolimage name=Coin image="&Coin";
  scatter x=cat y=val / markerattrs=(symbol=Coin size=70) jitter jitterwidth=0.03;
  text x=cat y=resp text=resp / textattrs=(size=14 weight=bold color=white)
         strip position=top backlight=0.75;
  xaxis display=(noticks noline nolabel) integer offsetmin=0.15 offsetmax=0.15;
  yaxis display=none offsetmin=0.2 offsetmax=0.2;
run;

The value of the stack is displayed on the top of the coins.  Note the use of the "Backlight" option to generate a darker outline around the text, so it is visible on top of the light colored coins.  Any nice image of a coin can be used.  The number of coins drawn should depend on the "thickness" of the coin in the image.  The 2-Annas coin is thin, so we need more coins.  The graph on the right is without jitter, which creates even stacks.

The Somalian Silver coin is "thicker", so we need fewer coins, as shown on the right.

Full SAS9.40M3 code:  CoinGraph


Good Graph: Magnitude Comparisons

At the 2013 SAS Global Forum, I presented a paper titled "Make a Good Graph" which reviewed some of the features that make for a good graph.  This paper presents an aggregation of ideas from various sources, including some recommendations from thought leaders in the graphics arena such as Edward Tufte, William Cleveland and Naomi Robbins.

Recently, a question was posted on the SAS Communities site asking how to create the graph shown on the right using SAS.  This graph is showing sales figures by company (and peer) by region using a bubble plot.

There are two issues here:

  • How to make such a graph?
  • Should you make such a graph?

The answer to the first one is simple.  The SAS SGPLOT procedure supports the BUBBLE statement that can create a graph like the one shown above.

On the right is one I created for the simple data in the plot using SGPLOT procedure.  It is relatively easy to make and the generated visual is mostly like the one above, with a few differences.  A more exact match can be created, but I stopped here. Here is the code.

Bubble Plot Code:

ods graphics / reset width=4in height=1.75in noborder imagename='Sales_Bubble';
title 'Sales by Region';
proc sgplot data=sales noborder noautolegend;
bubble y=Group x=Category size=value / bradiusmax=25 bradiusmin=12
group=category dataskin=pressed datalabel=value
datalabelpos=center datalabelattrs=(color=white size=8 weight=bold);
yaxis display=(noline noticks nolabel) fitpolicy=split valueattrs=(size=8);
xaxis display=(nolabel noticks noline) valueattrs=(size=8);
run;

Assuming the purpose of the graph is to better understand a company's sales vis-a-vis a peer, the second question becomes relevant.  Using the bubble plot, it is relatively hard to make accurate magnitude comparisons of sales figures between the company and its peers without the help of the numbers in the bubbles.

The visual shown above would not be the best one to facilitate accurate magnitude comparisons.  It has been shown by studies on the subject that using areas for comparison of magnitude is not very effective.  A better way for such a goal would be usage of linear line segments from a common baseline.  Also, it helps  to bring the items to be compared close to each other.

The clustered bar chart on the right provides a better visual for magnitude comparisons of sales by region between company and its peer.  Putting the company and peer values adjacent allows for better comparisons which are clearly visible even without the numbers on the bars.

Bar Chart Code:

ods graphics / reset width=4in height=2.5in noborder imagename='Sales_Bar';
title 'Sales by Region';
proc sgplot data=sales noborder;
styleattrs datacolors=(darkgreen gold);
vbarparm category=Category response=value / group=Group
groupdisplay=cluster dataskin=pressed datalabel
datalabelattrs=(color=black size=8 weight=bold);
keylegend / title='';
yaxis display=(noline noticks nolabel) grid;
xaxis display=(nolabel noticks);
run;

Linear distance from common baseline along with proximity of items to be compared create a better graph.  I am thinking it would be a good idea to have a thread for topics on how to create a "Good Graph".  A bit close to "Good Grief", made famous by Peanuts.  🙂

Full SAS 9.40M3 code:  Magnitude


CTSPedia Clinical Graphs - Subgrouped Forest Plot

The advent of the AXISTABLE statement with SAS 9.4 has made it considerably easier to create graphs that include statistics aligned with x-axis values (Survival Plot) or with the y-axis (Forest Plot).  This statement was specifically designed to address such needs, and includes the options needed to control the text attributes of the data and also any indentations that may be needed.

In previous posts, I have described the use of these new statements, but it seems I did not provide a full program for the "Subgrouped Forest Plot", one of many popular clinical graphs.  Here we can use the YAXISTABLE statement available in SGPLOT for this graph.

Here is the graph I created using the SGPLOT procedure.  Click on the graph to see a higher resolution image.  The details for the graph are as follows:

  • A Hazard Ratio plot in the middle.
  • Study names on the far left.  The study names are subgrouped, with label and values.  The labels have bolder font and the values are indented.
  • Number of patients with % on the left.
  • Event rates for PCI Group, Therapy Group and p-value on the right.
  • Note the use of Unicode arrow characters for the annotations on the axis created using the TEXT plot statement.  This is done using the ability to add Unicode values to a User Defined Format in SAS 9.4M3.
  • SG Annotation code is NOT used in this graph.

SAS 9.40M3 code:

title j=r h=7pt '4-Yr Cumulative Event Rate';
ods graphics / reset width=5in height=3in imagename='Subgroup_Forest_SG_94';
proc sgplot data=forest_subgroup_2 nowall noborder nocycleattrs dattrmap=attrmap noautolegend;
  format text $txt.;
  styleattrs axisextent=data;
  refline ref / lineattrs=(thickness=13 color=cxf0f0f7);
  highlow y=obsid low=low high=high;
  scatter y=obsid x=mean / markerattrs=(symbol=squarefilled);
  scatter y=obsid x=mean / markerattrs=(size=0) x2axis;
  refline 1 / axis=x;
  text x=xl y=obsid text=text / position=bottom contributeoffsets=none strip;
  yaxistable subgroup / location=inside position=left textgroup=id labelattrs=(size=7)
                      textgroupid=text indentweight=indentWt;
  yaxistable countpct / location=inside position=left labelattrs=(size=7) valueattrs=(size=7);
  yaxistable PCIGroup group pvalue / location=inside position=right pad=(right=15px)
                      labelattrs=(size=7) valueattrs=(size=7);
  yaxis reverse display=none colorbands=odd colorbandsattrs=(transparency=1) offsetmin=0.0;
  xaxis display=(nolabel) values=(0.0 0.5 1.0 1.5 2.0 2.5);
  x2axis label='Hazard Ratio' display=(noline noticks novalues) labelattrs=(size=8);
run;
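
The program above refers to a discrete attribute map (dattrmap=attrmap) through textgroupid=text, which lets the first YAXISTABLE style the subgroup labels and the values differently.  That data set is not shown in this post.  A sketch of what it might look like follows; it assumes the id column in forest_subgroup_2 holds 1 for label rows and 2 for value rows, and the colors and sizes are illustrative only.

```sas
/* Assumed attribute map: bold text for subgroup labels, normal for values */
data attrmap;
  length id $8 value $4 textcolor $12 textweight $8;
  id = 'text'; value = '1'; textcolor = 'black'; textsize = 7; textweight = 'bold';   output;
  id = 'text'; value = '2'; textcolor = 'black'; textsize = 5; textweight = 'normal'; output;
run;
```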

I have also added this code to the CTSPedia page for Subgrouped Forest Plot.

Full SAS 9.40M3 code:  Subgrouped_Forest_Plot_SG_94

 


CTSPedia Clinical Graphs - Volcano Plot

A Volcano Plot is a type of scatter-plot that is used to quickly identify changes in large data sets composed of replicate data.  In the clinical domain, a Volcano Plot is used to view Risk difference (RD) of AE occurrence (%) between drug and control by preferred term.

One example of a volcano plot, P-risk Odds Ratio of Treatment Emergent Adverse Events, was contributed by Qi Jiang and is included in the list of Clinical Graphs on the CTSPedia web site.  The graph is used for safety signal screening and AE data display.  It allows investigators to evaluate AE risks using both estimates of risk difference and p-values.  Optional reference lines are added so that AEs with large RD and small p-values can be identified in the upper right corner of the plot.

I took the data from the CTSPedia example and used the SGPLOT procedure to create the graph shown on the right.  Click on the graph for a higher resolution view.

The graph plots the log of the p-values by the log of the Odds Ratio by AESOC.  For this graph, the AE terms for p-values > 0.05 are labeled.  The scatter plot is grouped by AESOC, and I have set a format to display 10 characters in the legend, which is placed on the right of the plot.

title 'P-risk (Odds Ratio) Plot of Treatment Emergent Adverse Events at PT Level';
proc sgplot data=sample2;
  format aesoc $10. text $txt.;
  label p_rr='Fisher Exact p-value';
  label rr='Odds Ratio';
  scatter x=rr y=p_rr / group=aesoc datalabel=label name='a';
  refline 1 / axis=x lineattrs=(pattern=shortdash);
  refline 0.05 / axis=y lineattrs=(pattern=shortdash);
  inset ("Placebo:" = "n/N(%)=&inset1"
              "Treatment:" = "n/N(%)=&inset2") / noborder position=topleft;
  text x=xlbl y=ylbl text=text / position=bottom contributeoffsets=(ymax);
  yaxis reverse type=log values=(1.0 0.1 0.05 0.01 0.001) offsetmin=0.1;
  xaxis type=log values=(0.1 1 2 5 10) valueshint;
  keylegend 'a' / across=1 position=right valueattrs=(size=6);
run;

It was not obvious to me how the inset values were computed.  These would be computed in a data step and inserted into macro variables for display in the graph.  So I just assigned the values into macro variables and used those in the INSET statement.
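
For readers who do want to derive the inset text from data, the data step route might look like the sketch below.  The data set arm_counts, its variables and the counts in it are hypothetical, made up for illustration; only the macro variable names inset1 and inset2 come from the program above.

```sas
/* Hypothetical summary data: one row per arm (made-up counts) */
data arm_counts;
  input arm :$12. n total;
  datalines;
Placebo 50 100
Treatment 60 100
;
run;

/* Build the "n/N(%)" text for each arm and store it in a macro variable */
data _null_;
  set arm_counts;
  length txt $40;
  txt = cats(n, '/', total, '(', put(100 * n / total, 5.1), ')');
  if arm = 'Placebo' then call symputx('inset1', txt);
  else if arm = 'Treatment' then call symputx('inset2', txt);
run;
```

CALL SYMPUTX strips leading and trailing blanks before storing the value, so the text drops cleanly into the quoted strings of the INSET statement.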

Also note the use of the Unicode characters for the left and right arrows in the "Favors" labels.  We do this by storing these texts as "T" and "P" in the data, and applying a user defined format that includes Unicode values in the text.
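
A format of this kind might be defined as in the sketch below.  The format name $txt and the codes "T" and "P" come from the program above; the label wording, and which arrow goes with which code, are assumptions for illustration.  The arrows are U+2190 (left) and U+2192 (right).

```sas
/* Sketch of a user defined format with Unicode arrows (SAS 9.4M3) */
/* Label wording is assumed, not taken from the original program   */
proc format;
  value $txt
    'T' = "Favors Treatment (*ESC*){unicode '2192'x}"
    'P' = "(*ESC*){unicode '2190'x} Favors Placebo";
run;
```

Because the arrows live in the format and not in the data or in annotation, the same TEXT statement works unchanged when the labels are reworded.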

If you compare with the code on the CTSPedia site, you will see the SGPLOT code is very concise and does not require use of Annotation.  This makes the graph code more robust and usable with other data.

Full SAS 9.40M3 SGPLOT code:  VolcanoPlot


CTSPedia Clinical Graphs - Heatmap of Benefit

Let us continue our review of the Clinical Graphs included in the CTSPedia repository.  Today, I noticed this Heatmap of Benefits and Risks over Time for Subjects in a study by Treatment, submitted by Max Cherny using "R" code.  I thought it would be a good exercise to see how to build this graph using SAS.  You may notice, I have already added the SAS version to the CTSPedia Heat Map page.  This graph was intentionally created to mimic the graph by Max to avoid any variability.

Here is the same graph created using SAS SGPANEL procedure.  In this example I have softened the colors used by Max.  The graph is pretty much the same.  Click on graph for a higher resolution view.

The full code for generating the data and rendering the graph is linked below.  You will see that most of the code is needed to generate the data set as I was unable to run Max's R code to do so.

The SGPANEL code necessary to create the graph itself is very concise, under 10 lines.

Here is the SGPANEL code:

proc sgpanel data=HeatMap ;
  format value benefit.;
  styleattrs datacolors=(cx3faf3f yellow lightgray lightred gray);
  panelby trt / novarname spacing=10 headerbackcolor=lightgray;
  heatmapparm x=week y=subject colorgroup=value / name='a';
  colaxis integer offsetmin=0 offsetmax=0 display=(noline);
  rowaxis values=(10 to 100 by 10) min=1 valueshint
                 offsetmin=0 offsetmax=0 display=(noline);
  keylegend 'a' / valueattrs=(size=7) noborder;
run;

The full code for the graph is linked below.  I have used some options to set the size of various fonts to mimic the "R" look for comparison.

Full SAS9.4 SGPANEL code:  HeatMap2


Dial KPI using SGPLOT

Last week I was at PharmaSUG 2016, where I presented a 1/2 day seminar on creating Clinical Graphs using SAS.  I was gratified to have an enthusiastic audience of about 28 attendees and we had a great interactive session.  I also presented a paper on Clinical Graphs Using SAS.  More on PharmaSUG 2016 soon.

While at PharmaSUG, I read a post by a user interested in creating KPI dials of the type that can be found on the SAS Support Page.  User wanted to know if such KPI Dials can be created using GTL.  Interestingly, a couple of attendees at the conference also expressed interest in it.  Also, Kirk Lafler demonstrated a way to create Dashboards using Base SAS software.  I figured it would be an interesting exercise to create one using SGPLOT procedure and may actually be useful for multiple applications.

The result is shown on the right.  Click on the dial for a higher resolution view.  I wanted to create something that would look like a real dial, and not with a "flatland" rendered look.  The dial on the right has three zones with ticks and values, a needle showing the current value, and the value is also displayed at the bottom of the dial.

What distinguishes this from the ordinary "flatland" graphic is the nice shiny outer chrome ring, and reflective highlights on the surface of the dial simulating a glass cover.  It may be nice to add a small window frame for the value.

The chrome ring is displayed using an image of a ring with transparent outer regions.  The inner regions need not be transparent as it will be covered with the dial details.

A second version is shown on the right.  Here I have used a pastel shade for the dial background and set the value to 23.  The intensity of the reflection map is reduced by increasing its transparency.

The full program for computing the data and the SGPLOT program is attached below.  It uses two image files, one for the dial background and one for the dial reflection map.  If you want to run the program, be sure to use images that will provide similar results.

It should be possible to write a macro to create such Dial KPIs on the fly, where you can define the number of ranges and their values and the value for the dial.  Should be doable by converting the code attached into a macro.

SAS 9.4 SGPLOT Code:  KPI_Dial_2

 


CTSPedia Graphs - Dot Plot of Primary SOC

CTSPedia is a valuable resource for clinical research "... initiated to form an information resource created by researchers for researchers in clinical and translational science to share valuable knowledge amongst local researchers".

This site includes a section on statistical graphs where you can find valuable information and a library of standardized graphs for the Clinical Trials industry.    The library includes many graphs, some having "R" code and some "SAS".  Many SAS graphs are now a bit dated, using SAS 9.2 features, and could be refreshed using the newer SAS 9.4 features.

One graph of interest is the Dot Plot of Primary SOC by Matt Soukup created using R.  I thought it would be a good exercise to do the same graph using SAS 9.4.

I was not able to run Matt's R program to get the data, but it was easy enough to enter the pct values by hand.  The resulting graph is shown on the right and the SAS code is below.  Click on the graph for a high resolution image.

proc sgplot data=dot_sort noborder;
  dropline y=classification x=pct / dropto=y;
  scatter y=classification x=pct /
               markerattrs=(symbol=circlefilled);
  yaxistable classification pct / location=outside
              position=left pad=10 valuejustify=right ;
  xaxis min=0 grid offsetmin=0 label='Relative Frequency of an Event';
  yaxis fitpolicy=none valueattrs=(size=7) reverse display=none;
run;

My goal is to get as close as possible in appearance to the graph in the CTSPedia library.  A user can always further customize it.  The graph above is almost identical to the graph shown in the CTSPedia link, and easy to create using the YAxisTable.

I am thinking it would be useful to write up a thread of such "CTSPedia" graphs which can be easily searched in the blog using the keyword "CTSPedia".

I will be at PharmaSUG 2016 next week in Denver.  I look forward to seeing you there.   I am presenting a 1/2 day seminar on Saturday on "Clinical Graphs using SAS", a paper and a SuperDemo on the same topic.  Stop by and say "Hello" at the "Meet the presenters" booth if you have time.

Full SAS 9.4 SGPLOT code:  Dot_Plot_SAS


Directed Link Networks

A few weeks ago I posted an article describing how to display simple Network Diagrams with Curved Links using SGPLOT procedure.  The key requirement is that the node positions have to be computed by user.  Often, for simple diagrams, nodes can be positioned using a simple layered layout.  Separately, I also posted some articles on creating InfoGraphs using SAS.

In this article, I am combining these ideas to display a diagram with directed curved links.  In the graph on the right, I have replaced the nodes with icons for the persons involved in the social network.  These icons could represent people or types of services, like "Providers" and "Patients".

As described in the previous article, the node and link data needs to be provided.

Node data includes "NodeId", "Group" for type of node, the "Image Icon" that will represent the group, and the NodeX and NodeY.

The link data includes "LinkId", and the "From" and "To" node ids.  Optionally, we could include some measure of the link response.  In this example, I have skipped link response.  Arrowheads are added to display the direction of the link flow.

SAS 9.4 program:

proc sgplot data=network2(where=(linkid ne 10)) noautolegend aspect=1;
  symbolimage name=Group1 image="&file1";
  symbolimage name=Group2 image="&file2";
  styleattrs datasymbols=(group1 group2 group3 group4 group5
                    group6 group7 group8 group9) backcolor=cxfaf3f0;
  spline x=xls y=yls / group=LinkId lineattrs=graphdatadefault arrowheadpos=end 
      arrowheadshape=filled arrowheadscale=0.5;
  scatter x=xn y=yn / group=group markerattrs=(size=40)
     dataskin=sheen datalabel=name datalabelpos=bottom;
  xaxis min=0 max=4 display=none offsetmin=0.1 offsetmax=0.1;
  yaxis min=0 max=4 display=none offsetmin=0.15 offsetmax=0.15;
run;

In the above program, I have trimmed some of the SymbolImage statements to conserve space.  We need nine such statements to cover all the icons as shown in the code linked below.


Double Links.
The diagram above only has single links.  So, having curved links is mainly an aesthetic feature.  The curved links are computed as a 3 node curve displayed using the SPLINE statement with arrowheads.  The middle node is computed using vector math as shown in the full code linked below.

However, curved links are essential for diagrams with double links.  In the diagram on the right, "Ted" and "Bill" have links in both directions.  In such cases, we need to curve the links so we can see each link and its arrowhead clearly.  The nice thing is that the algorithm for computing the offset automatically goes in the other direction if the "From" and "To" nodes are reversed.

 

The code uses a factor "Off" to determine the "Bend" of the curve.  Off=15 means the bend is 15% of the length.  The graph on the right uses a "Bend" of 25% as indicated in the title.

The code also uses "FS" as the factor making the link end stop short of the node.  The value of FS is the % of the distance to be shortened in the direction of the vector to the midpoint.  This matters because shortening along the straight line toward the original end point does not look right for curved links.

Shortening the links is important as we do not want to hide the arrowhead when we draw the icons or nodes on top.
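
The bend and shortening described above can be sketched for a single link as follows.  This is one reading of the description, not the code in the linked program: the end point coordinates and the values of Off and FS are made up, and the variable names other than LinkId, xls and yls are hypothetical.

```sas
/* Hypothetical link from (1,1) to (3,2); Off and FS are percentages */
data link_curve;
  LinkId = 1;
  x1 = 1; y1 = 1;      /* "From" node */
  x2 = 3; y2 = 2;      /* "To" node   */
  Off = 15;            /* bend, as % of the link length                   */
  FS  = 10;            /* shortening, as % of distance to the middle node */

  dx = x2 - x1;
  dy = y2 - y1;

  /* Middle node: midpoint moved along the perpendicular (-dy, dx).  */
  /* That vector has the same length as the link, so scaling it by   */
  /* Off/100 gives a bend of Off% of the length.                     */
  xm = (x1 + x2) / 2 - (Off / 100) * dy;
  ym = (y1 + y2) / 2 + (Off / 100) * dx;

  /* Three points per link, with both ends pulled toward the middle node */
  xls = x1 + (FS / 100) * (xm - x1);  yls = y1 + (FS / 100) * (ym - y1);  output;
  xls = xm;                           yls = ym;                           output;
  xls = x2 + (FS / 100) * (xm - x2);  yls = y2 + (FS / 100) * (ym - y2);  output;

  keep LinkId xls yls;
run;
```

Swapping the "From" and "To" nodes negates dx and dy, so the middle node lands on the opposite side of the straight line.  That is why the two links of a double link separate without any special handling.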

Just for comparison, the diagram below has a "Bend" of 40%.

 

Full SAS 9.4 code:  Directed_Links_Icons

Icon Zip file:  Node_Icons

To run the code, you will need to put the icons in a folder on your computer, and then put the folder name in place of <Icons folder> in the code.

 


Coffee Recipes

For a long time, Starbucks represented the good cup of coffee to me, with me paying upwards of $4 for a Latte.  But on a recent visit to San Francisco, my son introduced me to a few other options.

Philz crafts a great cup of java, with the barista making the coffee right there in front of you, just like you want it, and they'll go the distance to get it right.  I believe I got the "Jacob's Wonderbar" with heavy cream.  It was great.

Then, I was introduced to Blue Bottle, another great (and maybe local) coffee bar.  The ambiance was great.  I got a Cafe Mocha.  The Barista did a great design on the top by hand as shown on the right.  Since the coffee has chocolate, it was mildly sweet already.  It was served in a nice porcelain cup that was very enjoyable.

All this turned out to be a great lead-in to the article I had planned on creating an interesting graphic for coffee recipes, as a continuation of the series on "Info-Graphs".

Now, to be sure, there are many great coffee recipes, but I limited my task to displaying four common ones in a panel using the data described below.  The idea is that this is all data driven, and more recipes can be easily added to create more graphs.

The four recipes are listed, each with its ingredients.  "Expresso" has only Expresso, "Macchiato" has Expresso and milk foam, and so on.  The fraction of the volume of the cup is shown under the "Value" column.  So, "Cafe Latte" has 25% Expresso, 50% steamed milk and 25% milk foam.
First, we start with a plain stacked graph showing the recipes for each coffee type.  The code is shown below.  I used a HighLow plot instead of the VBarParm because I need the flexibility to raise the bottom of the bar to adjust to the mask later.  For this, the two modified columns "Low" and "High" are used.
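
As a sketch of how the "Low" and "High" columns could be derived, the step below stacks the ingredient fractions within each recipe and also computes the midpoint used later to place the labels.  It is a minimal sketch and not the linked program: the data set name recipes is hypothetical, only the Cafe Latte fractions quoted above are entered, and the extra adjustment that raises the bottom of the bar for the mask is not included.

```sas
/* Hypothetical input holding the Cafe Latte recipe from the text, */
/* listed from the bottom of the cup to the top                    */
data recipes;
  infile datalines dlm=',';
  length name group $20;
  input name $ group $ value;
  datalines;
Cafe Latte,Expresso,0.25
Cafe Latte,Steamed Milk,0.50
Cafe Latte,Milk Foam,0.25
;
run;

/* Running totals within each recipe give the segment limits */
data Coffee;
  set recipes;
  by name notsorted;
  retain high;
  if first.name then low = 0;   /* first ingredient starts at the bottom */
  else low = high;              /* next one starts where the last ended  */
  high = low + value;
  mid  = (low + high) / 2;      /* label position for the TEXT plot      */
run;
```

More recipes can be appended to the input in the same layout, as long as the rows of each recipe are kept together.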

The result is shown on the right.  The graph includes a legend to identify the ingredients.  A discrete attributes map is used to set the preferred colors for each ingredient.

title j=l h=1 'Coffee Recipes';
proc sgplot data=Coffee noborder   noautolegend nocycleattrs
                     dattrmap=attrmap pad=(bottom=10pct);
  highlow x=name low=low high=high / group=group type=bar
                 nooutline barwidth=0.7 name='a' attrid=Coffee;
  keylegend 'a';
  xaxis display=( nolabel noticks) offsetmin=0.12 offsetmax=0.12;
  yaxis display=none min=0 max=1 offsetmin=0.15 offsetmax=0.42 values=(0 0.5 1);
run;

Next, we improve the graph by labeling each ingredient directly in the bar chart using a Text plot.  Now, we can do away with the legend.  The information is easier to consume with the ingredients labeled directly, thus reducing eye movement to decode each using the legend.

Note the use of "Backlight" for the Text plot.  This ensures that text that is light on light can still be easily read.  Click on the graph for a higher resolution image.

title j=l h=1 'Coffee Recipes';
proc sgplot data=Coffee noborder noautolegend nocycleattrs
                     dattrmap=attrmap pad=(bottom=20pct);
  highlow x=name low=low high=high / group=group type=bar
                  nooutline barwidth=0.7 name='a' attrid=Coffee;
  text x=name y=mid text=group / backlight=0.4
              textattrs=(size=6 color=white);
  xaxis <options>;
  yaxis <options>;
run;

Now comes the fun part.  We use the SAS 9.4 feature to define a marker from an image icon.  In this case, we use an icon of the coffee cup.  Then, we can layer a SCATTER plot over the HighLow to create the graph shown on the right.

We use an image where the middle portion of the cup is transparent.   The cup part is in dark color and the outer pixels of the image are white.  When this icon is layered over the bar, the bar colors show through the transparent portion of the cup, thus producing an interesting and memorable graphic.

Finally, we use another icon to display the steam rising from the coffee.  I hope you enjoy this cup of Java.

Thanks to Riley Benson, our super UX expert who helped me clean up the icons so the result was suitable for publishing.

See full code below for all the details.  I have also attached a zip file of the icons needed for the graph.

To run the program, you will need to put the icons in a folder and supply the full path name to that folder in place of the <your folder> in the code for "Cup" and "Steam" macro variables.

Full SAS 9.4 Code:  Coffee

Icon ZIP file:  Icons

 

 

 


Displaying Group Values on the Axis

Recently a user was working with the HBAR statement with cluster groups with SG procedures.  User wanted to see the group values on the axis.  SGPLOT does not display multi level axes as these are shared with different plot types.  However, with SGPLOT, there is often a way to get what you want.

As frequent readers of this blog know, the real power of SGPLOT is in the myriad ways you can combine different compatible plot types together in one graph to create just the graph you need.  With SAS 9.4, your options are even greater.  Let us see what we can do in this case.

Here is a basic cluster grouped bar chart in SGPLOT for the SASHELP.CARS data set.  We are viewing an HBAR of Response=mpg_city by Category=Type and Group=Origin.

The category values are displayed on the Y axis.  All the group values for each category are displayed within each category, clustered around the tick mark.  Each group value is colored by the group, and the values are displayed in the legend.

title 'Mileage by Type and Origin';
  proc sgplot data=sashelp.cars(where=(type ne 'Hybrid')) noborder nowall;
  hbar type / response=mpg_city stat=mean group=origin groupdisplay=cluster
           dataskin=pressed filltype=gradient baselineattrs=(thickness=0);
  xaxis display=(noline noticks nolabel) grid;
  yaxis display=(nolabel);
run;

The user wants to see the group values on the axis itself.  To do this in SGPLOT, we can layer the values on top of each bar using a TEXT plot (SAS 9.4) or a SCATTER plot with MARKERCHAR option (SAS 9.3).

However, note that the HBAR statement does not allow layering of other basic plot types with it.  So, to do this, we have to first summarize the data ourselves using the MEANS procedure.  Then, we will use the HBARPARM statement to draw the pre-summarized data with the group values.  Click on the graph for a higher resolution image.

/*--Summarize the data by Type and Origin--*/
proc means data=sashelp.cars(where=(type ne 'Hybrid')) noprint;
  class type origin;
  var mpg_city;
  output out=cars(where=(_type_ > 2))
               mean=Mileage;
  run;

/*--Add x and Y label locations--*/
data cars;
  set cars;
  xlbl=0.1; ylbl=0.1;
run;

/*--HBAR with cluster groups--*/
ods graphics / reset width=5in height=3in imagename='HBarChart';
title 'Mileage by Type and Origin';
proc sgplot data=sashelp.cars(where=(type ne 'Hybrid')) noborder nowall;
  hbar type / response=mpg_city stat=mean group=origin groupdisplay=cluster
            dataskin=pressed filltype=gradient baselineattrs=(thickness=0);
  xaxis display=(noline noticks nolabel) grid;
  yaxis display=(nolabel);
run;

If the user really wants the group values displayed on the axis like in GCHART, we can use the AXISTABLE to draw the values to the left of the bars as shown on the right.  Now, the label for the group values is also displayed at the top of the axis.  We have removed the legend, and could also have changed all the bars to a single color if needed.
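
The listing that follows this paragraph draws the group values with a TEXT plot.  An AXISTABLE version of the same graph might look like the sketch below.  It is adapted from the last HBARPARM program in this post by dropping the category axis table and keeping the regular Y axis, so treat it as an approximation of the graph described here and not as the original code.  It uses the summarized cars data set created by the MEANS procedure step above.

```sas
/* Sketch: group values drawn by an axis table inside the data area */
title 'Mileage by Type and Origin';
proc sgplot data=cars noborder nowall noautolegend;
  hbarparm category=type response=mileage / group=origin groupdisplay=cluster
           dataskin=pressed filltype=gradient baselineattrs=(thickness=0);
  /* One value per bar, clustered the same way as the bars */
  yaxistable origin / class=origin classdisplay=cluster location=inside position=left
           valuejustify=right valueattrs=(size=6) labelattrs=(size=7);
  xaxis display=(noline noticks nolabel) grid;
  yaxis display=(nolabel);
run;
```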

title 'Mileage by Type and Origin';
proc sgplot data=cars noborder nowall noautolegend;
  hbarparm category=type response=mileage / group=origin groupdisplay=cluster
      dataskin=pressed filltype=gradient baselineattrs=(thickness=0);
  text y=type x=xlbl text=origin / group=origin groupdisplay=cluster
       textattrs=(color=black size=7) position=right contributeoffsets=none;
  xaxis display=(noline noticks nolabel) grid;
  yaxis display=(nolabel);
run;

For a consistent look, we can do the same treatment for the category variable using another axis table, and suppress the Y axis entirely as shown on the right.

Note, now we have the category axis on the outside.  The group values are within each category tick value as indicated by the alternate color bands.  The labels for the category and group axis are displayed at the top of each column of values.

title 'Mileage by Type and Origin';
proc sgplot data=cars noborder nowall noautolegend;
  hbarparm category=type response=mileage / group=origin groupdisplay=cluster
          dataskin=pressed filltype=gradient baselineattrs=(thickness=0);
  yaxistable type / location=inside position=left
         valuejustify=right;
  yaxistable origin / class=origin classdisplay=cluster location=inside position=left
        valuejustify=right valueattrs=(size=6) labelattrs=(size=7);
  xaxis display=(noline noticks nolabel) grid;
  yaxis display=none colorbands=odd colorbandsattrs=(transparency=0.5);
run;

With a VBAR, adding group values can be a bit tricky as there may not be enough space to display all the group values in the space available.  However, we can work around this issue by rotating the group values as shown on the right.

While I have shown you some ways to get an alternative look for the clustered bar chart, one can be sure you can customize the graph to your own specifications by combining the plot statements as you need.

title 'Mileage by Type and Origin';
proc sgplot data=cars noborder nowall ;
  vbarparm category=type response=mileage /  group=origin groupdisplay=cluster
         dataskin=pressed filltype=gradient baselineattrs=(thickness=0);
  text x=type y=ylbl text=origin / group=origin groupdisplay=cluster rotate=90
         textattrs=(color=black size=7) position=right contributeoffsets=none;
  yaxis display=(noline noticks nolabel) grid;
  xaxis display=(nolabel);
run;

Full SAS 9.4 program:  Group_Labels
