Box Plot with URL links

Last week I attended the 2nd Conference on Statistical Practice in New Orleans.  It was a great opportunity to meet many users of SAS, R and other software and hear about their projects in applied statistics.  I will write up my feedback on this conference soon.

In the meantime, a user chimed in on the SAS Communities page about the need to add URL links to a Box Plot, such that each box would drill down to a page provided by the user with more details about that category.  The BOXPLOT statement in GTL does not support a URL option like some of the other plot statements.  We have this feature on request and will address it in the near future.

The user wondered whether it made sense to overlay a scatter plot of the mean value for each box, with the URL option used to drill to the web page.  Far from being "silly" (as the user put it), this is exactly the right way to deal with many such situations in GTL or SG Procedure programming, where other plot statements can often be leveraged to do something not supported by the statement you are using.

My initial suggestion was to use a transparent Bar Chart instead, as that would cover the entire width of the category, and provide more space for clicking.  Also, the cursor change over the entire area of a category would indicate that this can be drilled into.  On further discussion, we realized a Bar Chart is not suitable as it enforces the inclusion of "zero" response value, otherwise not present in the graph.

With SAS 9.3, the way to go is to use the HIGHLOW bar as the transparent overlaid plot to provide the drill down feature.  Normally, you would make such a plot fully transparent, but I set it to 80% transparent just so we can see the result.

Here is the SAS 9.3 code:

/*--Define the template--*/
proc template;
  define statgraph BoxURL_1;
    begingraph;
      layout overlay / yaxisopts=(label='Mileage');
	 boxplot x=type y=mpg_city /  tip=(none);
	 highlowplot x=type low=mpg_min high=mpg_max / type=bar display=(fill)
            url=url datatransparency=0.8 rolename=(url=url) tip=(url);
      endlayout;
    endgraph;
  end;
run;
 
/*--Render graph to HTML file--*/
ods html file='BoxURL_1.htm';
ods graphics / reset imagemap=on width=5in height=3in imagename='BoxURL_1';
proc sgrender data=cars template=BoxURL_1;
run;
ods html close;

Note the following features in the graph above, which is a screen capture from the IE Browser:

  1. Each box plot has a faint gray box around it drawn by the HIGHLOW plot with DATATRANSPARENCY=0.8.  In the real use case, we would use a transparency of 1.0.
  2. The tips for the box are turned off, and for the HIGHLOW plot, we set tip=(url) to display the name of the page to be drilled to if clicked.
  3. In a data step preceding the graph statements, we compute the min and max value for each category value, and set that as the mpg_min and mpg_max values for only one observation per type.  Other values are set to missing.
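The data preparation described in item 3 could be sketched as follows.  This is only one way to do it, not necessarily what the attached program does: the intermediate data set names, the tmin and tmax variables, and the URL pattern are assumptions made for illustration.

```sas
/*--Sketch only: one way to build the 'cars' data set for the template--*/
proc sort data=sashelp.cars(keep=type mpg_city) out=cars_sorted;
  by type;
run;

/*--Min and max of mpg_city for each type--*/
proc means data=cars_sorted noprint;
  by type;
  var mpg_city;
  output out=range(keep=type tmin tmax) min=tmin max=tmax;
run;

data cars;
  merge cars_sorted range;
  by type;
  length url $64;
  /*--Only the first observation of each type carries the bar and the link--*/
  /*--mpg_min, mpg_max and url stay missing on all other observations--*/
  if first.type then do;
    mpg_min=tmin;
    mpg_max=tmax;
    url=cats('http://www.example.com/', type, '.html');  /* hypothetical target */
  end;
  drop tmin tmax;
run;
```

Because mpg_min, mpg_max and url are assigned in the data step rather than read from a data set, they are reset to missing on every iteration, which gives exactly one HIGHLOW bar per category.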

We can make other changes, as shown in the second example included in the attached program, where we put the HIGHLOW plot behind the box and turned on the tooltips for the box.  Now, when you mouse over the box element, you will see the tips for the box values.  But when you mouse over the HIGHLOW bar, you will see the drill target.

Full SAS 9.3 program:  Box_URL


GTL Layouts

The Graph Template Language (GTL) gives you the ability to create complex graphical layouts.  We have seen how to create a regular grid of cells based on one or more classification variables using the SGPANEL procedure.  Each cell contains the same type of plot.  This topic was covered in Dan's article on Sorting Paneled Graphs.  The SGPANEL procedure essentially uses the related GTL DATALATTICE layout behind the scenes.

But what if you want to create a specific non-regular layout of different plots?  This is where you need to use the GTL LATTICE layout.  This layout provides you powerful and flexible ways of arranging your graphs with or without common axes or common regions.  In this article we will go over some of the features of this layout.

Simple Lattice Layout:

Here is an example of a simple 2-cell side by side layout.  Each cell can be populated with its own plots, essentially any combinations of compatible plot types in a Layout Overlay.  Here I have shown only the layout.

GTL Code for simple layout:

proc template;
  define statgraph Simple_Layout;
    begingraph;
      entrytitle 'Blood Pressure';
      layout lattice / columns=2 columngutter=5;
        layout overlay / walldisplay=(outline)
               xaxisopts=(display=none) yaxisopts=(display=none);
          scatterplot x=weight y=systolic / markerattrs=(size=0);
	  entry halign=center textattrs=(size=20) "1" / valign=center;
        endlayout;
        layout overlay / walldisplay=(outline)
               xaxisopts=(display=none) yaxisopts=(display=none);
	  scatterplot x=weight y=diastolic / markerattrs=(size=0);
	  entry halign=center textattrs=(size=20) "2" /  valign=center;
        endlayout;
      endlayout;
    endgraph;
  end;
run;
 
proc sgrender data=sashelp.heart template=Simple_Layout;
run;

Here are the salient points of the code above:

  • We have used one LAYOUT LATTICE with an ENTRYTITLE.
  • The layout has two columns, with a gutter between them.
  • Each cell contains a SCATTERPLOT and an ENTRY.
  • The scatter plot is required to make the layout code work, but since we want to see the empty space, we have set marker size to zero.
  • The entry statement is used to display the cell numbers.
  • The display of the axes has been suppressed.
  • To create any output, we need to run the template with a real data set.

Simple Lattice Layout with Plots:

Here we have taken the next step of populating each cell with the plots.  Here, the content of each cell is a single scatter plot and a legend, but any combination of compatible plots can be used within a Layout Overlay.

GTL Code for simple layout with plots:

proc template;
  define statgraph Simple_Layout_Graphs;
    begingraph;
      entrytitle 'Blood Pressure by Weight';
      layout lattice / columns=2 columngutter=5;
        layout overlay / xaxisopts=(griddisplay=on)
                         yaxisopts=(griddisplay=on);
	  scatterplot x=weight y=systolic / group=sex name='s';
	  discretelegend 's';
        endlayout;
        layout overlay / xaxisopts=(griddisplay=on)
                         yaxisopts=(griddisplay=on);
	  scatterplot x=weight y=diastolic / group=sex name='d';
	  discretelegend 'd';
        endlayout;
      endlayout;
    endgraph;
  end;
run;
 
proc sgrender data=sashelp.heart template=Simple_Layout_Graphs;
run;

Here are the salient points of the code above:

  • Each cell is populated with a scatter plot of response by weight and sex.
  • Each cell has a complete graph, with axes and legends.
  • Note, the two Y axes are different and NOT uniform.

Lattice Plots with Common Axes and Legends:

Now, let us improve the graph by using a common Y axis, so we can see the relationship between the Systolic and Diastolic blood pressure.  This also saves some space and reduces potential confusion.

GTL Code for layout with common axes and legends:

proc template;
  define statgraph Layout_CommonAxis_Sidebar;
    begingraph;
      entrytitle 'Blood Pressure by Weight and Sex';
      layout lattice / columns=2 columngutter=5 rowdatarange=union;
        rowaxes;
	rowaxis / griddisplay=on label='Blood Pressure';
        endrowaxes;
 
        column2headers;
	entry 'Systolic';
	entry 'Diastolic';
        endcolumn2headers;
 
        layout overlay / xaxisopts=(griddisplay=on);
	  scatterplot x=weight y=systolic / group=sex name='s';
        endlayout;
        layout overlay / xaxisopts=(griddisplay=on);
	  scatterplot x=weight y=diastolic / group=sex name='d';
        endlayout;
        sidebar / spacefill=false;
	 discretelegend 'd';
        endsidebar;
      endlayout;
    endgraph;
  end;
run;
 
proc sgrender data=sashelp.heart template=Layout_CommonAxis_Sidebar;
run;

Here are the salient points of the code above:

  • The ROWDATARANGE is set to UNION.
  • A ROWAXES block is defined to create a single external Y axis.
  • The DISCRETELEGEND is placed in a SIDEBAR.
  • COLUMN2HEADERS are used to display the response variable.

Nested Layouts:

Layouts can be nested inside other layouts to create a more complex arrangement.  Here is an example of such a nested layout.  See the attached full program for the code for this arrangement.

The Subgrouped Forest Plot is an excellent example of the usage of the GTL LATTICE layout.

Full SAS Code:  GTL_Layout_Lattice


Percent VBar

Recently a reader chimed in with a question on the Do Loop article by Rick Wicklin on how to create a bar chart with percent statistics.  Rick used SAS 9.3  and the reader wanted to do the same with SAS 9.2.

For the basic (non-grouped) bar chart, the process is the same as described by Rick.  The VBAR statement only supports the FREQ, SUM and MEAN statistics.  So, to plot the percentages, you have to compute the statistics using PROC FREQ, and then plot the percent column as the RESPONSE role in the SGPLOT procedure.

Since we want to use the "PERCENT." format, we rescale the percent values in the column to the range 0.0 to 1.0 by dividing by 100.  Here is the SAS 9.2 graph and code.

proc freq data=sashelp.cars noprint;
tables Type / out=FreqOut;
run;
 
data freqout;
  set freqout;
  label pct='Percent';
  format pct percent.;
  pct=percent/100;
  run;
 
title 'Distribution by Type';
proc sgplot data=FreqOut;
  vbar type / response=pct datalabel;
  yaxis grid display=(nolabel);
  xaxis display=(nolabel);
  run;

For the second graph shown in Rick's article, things are a bit different with SAS 9.2.  The SAS 9.3 SGPLOT procedure supports the GROUPDISPLAY=CLUSTER option that displays the group values side-by-side.  This feature did not exist in SAS 9.2.

However, one can get a functionally similar graph with SAS 9.2 by using the SGPANEL procedure.  Here is the graph created using the SAS 9.2 SGPANEL procedure, and the code.

SAS 9.2 SGPANEL code:

proc freq data=sashelp.cars;
tables Origin*Type / out=FreqOut2;
run;
 
data freqout2;
  set freqout2;
  label pct='Percent';
  format pct percent.;
  pct=percent/100;
  run;
 
title 'Distribution by Type and Origin';
proc sgpanel data=FreqOut2;
  panelby type / layout=columnlattice onepanel colheaderpos=bottom
                 noborder novarname;
  vbar origin / response=pct datalabel group=origin barwidth=1;
  rowaxis grid;
  colaxis display=none;
  run;

Note, we have used "Type" as the panel variable that creates a panel with 6 cells.  Each cell is then populated with a VBAR by "Origin".  We have moved the cell header to the bottom so it looks sort of like an axis.


AE Timeline by Name

In my previous article on Adverse Event Timeline Graph, I wrote about how to create the AE timeline using SAS 9.2 code, using VECTOR plot and the MARKERCHAR option in SCATTER plot.  I  described a better way to place the labels at the lower end of the vectors.

SAS 9.3 provides an easier way to create such graphs using the HIGHLOW plot statement that supports placing labels and end caps at the low end or high end of each bar.  The plot does all the work needed to figure out the position, and so the code is very simple.

Here is the graph and the SAS 9.3 SGPLOT code:

SAS 9.3 SGPLOT Code:

title "Adverse Events for Patient Id = &pid (SAS 9.3)";
proc sgplot data=AE_Cap dattrmap=attrmap;
  format aestdate date7.;
  refline 0 / axis=x lineattrs=(color=black);
  highlow y=aeseq low=aestdy high=aeendy / type=bar group=aesev barwidth=0.8
          lowlabel=aedecod lineattrs=(color=black pattern=solid) highcap=aehicap
          attrid=Severity;
  scatter y=aeseq x=aestdate / x2axis markerattrs=(size=0);
  xaxis grid display=(nolabel) offsetmax=0.02 values=(&minday2 to &maxday by 2);
  x2axis display=(nolabel)  offsetmax=0.02 values=(&mindate2 to &maxdate);
  yaxis grid display=(noticks novalues nolabel);
  run;

As you can see in the graph above, the adverse events are displayed by sequence, and multiple events of the same name (such as DIZZINESS) are displayed independently, each with the name displayed.  It is not clear to me why this data set has two events for DIZZINESS and DERMATITIS for the same duration.  Clearly, the data needs cleaning.  The events that do not have an end date have an arrow cap at the high end.

A reader asked whether it was possible to show all events of the same name on one line.  The answer is yes, and we can do that by creating the graph by AEDECOD instead of AESEQ.   The only thing we have to do is avoid having the label displayed multiple times.

To do this, I create a new variable that I call AENAME.  This variable has the same values as AEDECOD for all first occurrences of the event.  For subsequent occurrences of the same name, AENAME has a missing value.  Here is the graph and the code.
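A minimal sketch of that step, assuming the input is the AE_Cap data set used in the first graph and that "first occurrence" means the earliest start day for each event name.  The intermediate data set name ae_sorted is an assumption; the attached program may build the variable differently.

```sas
/*--Sketch only: label just the earliest occurrence of each event name--*/
proc sort data=AE_Cap out=ae_sorted;
  by aedecod aestdy;
run;

data ae_by_name;
  set ae_sorted;
  by aedecod;
  /*--AENAME takes its length from AEDECOD and is missing unless assigned here--*/
  if first.aedecod then aename=aedecod;
run;
```

Since AENAME is created by assignment, it is reset to missing on each observation, so later occurrences of the same event get no label.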

SAS 9.3 SGPLOT code:

title "Adverse Events for Patient Id = &pid (SAS 9.3)";
proc sgplot data=ae_by_name dattrmap=attrmap;
  format aestdate date7.;
  refline 0 / axis=x lineattrs=(color=black);
  highlow y=aedecod low=aestdy high=aeendy / type=bar group=aesev barwidth=0.8
          lowlabel=aename lineattrs=(color=black pattern=solid) highcap=aehicap
          attrid=Severity;
  scatter y=aedecod x=aestdate / x2axis markerattrs=(size=0);
  xaxis grid display=(nolabel)  offsetmax=0.02 values=(&minday2 to &maxday by 2);
  x2axis display=(nolabel)  offsetmax=0.02 values=(&mindate2 to &maxdate);
  yaxis grid display=(noticks novalues nolabel);
  run;

Note, now all DIZZINESS events are in one line, and only the first one is labeled.

Full SAS 9.3 code:  AETimelineByName

 

 


Unicode Tick Values using GTL

Often it is desirable to use special Unicode characters for the tick value names on the axes.  However, SG procedures and GTL do not support Unicode strings in SAS data sets.

With SAS 9.3, the SGPLOT procedure supports annotation which does support Unicode strings.  You can create an annotation data set that specifies the text strings you want, and then position these in place of the tick values from the data.  See Dan's article on this topic:  Graphical Swiss Army Knife.

SAS 9.3 GTL supports many DRAW statements for adding custom annotations in the graph.  You can think of these as "inline annotation" statements.  These are designed to add custom annotations into your graph that cannot be added using plot statements.  In fact, the SG annotation function turns around and uses these statements to create the annotation.  So, there is no reason why you cannot use these statements too.

Here is an example of a graph showing response by categories.  Note the category name strings.

SAS 9.3 GTL Program:

proc template;
  define statgraph Unicode1;
    begingraph;
      entrytitle 'Response by Category';
      layout overlay / xaxisopts=(display=(ticks tickvalues)
                          discreteopts=(ticktype=inbetween))
                       yaxisopts=(griddisplay=on);
        barchart x=cat y=Response / stat=mean dataskin=gloss;
      endlayout;
    endgraph;
  end;
run;
 
proc sgrender data=unicode template=Unicode1;
run;

The category value strings are "SIGMA", "RSquare" and so on, and we would like to replace them with Unicode strings.   We will do this as follows:

  • Turn off the drawing of the tick values.
  • Add a bottom pad of 20 pixels to make room for the tick value strings.
  • Use inline DRAW statements in the GTL program to draw the Unicode strings.

Here is the graph and the GTL program:

Now, the appropriate Unicode strings are drawn for each category, including superscripts and subscripts.

SAS 9.3 GTL Code:

ods escapechar = '~';
proc template;
  define statgraph Unicode2;
    begingraph / pad=(bottom=20px) ;
      entrytitle 'Response by Category';
      layout overlay / xaxisopts=(display=(ticks)
                         discreteopts=(ticktype=inbetween))
                       yaxisopts=(griddisplay=on);
	barchart x=cat y=Response / stat=mean dataskin=gloss;
	drawtext '~{unicode "03a3"x}' / x='SIGMA' y=-1 anchor=top
                 xspace=datavalue yspace=wallpercent;
	drawtext 'r' {sup "2"} /  x='RSquare' y=-1 anchor=top
		         xspace=datavalue yspace=wallpercent;
	drawtext '~{unicode "03b1"x} + ~{unicode "03b2"x}' /
                 x='Alpha+Beta' y=-1
                 anchor=top xspace=datavalue yspace=wallpercent;
	drawtext '~{unicode "03c3"x}' {sub "1"} / x='Sigma1' y=-1
                 anchor=top xspace=datavalue yspace=wallpercent;
	drawtext '~{unicode "2264"x} 10'  /  x='LE10' y=-1 anchor=top
                 xspace=datavalue yspace=wallpercent;
      endlayout;
    endgraph;
  end;
run;
 
proc sgrender data=unicode template=Unicode2;
run;

Note, the drawing context for the text is DATAVALUE for X and WALLPERCENT for Y.  With this, the x coordinates can be specified as the actual value of the tick on the x axis.  The y coordinate is -1% (just below) the wall area, with text anchor point as TOP.

Inline DRAW statements are useful when creating a custom graph, where you can add the statements manually into the GTL code.  Or, you could use macro logic to generate the statements.
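As a rough illustration of the macro approach, the repeated options can be wrapped in a small macro so that each tick value needs only its text and its x value.  The macro name tickText, its parameters, and the template name Unicode3 are made up for this sketch; it assumes the same "unicode" data set and ODS escape character as the program above.

```sas
ods escapechar = '~';

/*--Hypothetical helper: emits one DRAWTEXT statement per call--*/
%macro tickText(text, xval);
  drawtext &text / x=&xval y=-1 anchor=top
           xspace=datavalue yspace=wallpercent;
%mend tickText;

proc template;
  define statgraph Unicode3;
    begingraph / pad=(bottom=20px);
      entrytitle 'Response by Category';
      layout overlay / xaxisopts=(display=(ticks)
                         discreteopts=(ticktype=inbetween))
                       yaxisopts=(griddisplay=on);
        barchart x=cat y=Response / stat=mean dataskin=gloss;
        %tickText('~{unicode "03a3"x}', 'SIGMA')
        %tickText('r' {sup "2"}, 'RSquare')
        %tickText('~{unicode "03b1"x} + ~{unicode "03b2"x}', 'Alpha+Beta')
        %tickText('~{unicode "03c3"x}' {sub "1"}, 'Sigma1')
        %tickText('~{unicode "2264"x} 10', 'LE10')
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=unicode template=Unicode3;
run;
```

The macro calls resolve to plain DRAWTEXT statements before the template is compiled, so the resulting graph should match the hand-written version.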

In general, it is easier to use SG Annotation when many annotations are to be created from the data set.  SG Annotation is available with SAS 9.3 SG Procedures and will be available for SAS 9.4 GTL.

Full SAS 9.3 code: UnicodeTickValues

 

 


Bar Chart with Target and Attribute Map

A commonly requested graph is a bar chart with response and targets.  With SAS 9.3, the SGPLOT procedure supports new "parametric" plot statements called HBARPARM and VBARPARM.  These statements are special versions of the HBAR and VBAR statements and they expect summarized data for each category or category+group combination.  Also, the HBARPARM and VBARPARM are basic plot statements, and thus they can be used along with other basic plot types like scatter, series, etc.

A targeted bar chart can be built by overlaying a parametric bar chart with a scatter plot.  Note, the data has to be summarized by you prior to use in the HBARPARM statement.  Here is the resulting graph along with the SGPLOT code.

The data used for this graph is as shown here:

SAS 9.3 SGPLOT  Code:

title 'Actual and Target Revenues by Product';
proc sgplot data=revenues noautolegend dattrmap=attrmap;
  hbarparm category=product response=actual / dataskin=gloss name='a';
  scatter y=product x=target / markerattrs=(symbol=triangleDownFilled size=10)
      name='t' legendlabel='Target' discreteoffset=-0.35 transparency=0.4;
  xaxis offsetmin=0 label='Sales (Sum)'  grid;
  yaxis display=(nolabel);
  keylegend 'a' 't' / title='';
  run;

In the data set shown above, we have computed group values as "Over performing" or not based on whether the Actual revenue exceeds the Target.  Let us use this column to color the bars.  Here is the graph and the  code:

SAS 9.3 SGPLOT code:

title 'Actual and Target Revenues by Product';
proc sgplot data=revenues noautolegend dattrmap=attrmap;
  hbarparm category=product response=actual / dataskin=gloss group=group name='a';
  scatter y=product x=target / markerattrs=(symbol=triangleDownFilled size=10)
      name='t' legendlabel='Target' discreteoffset=-0.35 transparency=0.4;
  xaxis offsetmin=0 label='Sales (Sum)'  grid;
  yaxis display=(nolabel);
  keylegend 'a' 't' / title='';
  run;
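For reference, the GROUP column used above could be derived with a data step along these lines.  This is a sketch under assumptions: revenues_raw is a hypothetical input data set holding product, actual and target, and the group strings are chosen to match the values used in the attribute map later in this article.

```sas
/*--Sketch only: classify each product by actual versus target revenue--*/
data revenues;
  set revenues_raw;            /* hypothetical input data set */
  length group $25;
  if actual > target then group='Actual (Over performing)';
  else group='Actual (Under performing)';
run;
```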

Note in the above graph, the bars are colored using the style elements GraphData1 and GraphData2.  GraphData1 is used for the first group value encountered by the program, which happens to be "Actual (Under performing)".  This group value gets the color from GraphData1, which is blue.  The other group gets the 2nd graph data style element, with the color red.

In this case this does NOT work very well since the over performing products get the red color. We could change the color assignment for the graph data elements in the style, but there is no way to ensure the correct value is encountered first in the data.

The right way to address this issue is to use the Discrete Attribute Map feature supported by the SAS 9.3 SGPLOT procedure.  In this feature, you can define the list of group values you expect in the data, and the visual attributes (such as fill color, marker symbol, etc.) for each value.  Since the visual attributes are used based on group value, not position in the data set, the results are predictable.  Here is the graph and code using an attribute map:

SAS 9.3 SGPLOT code:

/*--define discrete attributes map data set--*/
data attrmap;
  length value $25;
  ID='A'; value='Actual (Over performing)'; fillcolor='darkgreen'; output;
  ID='A'; value='Actual (Under performing)'; fillcolor='darkred'; output;
  run;
 
/*--SAS 9.3 SG Grouped Horizontal Target Bar Chart with Attr Map--*/
ods graphics / reset width=5in height=3in imagename='TargetAttrmap_SG';
title 'Actual and Target Revenues by Product';
proc sgplot data=revenues noautolegend dattrmap=attrmap;
  hbarparm category=product response=actual / dataskin=gloss group=group attrid=A
      name='a' transparency=0.3;
  scatter y=product x=target / markerattrs=(symbol=triangleDownFilled size=10)
      name='t' legendlabel='Target' discreteoffset=-0.35 transparency=0.4;
  xaxis offsetmin=0 label='Sales (Sum)';
  yaxis display=(nolabel);
  keylegend 'a' 't'/ title='';
  run;

Full SAS 9.3 SGPLOT program:  BarTarget_SG


Multiple Classifiers vs Small Multiples

Often we have the need to see the data by two different classifiers at the same time, as requested by a recent query on the SAS Communities page.

In this example I have simulated a response over time for patients by study and treatment.  We want to create series plots over time for each "id", where the color of the series represents the treatment (Blue for Drug A and Red for Drug B) and the line pattern represents the study.

Here is the graph and the SAS 9.3 GTL code.  Click on graph for bigger view.

SAS 9.3 GTL Template Code:

proc template;
  define statgraph PowerSeries;
    begingraph;
      entrytitle 'Risk by Study and Treatment';
      layout overlay / xaxisopts=(griddisplay=on)
                       yaxisopts=(griddisplay=on linearopts=(viewmin=2)
                                  offsetmax=0.1 offsetmin=0.1);
	seriesplot x=week y=risk / group=id  linecolorgroup=trt
          linepatterngroup=study lineattrs=(thickness=2) name='a';
        discretelegend 'a' / title='Trt:' type=linecolor location=inside
          halign=left valign=bottom valueattrs=(size=7) opaque=true;
        discretelegend 'a' / title='Study:' type=linepattern location=inside
            halign=right valign=bottom valueattrs=(size=7) opaque=true;
      endlayout;
    endgraph;
  end;
run;

The key element here is the use of the SERIESPLOT statement that supports the options LINECOLORGROUP and LINEPATTERNGROUP  with SAS 9.3.  This feature is supported only for Series plot and was specially created for use by the POWER procedures.  The following conditions are necessary for this to work:

  • The GROUP variable is used to connect the points of a series.
  • The LINECOLORGROUP variable is used to assign the color (only).
  • The LINEPATTERNGROUP variable is used to assign the line pattern (only).
  • The GROUP variable must have the smallest granularity to ensure connectivity for each plot.
  • The DISCRETELEGEND has an option (TYPE) to draw only one aspect of the line or marker.

The graph above is the right choice when all the curves need to be compared to each other.

Full SAS 9.3 Code:  MultiGroup_93

 

Small Multiples:  If the comparisons to be made are among different treatments across the studies (independently), an alternate way is possible using the idea of "Small Multiples" as popularized by Edward Tufte.   This also helps reduce some of the clutter.

Here is the graph and SAS 9.2 SGPANEL Procedure code.

Note, in this graph, each study is shown in its own cell.  The Y data range for each cell comes only from the data that falls in that cell.  Comparison of data between studies is harder, as the overlap in Y values between Study 1 and Study 2 is not seen.

SAS 9.2 SGPANEL Procedure code:

title 'Risk by Study and Treatment';
proc sgpanel data=study;
  panelby study / layout=rowlattice onepanel novarname uniscale=column spacing=20;
  series x=week y=risk / group=trt lineattrs=(thickness=2);
  rowaxis grid;
  colaxis grid;
  run;

Full SAS 9.2 code:  SmallMultiples_92


Forest Plot with SAS 9.3

OK, I promise this is the last article on Forest Plots (at least for a while).

In the previous article on Subgrouped Forest Plot with Font Attributes, I discussed how to use bold text for subgroup headings.  I mentioned that increasing the font size would not work as it would misalign the subgroup values from the headings.

Clearly, the result shown in the link above is less than ideal, and I have not yet been able to come up with a better way to do this using SAS 9.2.   If you have a better solution, I am sure we all would be very interested to hear of it.

But with SAS 9.3, we can do better because we can use the new HighLowPlot statement.  This statement supports a LowLabel and/or HighLabel.  In this case, the label is drawn starting from the appropriate end of the plot line.  So, all the contortions needed for the MarkerCharacter solution with SAS 9.2 above can be avoided.

Forest Plot using SAS 9.3 HighLowPlot.  Here is the graph.  Click on it for a bigger view:

Note the following improvements in this graph:

  • The subgroup heading and values use the same font family as the rest of the columns.  It is not necessary to use a non-proportional font.
  • The subgroup headings use a bigger font size and bold weight.
  • The subgroup values are indented.
  • The error bars do not have serifs.

In this graph, we have used the HighLowPlot to draw the subgroup headings and values.   The actual drawing of the high low line itself is hidden by using line thickness of zero.  The other text strings are still drawn using ScatterPlot (with MarkerCharacter), but that too can be changed to HighLowPlot if we need left or right aligned strings.   The error bars for the Hazard Ratio are also drawn using the HighLowPlot.

With SAS 9.3, the ScatterPlot itself also has a new option to draw DataLabels with a DataLabelPosition of LEFT | RIGHT, etc.  So, we could have used that for these labels.  But there are some interesting interactions with axis offsets that need to be considered.  So, at this time, HighLowPlot is preferred.

GTL code snippet for the subgroup labels:

highlowplot y=obsid low=zero high=zero / highlabel=heading
    lineattrs=(thickness=0) labelattrs=(size=7 weight=bold);
highlowplot y=obsid low=zero high=one / highlabel=subgroup
    lineattrs=(thickness=0);

Forest Plot without horizontal bands:

Full SAS 9.3 Code:  ForestPlot_93


Subgrouped Forest Plot with Font Attributes

Just a few days ago our "super-duper tech support trooper" called in asking for the link to the subgrouped Forest Plot with bold headings.  She was referring to this Forest Plot with Subgroups  I had posted earlier.  However, as you can see, while the subgroup values are indented from the subgroup headers, the headers do not have bold fonts.

Well, I happily informed her that we had anticipated just such a need, and we have built into SAS 9.4 a new statement (AXISTABLE) which will make subgroup indentations and font attribute control much easier.  Surprisingly, she was not as impressed with this information as I had thought, asking instead "What can you do for me now?".   Some people are so hard to please...

I promised her I will take another look, and so I did.  Looking back over the code for the graph, we see that both the subgroup headers and values are in the same column drawn by one scatter plot statement.   So they all can have only one setting for text attributes.

This led to the idea that if we want the headings to have bold attributes, they will need to be in a separate column from the values.  Then, we can use two separate scatter plot statements to overlay the strings with different text attributes.  Of course, only one of these two columns will have a non-missing string per observation.
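Splitting the single text column in two might look like the sketch below.  The data set names forest and forest2 and the rule that ID=1 marks a heading row are assumptions for illustration; the attached program may identify heading rows differently.

```sas
/*--Sketch only: move heading text into its own column--*/
data forest2;
  set forest;                  /* hypothetical input data set */
  length heading $40;
  if id=1 then do;             /* assumed: id=1 flags a subgroup heading row */
    heading=subgroup;
    subgroup=' ';
  end;
run;
```

After this step each observation has text in either HEADING or SUBGROUP, never both, so the two overlaid scatter plots never draw on top of each other.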

Here is the result.  Click on graph to see bigger version:

Here is the GTL code snippet that makes this possible:

  scatterplot y=obsid x=zero / markercharacter=subgroup
              markercharacterattrs=(family='Lucida Console' size=6);
  scatterplot y=obsid x=zero / markercharacter=heading
              markercharacterattrs=(family='Lucida Console' size=6
              weight=bold);

Note, in the code above, the second scatter plot uses MARKERCHARACTER=HEADING and WEIGHT=BOLD.  The full code is attached at the end of the post.

Could we also make the font for the headings bigger?  Unfortunately, not with this technique as that will cause the two strings to be misaligned.  But, it is possible to change the color of the headers to provide a bit more of a distinction:

One other possibility is to use the "ID" column as a group variable for the single scatter plot statement, thus causing the headers and values to be drawn using two group attributes.  The results are less than satisfactory.

This kind of "creative" use of the plot statements to draw text strings that are aligned with plot data in the graph led to development of the new AXISTABLE statement for SAS 9.4.  This statement is designed to make such graphs easier.  We are very excited about the possibilities that this statement opens up, and we will write up many articles using AXISTABLE as we get closer to the release date for SAS 9.4.

Full SAS 9.2 Program:  ForestPlot_92


Sorting out your panelled graphs (part 2)

In this final post for 2012, I would like to finish up the panel sorting topic with a discussion on sorting the panel cells by statistic. With this sort, the response or dependent data in each cell is calculated down to a single statistic value (mean or median, for example). These values are used to sort the cells in ascending or descending order. There are two primary benefits of this type of sort:

  1. The person reading the graph can quickly determine the highest and lowest classification values.
  2. When used in conjunction with axis sorting, abnormalities in the data are easier to spot.

To demonstrate both of these points, I am going to create a panelled dot plot using a classic data set involving barley yields in Minnesota (http://stat.ethz.ch/R-manual/R-devel/library/lattice/html/barley.html). In the example below, I subsetted the data set to the year 1931 for clarity. Notice that the cells are displayed in descending mean order and that the response values are sorted in descending order to add further clarity (this option is available in SAS 9.3 and greater). Also notice that the "Trebi" variety of barley at the "Morris" site seems to stand out as an odd data point, warranting further investigation. Without this sorting, that anomaly could have easily been missed.

The first step to create this example is to determine the correct order for the class values. The SUMMARY procedure is used to do the actual mean calculations. Then, the RANK procedure is used to create a rank column based on the mean values. If you want the cells in descending order (as in this example), be sure to use the DESCENDING option on PROC RANK to rank the values from high to low.

proc summary data=t.barley nway;
where year=1931;
class site;
var yield;
output out=cellstats mean=cell_means;
run;
 
proc rank data=cellstats descending out=rankings;
var cell_means;
ranks order;
run;

The next step requires that you create a format that associates the rank value back to the class value (similar to what we did in part 1).

proc format;
value mean_order 2 = "Crookston" 
                 4 = "Duluth" 
                 6 = "Grand Rapids" 
                 5 = "Morris" 
                 3 = "University Farm" 
                 1 = "Waseca"
;
run;
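If you would rather not type the rank values by hand, the same format could be generated from the RANKINGS data set with the CNTLIN= option of PROC FORMAT.  This is a sketch, assuming SITE is a character variable and that the ranks have no ties; the data set name fmt is made up for the example.

```sas
/*--Sketch only: build the mean_order format from the PROC RANK output--*/
data fmt;
  set rankings(keep=site order rename=(order=start site=label));
  retain fmtname 'mean_order' type 'N';   /* numeric format */
run;

proc format cntlin=fmt;
run;
```

This keeps the format in step with the data if the WHERE clause or the statistic changes.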

Next, you match-merge the rank dataset with the original data to create a merged data set that we will use with PROC SGPANEL.

proc sort data=rankings; by site; run;
proc sort data=t.barley (where=(year=1931)) out=barley; by site; run;
data merged;
merge barley rankings;
by site;
run;
proc sort data=merged; by order; run;

Finally, the SGPANEL procedure renders the graph. Notice that the rank column "order" is used on the PANELBY statement. The user-defined format associated with that column turns the rank values back into the original class values.

ods graphics / height=1300px width=480px;
title "Minnesota Barley Yields";
title2 "where year=1931";
proc sgpanel data=merged;
format order mean_order.;
panelby order / layout=rowlattice uniscale=column onepanel novarname;
dot variety / response=yield categoryorder=respdesc;
run;

As mentioned in the previous post, the new SORT option on the PANELBY statement in PROC SGPANEL for SAS 9.4 will make this example much easier to create. In fact, all that will be needed for this example in SAS 9.4 will be to run PROC SGPANEL.

Happy new year, everyone!
