Graph size for presentations

At SAS Global Forum, and again at PharmaSUG, we had the pleasure of attending many papers and presentations on various topics that included graphs in the PowerPoint decks or in the papers.  More often than not, the graphs sit alongside other text and occupy a small part of the screen or page.  These presentations include graphs created by ODS Graphics, either as automatic graphs from Base and STAT procedures, or by SG procedures.

Often, the natural inclination is to fit the default sized graphs created by the procedure into a smaller space in the presentation.  All graphs (with very few exceptions) created by ODS Graphics have a size of 640 x 480 pixels.  This is the same as 6.66 x 5 inches at the default 96 dpi.  When a graph like this is inserted into a smaller space in a PowerPoint slide or in one column of a Word doc, the graph is shrunk to fit the space.  In such cases, a default size graph would look like this:

The above graph is a screen capture from a Word document.  When a 6.66 inch wide graph is shrunk to fit a 3.25 inch wide space, everything is scaled down by a factor of (3.25 / 6.66) = 0.488.  As we can see, all the elements of the graph are scaled down, including the text, so an 8pt font is now effectively displayed at about 4pt, which is unreadable to many eyes.  A similar effect can be seen to varying degrees in various papers presented at SGF 2012.

My solution is to render the graph at its expected size on the page using a high dpi.  In a two column document, each column is about 3.25 inches wide.  So, I render the graph with a width of 3.25 inches at 300 dpi.  When this graph is inserted into a 3.25 inch wide region, we get this:

Here are the options you need to get this:

ods listing image_dpi=300;
ods graphics / width=3.25in;

As you can see, now the graph fits "as is" in the space, and the text elements are clearly readable.  This effect is by design, and not by accident.  Internally, the default "design" size of all ODS graphs is 640 x 480 pixels.  At the default dpi of 96, this translates to a graph of about 6.66 x 5 inches.  However, when this graph is rendered at a smaller "render" size of 3.25 inches, all elements of the graph are scaled by a non-linear factor = (3.25 / 6.66) ** 0.25 = 0.83.  This is done by the procedure, so now the 8pt font is rendered at about 6.6pt, which is much more readable.  The DPI value of the image is stored in the PNG file, so PowerPoint and Word know how to display the image correctly.
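
The two scale factors described above can be compared with a small DATA step.  This is only an illustrative sketch of the arithmetic: the 0.25 exponent is taken from the description above, and the variable names are made up for this example.

```sas
/* Sketch: compare linear image scaling with the non-linear graph   */
/* scaling described in the text. Variable names are hypothetical.  */
data _null_;
  design_w  = 640 / 96;             /* default design width, inches  */
  render_w  = 3.25;                 /* target width on the page      */
  linear    = render_w / design_w;  /* what Word does when shrinking */
  nonlinear = linear ** 0.25;       /* exponent as stated in text    */
  font_linear    = 8 * linear;      /* effective size of an 8pt font */
  font_nonlinear = 8 * nonlinear;
  put linear= 5.3 nonlinear= 5.3;
  put font_linear= 4.1 font_nonlinear= 4.1;
run;
```

The values are written to the SAS log, so you can try other render widths by changing render_w.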

Here is another example of a QTc graph, inserted into a 3.25 inch wide space, with a graph rendered using default settings, and another graph rendered with a width of 3.25 inches at 300 dpi.

Default rendering:

Rendered with width=3.25in and 300 dpi: 

Note:  There are two types of scaling going on in this situation.

  1. Linear image scaling is done by Word to fit an image into the space available.
  2. Non linear graph scaling is done by ODS to render a design-size graph into render-size image.

In the examples above, the 1st case uses only image scaling to squeeze a 6.66 inch graph into a 3.25 inch space.  In the 2nd case, we use only graph scaling as we rendered the graph to the right width of 3.25 inches, so no image scaling was applied.

Sometimes, when a graph is very busy, using a width of 3.25 inches can make the fonts relatively too big.  You can use a combination of the image scaling and graph scaling to achieve an intermediate result.  If you render the graph to a width of 4 inches, and then add it to a 3.25 inch space in the doc, here is what you get:

A 4 in wide graph at 300 dpi in 3.25 inch space:

In this case the font sizes are somewhere between the first two examples, and imho, well balanced.  You can use this technique to fine tune the graph for maximum quality and readability.  These graphs are rendered using SAS 9.3, but this scaling technique also applies to SAS 9.2 graphs.
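
For the intermediate case, the settings would presumably be the same two statements shown earlier with only the width changed.  This is a sketch based on the description above, not a copy of the full program:

```sas
/* Sketch: render at 4 inches wide and let Word shrink the image */
/* the rest of the way into the 3.25 inch column.                */
ods listing image_dpi=300;
ods graphics / width=4in;
```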

Full SAS code: Graph_Size_SAS93

PharmaSUG 2012 update

The PharmaSUG 2012 conference drew to a close today, concluding two and a half days packed with papers, presentations, posters, hands-on demos and super demos by SAS staff.  While the weather outside was a bit chilly from time to time, the conference was hopping with many user papers on how to use graph procedures to create innovative graphs.

It was good to see that a large majority of users have now moved to SAS 9.2 or a later release, with over 25% using SAS 9.3.  ODS Graphics, SG Procedures, GTL and Designer are now mainstream.  Some interesting items were as follows:

Poster on Designer wins 1st prize.   Kevin Lee of Cytel presented a poster titled "Graphs Made Easy with ODS Graphics Designer".  He showed ways to create graphs with Designer, where he created the graphs interactively, and then included the GTL code generated by Designer as part of the documented process.

Personalized Risk Graphs using KPI:   Janet Gruber investigated the usage of different KPI graphs to visualize individual risk of developing Type-2 Diabetes.  Clinicians preferred the usage of vertical thermometer KPIs.  Janet and her team created their KPI graphs using the GKPI procedure.   See my article on Dashboard Graphs using GTL.

Water Fall Charts for Oncology Trials:  Niraj Pandya shows how to use waterfall charts to plot tumor sizes.  This graph was created using SGPLOT VBAR statement, with innovative usage of an overlaid VLINE statement to add annotations (marker shapes) to the top of some of the observations.

Dr. Mat Soukup and Dr. Stephen Wilson of FDA presented the opening session keynote address on "Doing IT Better".  Mat stressed the view that graphical presentation of the voluminous data was key to easier understanding of the findings by the reviewers.  Mat has been involved in a project to define a set of graphs suitable for the analysis of safety data, and these graphs are available on the CTS repository website.  It is good to see that many graphs created using ODS Graphics shown on the SAS Support Clinical Graphs web page are included here.

Patient Profile Graphs:  There was keen interest by the audience in Patient Profile graphs, Forest Plots, Adverse Event Graphs and more.  I forwarded copies of Dan O'Connor's Patient Profile Macro to quite a few users.

I presented a 1/2 day seminar on Clinical Graphs using SG Procedures that was well attended.  The audience was very animated, with lots of questions.  The Super Demos  on Clinical Graphs, SG Procedures, GTL and Designer were very popular.

'Unbox' Your Box Plots

At the 2012 SAS Global Forum, one of the questions from a user was about showing the original data used for the box plot. While you can use outliers in conjunction with the box features to get a feel for the data, for some situations you may need to see exactly what the data looks like, in relation to the boxes.

Since there could be many data points for a given categorical value, some sort of jittering would be needed to “un-clump” the points. Shown below is what you get by overlaying the raw data and the box plot without any jitter:

Raw Data with Box overlays (No Jitter)

One solution for this problem is given in “The Graph Template Language: Beyond the SAS/GRAPH® Procedures” by J. M. Pratt. This approach uses a categorical X axis along with a numeric X2 axis that is not displayed.

Starting with SAS 9.3, there is another way to solve this problem. We now support box plots on interval axis! Here is what this solution looks like:

Raw data (with jitter) overlaid with Box

The full program is available here.  Here are the main points in this solution:

  • Map your categories to a numeric variable.
  • Turn off the display of box outliers – we will be showing all points including the outliers.
  • Introduce a small displacement in the X coordinates of the scatter points to reduce collisions.  Ideally, this would be a true jitter which takes the degree of collision of the points into account. Here, we make do with a simpler method of adding random noise to every X coordinate.
  • Explicitly specify the X axis tick values and map them to the original category values using a format.
  • Making the scatter points slightly transparent prevents the points from overpowering the box features (mean, for example). This also helps us notice any overlapping points in spite of the added random noise.
  • The above output has the box overlaid on the scatter points. If you choose to overlay the scatter points on top of the box, you also need to force the X axis to be of type linear.
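
The data preparation steps in the list above (mapping categories to numbers and adding random noise) might look something like the sketch below.  It is not the full program linked above; the data set (SASHELP.CARS), the format name TYPEFMT, the variable names TYPENUM and XJIT, and the noise width of 0.2 are all assumptions chosen for illustration.

```sas
/* Hypothetical format that maps the numeric codes back to the */
/* original category names for the axis tick values.           */
proc format;
  value typefmt 1='Sedan' 2='SUV' 3='Sports';
run;

data cars_jitter;
  set sashelp.cars(where=(type in ('Sedan' 'SUV' 'Sports')));
  /* Map each category to a numeric value */
  select (type);
    when ('Sedan') typenum = 1;
    when ('SUV')   typenum = 2;
    otherwise      typenum = 3;
  end;
  /* Simple random noise, not a true collision-aware jitter */
  xjit = typenum + 0.2 * (ranuni(12345) - 0.5);
  format typenum typefmt.;
run;
```

The box plot would then use TYPENUM as its category variable with outliers turned off, and the scatter plot would use XJIT for its X values, with the X axis tick values set explicitly to 1, 2 and 3.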

So when you need to explode your boxes, remember this trick!

Graphs with log axis

Recently I posted an article on this blog on how to create bar charts with log response axes in response to a question by a user.  This generated some feedback suggesting that bar charts should not be used with log response axes or with a baseline of anything other than zero.  John Munoz suggested there may be other ways to better represent the user's data.

My initial goal was purely to see how such a graph could be  created using SAS software.   Following up on John's comments, I contacted the user to see what his exact use case is and why he wants to use a bar chart.  Turns out, they do need an odds ratio plot, but usage of a dot plot was not showing the data with enough clarity in the opinion of the PI.  So, they wanted to try out a bar chart or needle plot.

This user sent me sample data, and here is what the bar chart looks like along with the code.  I added the bar labels to indicate values to compensate for the log axis:

SAS 9.3 Code:

title 'OE Breast Cancer Stages by Ethnicity';
proc sgplot data=oeb_grp;
  format stage 4.1;
  highlow x=type low=min high=stage /group=ethnicity groupdisplay=cluster
          type=bar highlabel=stage clusterwidth=0.6 lineattrs=(color=grey pattern=solid);
  yaxis type=log logbase=2 max=4 offsetmin=0 grid display=(nolabel);
  xaxis display=(nolabel noticks);
  keylegend / location=inside position=topleft across=1 noborder;
  run;

In the above example, we have used the HighLow statement, with the low value set to 1.  It is also possible to use the vertical bar chart in GTL, set BASELINE=0.1  and yaxis viewmin=0.1 to create the same plot.  We could also make this a horizontal bar chart.

Since the bar chart strongly suggests the association of bar length to data value, the argument is that using a log transform, or a baseline other than zero, may misrepresent the data.  Some opinions seemed to accept a log axis as long as the usage was very clear.  It was also suggested that a dot plot may be more appropriate for such a plot with a log axis, as we are effectively looking at the positions of the markers, and not the lengths of the bars.

So, I investigated further to see how we could effectively represent such data as a Dot Plot.  The SAS 9.3 SGPLOT procedure does not support cluster grouping of dot plots on the Y axis.  While this has been addressed for SAS 9.4, I reshaped the data into a multi-response form (one response column per ethnicity) and used discrete offsets to make this plot:

Basic Dot Plot with log axis:

SAS 9.3 Code:

title 'OE Breast Cancer Stages by Ethnicity';
proc sgplot data=oeb_multi;
  format white black hispanic 4.1;
  dot type / response=white discreteoffset=-0.2 nostatlabel;
  dot type / response=black discreteoffset= 0.0 nostatlabel;
  dot type / response=hispanic discreteoffset= 0.2 nostatlabel;
  xaxis type=log logbase=2 logstyle=linear min=1 max=4 grid display=(nolabel);
  yaxis display=(nolabel noticks) offsetmin=0.2 offsetmax=0.2;
  keylegend / title='Ethnicity:';
  run;
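
The reshaping from grouped data to multi-response data mentioned above could be done along these lines.  This is a sketch under assumptions, not the code from the full program: it assumes the grouped data set oeb_grp has one row per TYPE and ETHNICITY, and that the ethnicity values are exactly White, Black and Hispanic so that they become valid column names.

```sas
/* Sketch: turn one row per type and ethnicity into one row per */
/* type with a separate response column for each ethnicity.     */
proc sort data=oeb_grp out=oeb_sorted;
  by type;
run;

proc transpose data=oeb_sorted out=oeb_multi(drop=_name_);
  by type;
  id ethnicity;   /* values become the new column names */
  var stage;      /* response value for each ethnicity  */
run;
```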

I believe one can see the concern expressed by the PI about the clarity of the display.  Not only are the default markers small, the clusters seem to blend together as they are so far away from the axis.  So, I tried some alternatives to "improve" the visual representations.  The three different alternatives along with code are shown below.  I would love to hear your comments.

Bolder markers with category bands:  The horizontal bands help to cluster the markers that belong together in one group.

SAS 9.3 Code:

title 'OE Breast Cancer Stages by Ethnicity';
proc sgplot data=oeb_multi;
  format white black hispanic 4.1;
  refline ref / lineattrs=(thickness=60 color=lightgray) transparency=0.6;
  dot type / response=white discreteoffset=-0.2 nostatlabel
             markerattrs=(symbol=circlefilled size=11) ;
  dot type / response=black discreteoffset= 0.0 nostatlabel
             markerattrs=(symbol=circlefilled size=11);
  dot type / response=hispanic discreteoffset= 0.2 nostatlabel
             markerattrs=(symbol=circlefilled size=11);
  xaxis type=log logbase=2 logstyle=linear min=1 max=4.1 grid display=(nolabel)
        offsetmin=0 offsetmax=0;
  yaxis display=(nolabel noticks) offsetmin=0.2 offsetmax=0.2;
  keylegend / title='Ethnicity:';
  run;

Dot plot with faded needles:  The needles may help the eye and their faint rendering may avoid a strong association with length (opinions?).  We used high low plot for the needles.  The dot plot  does not allow overlay of other basic plots, so we used Scatter to draw the markers.  Note:  Changing from Dot to Scatter also "unreversed" the Y axis so the relative positions of the markers have changed.

SAS 9.3 Code:

title 'OE Breast Cancer Stages by Ethnicity';
proc sgplot data=oeb_multi nocycleattrs;
  format white black hispanic 4.1;
  refline ref / lineattrs=(thickness=60 color=lightgray) transparency=0.6;
  highlow y=type low=min high=white / type=line discreteoffset=-0.2
          lineattrs=(color=lightgray pattern=solid);
  scatter y=type x=white / discreteoffset=-0.2 name='w' legendlabel='White'
             markerattrs=graphdata1(symbol=circlefilled size=11) ;
  highlow y=type low=min high=black / type=line discreteoffset= 0.0
          lineattrs=(color=lightgray pattern=solid);
  scatter y=type x=black / discreteoffset= 0.0 name='b' legendlabel='Black'
             markerattrs=graphdata2(symbol=circlefilled size=11);
  highlow y=type low=min high=hispanic / type=line discreteoffset= 0.2
          lineattrs=(color=lightgray pattern=solid);
  scatter y=type x=hispanic / discreteoffset= 0.2 name='h' legendlabel='Hispanic'
             markerattrs=graphdata3(symbol=circlefilled size=11);
  xaxis type=log logbase=2 logstyle=linear min=1 max=4.1 grid display=(nolabel)
        offsetmin=0 offsetmax=0;
  yaxis display=(nolabel noticks) offsetmin=0.2 offsetmax=0.2;
  keylegend 'w' 'b' 'h' / title='Ethnicity:';
  run;

Dot Plot with Class Labels:  Personally, I like direct labeling of curves and points whenever possible to avoid having to always look at the legend to decode the colors.   This usually works well for curves, but may also work here with sparse data.  Now we can do away with the legend.

SAS 9.3 Code:

title 'OE Breast Cancer Stages by Ethnicity';
proc sgplot data=oeb_multi nocycleattrs noautolegend;
  format white black hispanic 4.1;
  refline ref / lineattrs=(thickness=60 color=lightgray) transparency=0.6;
  highlow y=type low=min high=white / type=line discreteoffset=-0.2 highlabel=whitelabel
          lineattrs=(color=lightgray pattern=solid);
  scatter y=type x=white / discreteoffset=-0.2 name='w' legendlabel='White'
             markerattrs=graphdata1(symbol=circlefilled size=11);
  highlow y=type low=min high=black / type=line discreteoffset= 0.0 highlabel=blacklabel
          lineattrs=(color=lightgray pattern=solid);
  scatter y=type x=black / discreteoffset= 0.0 name='b' legendlabel='Black'
             markerattrs=graphdata2(symbol=circlefilled size=11);
  highlow y=type low=min high=hispanic / type=line discreteoffset= 0.2 highlabel=hispaniclabel
          lineattrs=(color=lightgray pattern=solid);
  scatter y=type x=hispanic / discreteoffset= 0.2 name='h' legendlabel='Hispanic'
             markerattrs=graphdata3(symbol=circlefilled size=11);
  xaxis type=log logbase=2 logstyle=linear min=1 max=4.1 grid display=(nolabel)
        offsetmin=0 offsetmax=0;
  yaxis display=(nolabel noticks) offsetmin=0.2 offsetmax=0.2;
  run;

We could label the actual response value instead of the class values.  I think class labels help in decoding of the data, while the positions of the markers indicate the values just fine.  As I said earlier in the article, I would be happy to hear opinions on these alternatives.

Full SAS 9.3 code:  DotPlot_V93

Axis values and hint

Getting the axis values just right generally requires some work, and the values you want can change from case to case.  One such example was discussed by Dan Heath in his post on custom axis values.  Here Dan shows the usage of non uniform axis values using the VALUES option on the axis statements.

Another usage is when you want specific values on the axis, but only within the actual range in the data set.  Say I want to make monthly graphs of average daily temperature in Albany, NY as referenced in the article on polar graphs.  Each graph has the average daily temperatures for only one month.  I want the Y axis to always use round values in increments of 5 degrees, but only span the data for that month.

By default, I will get different increments based on the range of the data for each month.  You could specify the values, but then these values will determine the range of the data on the axis.  There is no way to just specify "by=5" in the values option.

To get specific axis values you want, but only within the current data range, you can use the VALUESHINT option.  The graphs below show the average daily temperatures for the months of September and October 2011.  Note the Y axis values on each graph.

September 2011:

October 2011:

Code:

title 'Daily Average Temperature for October 2011';
proc sgplot data=temp2011;
  where month=10;
  series x=day y=temp / lineattrs=(thickness=2);
  yaxis grid values=(-20 to 100 by 5) valueshint;
  xaxis display=(nolabel);
  run;

In both programs, we use the identical setting for the YAXIS values which cover the whole expected range of temperature data along with the VALUESHINT option.  This allows each axis to have nice round tick values with an increment of 5, but only those that fall in the data range.   If we did not provide the VALUESHINT option, both axes would be forced to a range of -20 to 100.  This feature can be very useful when using the BY statement.
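
As a sketch of the BY statement case, the same axis settings could be applied once to produce a graph for every month.  This assumes the temp2011 data set contains all months; the sorted data set name is made up for this example.

```sas
/* Sketch: one graph per month, each with round tick values in */
/* steps of 5 that span only that month's data range.          */
proc sort data=temp2011 out=temp2011_sorted;
  by month;
run;

title 'Daily Average Temperature by Month for 2011';
proc sgplot data=temp2011_sorted;
  by month;
  series x=day y=temp / lineattrs=(thickness=2);
  yaxis grid values=(-20 to 100 by 5) valueshint;
  xaxis display=(nolabel);
run;
```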

VALUESHINT also works with a list of values.  In this case, we always want the axis to be in increments of 10, but with the value of 32 included if appropriate as shown below.

Code:

title 'Daily Average Temperature for January 2011';
proc sgplot data=temp2011;
  where month=1;
  series x=day y=temp / lineattrs=(thickness=2);
  yaxis grid values=(-20 -10 0 10 20 32 40 50 60 70 80 90 100) valueshint;
  xaxis display=(nolabel);
  run;

Now, we get a Y axis with the values as listed, but only those that fall within the range of the data.

Here is another example, where we are plotting (simulated) data for weight by age.  On the age axis, we want the specified age values and gridlines, but only within the actual range in the data.

Code:

title 'Weight by Age';
proc sgplot data=age;
  scatter x=age y=weight;
  xaxis grid values=(6 9 13 16 19 21 25 30 40 50) valueshint;
  yaxis grid;
  run;

Use the axis VALUES and VALUESHINT options to customize the axis to your needs.

Full SAS 9.2 Code:   TickValueHint

Bar chart with log response axis

Creating bar charts with log response axis has come up a few times in the past few days.  Before we look into how we could do this, it would be worth pointing out the considerable opinion in the blogosphere against use of log response axes for bar charts.  See BizIntelGuru and here.

Neither GTL nor the SG procedures support log response axes for bar charts.  Both include the zero value on the response axis for a bar chart.  But when all the data is positive, could this be possible?

I tried setting the min value on the axis to a value > zero, and then set the TYPE=log option (in SGPLOT).  No luck.  Neither the VBar nor the Needle statement allowed the usage of a log axis in this case.

With SAS 9.3, there is a way out using the HighLow statement.  Of course, we summarize the data using proc Means, and then we can use the HighLow bar with Type=bar to get the following graph:

Bar Chart with Log Response Axis:

 

SAS 9.3 code:

title 'Log of Mean Horsepower by Type';
proc sgplot data=carsmean2;
  format mean 4.0;
  highlow x=type low=zero high=mean / type=bar highlabel=mean;
  yaxis type=log max=1000 offsetmin=0 label='Log of Mean' grid;
  xaxis display=(nolabel);
  run;

To use HighLow in proc SGPLOT, we need a variable with a small value to represent the lower end of the bar segment.  Since the statement does not force a zero value on the axis, now we can specify Type=log on the yaxis statement.  Note, I have specified max=1000 on the yaxis just to get a feel for all the log values in this case.  HighLow plot can look like a needle or a bar, and I have set Type=Bar.  We also added a bar label at the top using the HighLabel option.
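
The summarized input data might be prepared along these lines.  This is a sketch, not the code from the full program: it assumes the source data is SASHELP.CARS and that the low end of each bar is set to 1, since zero cannot be shown on a log axis.

```sas
/* Sketch: mean horsepower by type, plus a small positive value */
/* for the low end of each bar. The variable is named ZERO only */
/* to match the plot code above; its value here is assumed.     */
proc means data=sashelp.cars noprint nway;
  class type;
  var horsepower;
  output out=carsmean mean=mean;
run;

data carsmean2;
  set carsmean;
  zero = 1;
run;
```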

If you really need to use log response axis on a bar chart, it could be done as shown above.  But it would be worthwhile to consider if it should be done.

Full SAS 9.3 code:    Bar_With_Log_Axis

SAS Global Forum Monday update

On the Friday before the conference, I presented a 1 day "developer led" seminar on SG Procedures and GTL, along with a discussion of new features for SAS 9.3.  The experience was very gratifying as all users were now using SAS 9.2, and some were using SAS 9.3.  We had a lively session, with a very high level of participation by the attendees discussing SAS 9.2 and SAS 9.3 features.

At 2pm today (Monday), I presented some key noteworthy new SAS 9.3 features in SG Procedures at the "Tales of SAS 9.3: A Collection of Users" led by David Shamlin.  I was honored to be in the company of some well known SAS names such as Rick Langston, Scott Huntley, Howard Plemmons, Jason Secosky and Margaret Crevar.  I discussed new plot options, new plot statements, Annotate and Attribute Maps.

Let us start with new options:

Box Plots with Groups:

SAS 9.3 Code:

title 'Cholesterol by Death Cause';
proc sgplot data=heart;
  vbox cholesterol / category=deathcause group=sex clusterwidth=0.5;
  xaxis display=(nolabel);
  run;

Box plot on Interval Axis:

SAS 9.3 Code:

title '2009 Lab Results by Time';
proc sgplot data=intervalBoxGroup;
  format date monname3.;
  vbox response / category=date;
  xaxis type=time display=(nolabel) values=('01jan09'd to '01dec09'd by month);
  yaxis grid display=(nolabel);
  run;

Box Plot on Interval Axis with Cluster Groups:

SAS 9.3 code:

title '2009 Lab Results by Time and Treatment';
proc sgplot data=intervalBoxGroup;
  format date monname3.;
  vbox response / category=date group=drug groupdisplay=cluster;
  xaxis type=linear display=(nolabel) values=('01jan09'd to '01dec09'd  by month);
  yaxis grid display=(nolabel);
  run;

Scatter and Series Plot with Cluster Groups:

SAS 9.3 code:

title 'Mean of QTc Change from Baseline';
proc sgplot data=QTc_Mean_Group;
  format week qtcmean.;
  scatter x=week y=mean / yerrorupper=high yerrorlower=low group=drug
          groupdisplay=cluster clusterwidth=0.5 markerattrs=(size=7 symbol=circlefilled);
  series x=week y=mean / group=drug groupdisplay=cluster clusterwidth=0.5;
  refline 26 / axis=x;
  refline 0  / axis=y lineattrs=(pattern=shortdash);
  xaxis type=linear values=(1 2 4 8 12 16 20 24 28) max=29 valueshint;
  yaxis label='Mean change (msec)' values=(-6 to 3 by 1);
  run;

 Full SAS 9.3 Code:  NewPlotFeatures

Just a legend, please

Recently, an interesting question was posed on the previous article on this blog by a reader.  Can we use the new DiscreteAttrMap feature to create just a legend with specific entries, with no graph.  The question was intriguing enough that I did not wait to ask - "Why?".  I just got busy coding just to see if it can be done.

Turns out, yes you can do this using GTL.  In the example below, I have defined a DiscreteAttrMap with three values, each having a Line and a Marker representation.  The DiscreteLegend uses the DiscreteAttrmap only.   However,  there must be at least one plot statement for output to be created.   So, we have added a simple scatter plot and made the markers of zero size.  Also, we have to prevent the drawing of the axes and the wall.

Here is the template for this case.

V9.3 GTL Code:

proc template;
  define statgraph DiscreteAttrMapOnly;
    dynamic _type;
    begingraph;
      discreteattrmap name="Product";
        value "Chairs" / markerattrs=(symbol=circlefilled color=blue) 
                            lineattrs=(color=blue thickness=2);
        value "Tables" / markerattrs=(symbol=trianglefilled color=green) 
                            lineattrs=(color=green thickness=2); 
        value "Lamps"  / markerattrs=(symbol=squarefilled color=red) 
                            lineattrs=(color=red thickness=2);    
      enddiscreteattrmap;
 
      layout overlay / walldisplay=none yaxisopts=(display=none)
                       xaxisopts=(display=none);
        scatterplot x=height y=weight / markerattrs=(size=0);
        discretelegend "Product" / type=_type across=3 location=inside 
                       halign=center valign=center displayclipped=true;
      endlayout;
    endgraph;
  end;
run;

Running this template with _type='Marker' creates the marker legend.  Code and output are shown below:

ods graphics / reset noscale noborder maxlegendarea=100 
               width=4in height=0.6in imagename='DiscreteAttrMapOnly_Marker';
proc sgrender data=sashelp.class template=DiscreteAttrMapOnly;
dynamic _type='Marker';
run;

Running this template with _type='Line' creates the line legend.  Code and output are shown below:

ods graphics / reset noscale noborder maxlegendarea=100 
               width=4in height=0.6in imagename='DiscreteAttrMapOnly_Line';
proc sgrender data=sashelp.class template=DiscreteAttrMapOnly;
dynamic _type='Line';
run;

Note, the DiscreteAttrMap can only be used in the DiscreteLegend when TYPE is specified.  So, we can get just lines or just markers in the legend, but not both at the same time.

I believe the user wanted to do this to separate the legend from the graph, and just place the legend in a separate place.  Regardless of the actual usage, it was an interesting exercise.

It is likely that you could also do this by defining stand-alone legend entries and adding these to a DiscreteLegend.
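
One way the stand-alone entries might be defined is with the GTL LEGENDITEM statement.  The sketch below has not been verified against the example above; the template name and item names are made up, and it keeps the same zero size scatter plot so that output is still created.

```sas
/* Sketch: stand-alone legend entries, each with a line and a  */
/* marker. Template and item names are hypothetical.           */
proc template;
  define statgraph LegendItemsOnly;
    begingraph;
      legenditem type=markerline name="chairs" / label="Chairs"
                 markerattrs=(symbol=circlefilled color=blue)
                 lineattrs=(color=blue thickness=2);
      legenditem type=markerline name="tables" / label="Tables"
                 markerattrs=(symbol=trianglefilled color=green)
                 lineattrs=(color=green thickness=2);
      legenditem type=markerline name="lamps" / label="Lamps"
                 markerattrs=(symbol=squarefilled color=red)
                 lineattrs=(color=red thickness=2);

      layout overlay / walldisplay=none yaxisopts=(display=none)
                       xaxisopts=(display=none);
        scatterplot x=height y=weight / markerattrs=(size=0);
        discretelegend "chairs" "tables" "lamps" / across=3
                       location=inside halign=center valign=center;
      endlayout;
    endgraph;
  end;
run;
```

If this works as expected, each entry would show both the line and the marker, which the attribute map approach above cannot do.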

Full SAS 9.3 Code:  DiscreteAttrMapLegendOnly

Hi ho, hi ho, it's off to SAS Global Forum we go

SAS Global Forum 2012 at Orlando, Florida is just round the corner and we are excited to see so many presentations offered by users on SG procedures and GTL.   We'll add a few more on new SAS 9.3 features of SG procedures and GTL.   These include cluster groups for discrete and interval axes, cluster grouped box plots on discrete and interval axes, bubble plot, waterfall chart, attribute maps, annotation and much more.

I will kick off the festivities with a pre-SGF "developer led" 1-day seminar on Friday, April 20, on new features in SG  procedures and GTL features.   Prashant Hebbar will present a paper on Tuesday on using GTL in unexpected ways to create graphs that go beyond the original intent.   Also on Tuesday morning, I will lead a hands-on-workshop on using ODS Graphics Designer for fast graphs.  If you have never used Designer, come on by and see for yourself how this incredible application works (OK, so I am a bit biased :-) ).  Additionally, Dan Heath, Prashant, Scott and I will present multiple super demos on these topics.

If you are at SAS Global Forum 2012, be sure to come by the Data Visualization station in the SAS demo room where we will be on hand to show you all the cool features.  We look forward to hearing directly from you about your use cases and your needs for graphics.

And yes, I can hear you thinking "Where's the Graph?"  Believe you me, we'll have plenty to share from SGF next week.  In the meantime, please see my article Graphs are easy with SAS 9.3.

See y'all in Orlando.

Bar chart on interval axis

Recently, a user asked about creating a Bar Chart of Value by Date, where the dates are displayed on a scaled interval axis.   Consider this simulated data set of value by date and treatment shown below.  This data set only has one value for each date and treatment combination.

We can use the VBAR statement in the SGPLOT procedure to create a bar chart of Value by Date and Treatment.  The VBAR statement always treats the category variable as discrete.  With SAS 9.3 SGPLOT, we get the following graph:

SAS 9.3 SGPLOT code:

title 'Lab Values by Time and Treatment';
footnote j=l 'Bar Chart on Discrete Axis';
proc sgplot data=Lab_Trt;
  refline 1 1.5 2 / lineattrs=graphgridlines;
  vbar date / response=value group=Drug groupdisplay=cluster barwidth=0.9;
  xaxis discreteorder=data display=(nolabel);
  yaxis label='Value (/ULN)';
  run;

We have used the VBAR statement with Date as the required variable, Response=Value, Group=Drug and GroupDisplay=Cluster to get the side-by-side placement of the bars for each drug.  Note, we have intentionally set BarWidth=0.9 to get a small gap between the treatment values.  As expected, the dates are positioned as discrete values, and the time scaling is lost.

How can we create a bar chart where the X axis values are displayed on a scale of time, and not as discrete values?   With SAS 9.3, you can use the Needle plot, which will treat the X axis as interval data.  Setting line thickness=5 pixels, we get this graph:

SAS 9.3 SGPLOT code:

title 'Lab Values by Time and Treatment';
footnote j=l 'Needle Plot on Interval Axis';
proc sgplot data=Lab_Trt;
  refline 1 1.5 2 / lineattrs=graphgridlines;
  needle x=date y=value / group=Drug groupdisplay=cluster lineattrs=(thickness=5);
  xaxis discreteorder=data display=(nolabel);
  yaxis label='Value (/ULN)';
  run;

In this graph, we have used the Needle plot with X=Date, Y=Value, Group=Drug, and GroupDisplay=Cluster.  This creates the above graph, where the X axis is displayed as a scaled time axis, and each treatment is displayed side-by-side.  The default needle thickness is 1 pixel, so we have to guess at a good thickness value.  In this case, 5 pixels seems to work.  However, this is just a guess, and it is not scalable.

Another way is to use the new HighLow plot included with the SAS 9.3 release of the SGPLOT procedure.  Here is the graph:

SAS 9.3 SGPLOT code:

title 'Lab Values by Time and Treatment';
footnote j=l 'HighLow plot on Interval Axis';
proc sgplot data=Lab_Trt;
  refline 1 1.5 2 / lineattrs=graphgridlines;
  highlow x=date high=value low=zero / type=bar group=Drug
          groupdisplay=cluster lineattrs=(color=black);
  xaxis discreteorder=data display=(nolabel);
  yaxis label='Value (/ULN)' offsetmin=0;
  run;

In the code above, we have used the HighLow plot and set X=Date, High=Value and Low=Zero, where Zero is a variable in the data that has a value of zero.  We have set Group=Drug, GroupDisplay=Cluster and Type=Bar.  Altogether, this creates the bar chart we are looking for.
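
For readers who want to try this, a data set of the right shape could be simulated as in the sketch below.  The dates, drug names and values are invented for illustration and are not the data used for the graphs above; only the variable names match the plot code.

```sas
/* Sketch: one value per date and drug, on unevenly spaced dates, */
/* plus the constant ZERO used as the low end of each bar.        */
data Lab_Trt;
  format date date9.;
  zero = 0;
  do date = '01jan2012'd, '15jan2012'd, '01mar2012'd, '01jun2012'd;
    do drug = 'Drug A', 'Drug B';
      value = 0.5 + 2 * ranuni(2012);   /* simulated lab value */
      output;
    end;
  end;
run;
```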

The benefits of using the HighLow plot instead of Needle are:

  • Each bar looks like a "bar" with filled and outline color.
  • The width of each bar is automatically computed based on the minimum distance between the bars.

Full SAS 9.3 code:  Full SAS 93 SG Code

 
