Polar Graph

There are many situations where it is beneficial to display the data using a polar graph.  Often your data may contain directional information.  Or, the data may be cyclic in nature, with information over time by weeks, or years.  The simple solution is to display the directional or time data on the X axis of an XY plot as shown further down in the article.  But the information may not be very easy to understand in such a graph.  Such a graph was recently discussed on the SAS Communities page.

The same information can be better understood using a polar graph.  The graph on the right shows the (simulated) BC concentration by Wind Direction and Wind Speed.  The data for this graph was simulated by me using some random and trigonometric functions.  There is no real sampled or measured information in the data.

The data has the concentration of BC by wind direction and Wind speed.  In this graph I have transformed the data so that the direction (0-360 degrees) is mapped to a point around the circle, and the speed (0-25 mph) is mapped along the radius.  The BC concentration is displayed by a colored marker, and the gradient legend is displayed on the right to decode the values.

The directions are displayed using the N-S-E-W arrows, so understanding the information is easier.  Note, I have not yet added the indicators for the Wind Speed in this graph.

Here is the SAS 9.40 SGPLOT program for the graph.

title 'BC Concentration by Wind Speed and Direction';
proc sgplot data=wind aspect=1.0 noborder;
  scatter x=x y=y / colorresponse=bc markerattrs=(symbol=circlefilled)
               colormodel=(green yellow red) name='a';
  vector x=x2 y=y2 / xorigin=x1 yorigin=y1 arrowheadshape=barbed;
  text x=xl y=yl text=label / textattrs=(size=9);
  gradlegend 'a' / title='';
  xaxis display=none;
  yaxis display=none;
run;

In the data step for the graph, x and y are computed from R and Theta using the following formula.  You can see all the details in the program code linked below.

    x=r*cos(theta * PI / 180);
    y=r*sin(theta * PI / 180);
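As a minimal sketch of that data step (not the linked program): the input data set name wind_raw and the output name wind_xy are assumptions, and PI is defined here with the CONSTANT function because it is not an automatic variable.  The arrow and label columns used by the VECTOR and TEXT statements are not shown.

```sas
/* Sketch only: wind_raw with r (speed) and theta (degrees) is an assumed input */
data wind_xy;
  set wind_raw;
  pi = constant('pi');              /* define PI explicitly                 */
  x  = r * cos(theta * pi / 180);   /* degrees are converted to radians     */
  y  = r * sin(theta * pi / 180);
run;
```

With this formula theta is measured counter-clockwise from the positive X axis.  If theta is a compass bearing (clockwise from North), substituting 90 - theta for theta would put North at the top.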

A simpler method would be to just plot the Wind Speed and Wind Direction on the Y and X axes of a rectangular XY plot as suggested at the start of the article.  The result is shown on the right.  The data is exactly the same, but now we have plotted speed (R) on the Y axis and direction (Theta) on the X axis of the scatter plot.  See linked code below.

I would say the data is not as easy to understand in this presentation.  The feel for direction is lost, and also a discontinuity is created between 0 and 360.  A polar presentation is clearly more intuitive.

Note in the polar graph on top, I did not plot the indicators for the Wind Speed.  The SGPLOT procedure does not support a simple plot statement to draw circles to display the values for the speed along the radius.  Yes, we could plot the values along one axis, but that would not be so intuitive.  The circular grid lines can be drawn using  the SGANNOTATE "oval" function and that exercise is left to the motivated reader.
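For readers who want to try the annotation route, here is one possible sketch, not the author's code.  It assumes the radial axis runs from 0 to 25 mph in steps of 5 and that the circles are centered at the data origin; the data set name circles is hypothetical.

```sas
/* Sketch only: one annotation observation per speed circle (5 to 25 mph) */
data circles;
  length function $8 drawspace $12 widthunit heightunit $8
         display $8 linecolor $12;
  drop speed;
  function   = 'oval';
  drawspace  = 'datavalue';   /* x1 and y1 are in data coordinates    */
  widthunit  = 'data';        /* width and height are in axis units   */
  heightunit = 'data';
  display    = 'outline';
  linecolor  = 'lightgray';
  x1 = 0;                     /* circle center at the origin          */
  y1 = 0;
  do speed = 5 to 25 by 5;
    width  = 2 * speed;       /* diameter along the X axis            */
    height = 2 * speed;       /* diameter along the Y axis            */
    output;
  end;
run;
```

The data set would then be passed to the procedure with the SGANNO=circles option on the PROC SGPLOT statement.  Annotations do not extend the axis ranges, so the outermost circle is drawn in full only if the data already reach that radius.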

Instead, we can also make the same graph using GTL, which does support the ELLIPSEPARM statement that can be used to draw the circular grids.  Now, the circles provide the grid lines for the Wind Speed, and the values are displayed along one of the directional arrows.

Note, I have used the option ASPECT=1.0 to ensure that the display area is circular regardless of the size or aspect of the graph itself.

Full SAS 9.40 SGPLOT and GTL code:  Wind_Graph

Earlier, I had posted a similar method to Visualize the Temperature Data over Time.


Bar Chart with Descending Response

Recently, I needed to view the list of products with the highest number of defects.  I have a data set of defects reported against various products.  The data set has over 30 products, and each observation contains the product name, name of the primary support person, and other relevant details of the defect.  My goal is to produce an uncluttered graph showing only the most significant information, and to also insert additional information (such as "Support" name) in the graph.

Here is a bar chart of the defects by product.  The graph on the right shows the number of defects by all the products in the data set as a horizontal bar chart, showing also the primary support person for the product.  The entire detailed data set is provided to the procedure, and the computation of the frequencies is done by the HBAR statement itself.

While we have achieved one goal (inserting additional information into the graph), there are too many bars with small defect counts cluttering up the graph.

Here is the SGPLOT code for the graph:

title "Open Defects for Product on &sysdate";
proc sgplot data=blog.product noborder;
  hbar product / datalabel categoryorder=respdesc datalabelfitpolicy=none;
  yaxistable product support / position=left location=inside;
  xaxis display=(nolabel noline noticks) grid;
  yaxis display=none valueattrs=(size=6) fitpolicy=none;
run;

Note the bars are displayed by descending  defect counts.  As we can see, most of the defects are in the first few products, and there are too many other product names in the graph with very few defects that I do not need to see.  This also makes the graph harder to read.

In the code above, I have used the HBAR statement with the CATEGORYORDER=RESPDESC which creates this graph with the descending order of the bars.  I have also used the YAXISTABLE to display both the product name and the primary support person.

The full detailed data is provided to the procedure and the HBAR is computing the number of defects by product and arranging them by descending statistic (frequency).  So, there is no way for me to specify to the graph that I want to see only the products with more than 10 defects, as shown in the graph on the right.  This would be a nice feature to add to the procedure at some point.

To create the graph shown on the right, I have to first compute the defect count for each product using PROC MEANS.  But, I also need to carry through the "Support" column for plotting.  Note the use of the ID variable in the code below.  This allows me to get the name of the primary support person along with the product name in the output data set.  I sort the data set by the count, and then I can display the graph.  I can either show only the first 10 observations, or I could show only the bars with count > N.

My data has no numeric variables.  The MEANS procedure apparently did not like this.  So, I added a constant column "X" (=10).  This seems to have overcome the problem.

/*--Add a dummy analysis variable--*/
data product;
  set blog.product;
  x=10;
run;

/*--Compute frequencies by component, keep the primary support id--*/
proc means data=product noprint;
  class product;
  id support;
  var x;
  output out=freq(where=(_type_ > 0))
        N=N;
run;

/*--Sort data by descending frequency--*/
proc sort data=freq;
  by descending n;
run;

/*--Draw graph using HBARPARM with summarized data--*/
title "Open Defects for Product on &sysdate";
proc sgplot data=freq(where=(n>10)) noborder;
  hbarparm category=product response=n / datalabel ;
  yaxistable product support / position=left location=inside;
  xaxis display=(nolabel noline noticks) grid;
  yaxis display=none fitpolicy=none;
run;
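
As an alternative sketch (not part of the linked program), the counts can also be computed with PROC SQL, which does not need a dummy analysis variable.  The output name freq_sql is hypothetical.  MAX(support) is used because the ID statement in PROC MEANS also keeps the maximum value of the ID variable by default.

```sas
/* Sketch only: count defects per product without a dummy analysis variable */
proc sql;
  create table freq_sql as
  select product,
         max(support) as support,   /* same choice as the ID statement default */
         count(*)     as n
  from blog.product
  group by product
  having count(*) > 10              /* keep only products with > 10 defects    */
  order by n desc;                  /* descending frequency                    */
quit;
```

The result could be plotted with the same HBARPARM code by using data=freq_sql without the WHERE= data set option.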

Full SAS 9.40 SGPLOT code:  Bar_Chart


Scalable Turnip Graph

A Turnip Graph displays the distribution of an analysis variable.  The graph displays markers with the same (or close) y coordinate by displaying the markers spread out over the x-axis range in a symmetric pattern.  Recently, a question was posted on the SAS Communities page regarding such a graph.

Here is an example of display of the distribution of the data using a Turnip Graph.  In this example, the markers are "Binned" on the y-axis.  All markers in each bin are displayed symmetrically in the x direction.  The data requires the list of observations with same y value which are automatically displayed as a row of markers using the SCATTER plot with the JITTER option.

SGPLOT code for Turnip Graph:

title 'Distribution of Cholesterol by DeathCause';
proc sgplot data=turnipScatter noautolegend;
  scatter x=deathcause y=y / jitter;
  xaxis display=(nolabel);
  run;

One shortcoming of the graph above is that it does not scale well for moderately large data.  The graph above was created for a data set of about 225 observations with 4 category values.  I have intentionally reduced the data so it works for the graph above.  The markers just barely fit the space available.  As the observation count or number of categories increases, this method does not continue to provide good results.  Other methods can be used to actually compute the (x, y) of each observation, which requires much more work for a general solution.

An alternate way to view the same data, which is relatively easy to build, is shown on the right.  Instead of displaying each marker, the graph displays a "bin" that represents all the markers in the bin.  All bins in the graph are scaled by the count in each bin so it is easy to see the relative distribution of the data.  The observation count is displayed in the bin.

As you can see, this graph scales very well for all kinds of data, with small or large observation counts and for different number of categories on the x-axis.  To prepare the data, we run an SGPANEL graph with the HISTOGRAM statement using the SCALE=COUNT option and save the resulting data in a data set using the ODS OUTPUT statement.  This saves the bins and the number of observations in each bin by category.  We mirror the data by creating a "Min" column equal to the negative value of the "Count" column.
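The column names written by ODS OUTPUT are not listed here, so the following sketch swaps in a simpler binning step with PROC SQL to illustrate the mirroring.  It assumes the data come from SASHELP.HEART and uses an arbitrary bin width of 20; the names bins and turnip_bins are hypothetical, and the label columns used by the TEXT plots are not created.

```sas
/* Sketch only: bin Cholesterol in steps of 20 (assumed width) and count per bin */
proc sql;
  create table bins as
  select deathcause,
         round(cholesterol, 20) as y,   /* bin value on the y-axis          */
         count(*)               as n    /* observations in the bin          */
  from sashelp.heart
  where cholesterol is not missing and deathcause is not missing
  group by deathcause, calculated y;
quit;

/* Mirror the count around zero so each bar spans -count to +count */
data turnip_bins;
  set bins;
  max  = n;
  min  = -n;
  zero = 0;                             /* x position for the count label   */
run;
```

The min, max and zero columns correspond to the LOW=, HIGH= and X= roles used in the SGPANEL code for this graph.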

We use the SGPANEL Procedure with the HIGHLOW plot to display the distribution in a panel.  We use a TEXT plot to display the bin counts and we turn off the cell headers and use a TEXT plot to display the categories at the bottom to make this look like a single cell graph.  TEXT is better than an INSET since it can split the long values on white space.

SGPANEL code for the Scalable Turnip Graph:

title 'Distribution of Cholesterol by DeathCause';
proc sgpanel data=turnip noautolegend;
  panelby deathcause / novarname layout=columnlattice  columns=4 noborder noheader;
  highlow y=y low=min high=max / type=bar barwidth=1
                 fillattrs=(color=lightgray) lineattrs=(color=black);
  colaxis display=none;
  rowaxis min=0 offsetmin=0.15 display=(noticks noline nolabel) grid;
  text y=y x=zero text=max / strip textattrs=(size=5);
  text y=ylbl x=zero text=label / strip splitpolicy=split
          position=bottom contributeoffsets=none;
run;

To view the relative distribution, bin counts are not really necessary.  Alternative visuals are shown below.  Full code for preparing the data and for creating the graph is linked below.  I am tempted to call this the "Spark-Plug Graph" or a "Spinning Top" graph.

A "Violin Graph" can be created instead using the same data by using the BAND statement instead of HIGHLOW.

Full code for Scalable Turnip Graph:  Turnip

Infographics: Coin Stack Bar Chart

Often we see bar charts showing revenues or other related measures by a classifier using a visual of a stack of coins.  Such visuals are not strictly for the purposes of accurate magnitude comparisons, but more for providing an interesting visual to attract the attention of the reader.  In other words - Infographics.

I thought this would be a good exercise to see how we can do this using the SGPLOT procedure.  One such result is shown on the right.

I searched the web for some appropriate images of coins, anything with a perspective image of a coin that can be used to create a stack.  Then, I found a beautiful image of an antique "2-Annas" coin from British India.  The image of the coin has beautiful shine, good resolution, unusual shape and clear details that make for nice stacks of coins as shown above.

The default Bar Chart would look like the graph on the right.  While it conveys the information accurately and clearly, in some instances it is a bit boring compared to the graph above.

The data for the graph is very simple as shown on the right below the graph, and the program is shown below.

SGPLOT code for Bar Chart:

title 'Revenues (Millions) by Year';
proc sgplot data=Bar noborder noautolegend;
  vbar cat / response=resp fillattrs=graphdata1 dataskin=pressed
                    datalabel datalabelattrs=(size=12 weight=bold);
  xaxis display=(noticks noline nolabel) integer ;
  yaxis display=(nolabel noticks noline) min=0 integer grid;
run;

Now, to create the graph of the pile of coins, we need to render each coin in the stack individually, using a SCATTER plot where the marker symbol is built from the image of the coin.  We process the original data set, and generate an observation for each coin with increasing y value in the data.  Then, with the default rendering order (which is data order), the later (higher) coins are drawn over the earlier coins, thus creating a stack.

The data generated for the coin stack is shown on the right.  Note, the response value is kept only once for each category value.
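The expansion step could look something like the sketch below.  This is not the linked program: the step value (how much one coin represents), the output name coins_sketch, and the choice to keep the response on the top coin are all assumptions.  The input is assumed to be the Bar data set with one row per category (cat, resp).

```sas
/* Sketch only: expand each category into one observation per coin */
data coins_sketch;
  set bar;                           /* one row per category: cat, resp      */
  keep cat val resp;
  step   = 0.5;                      /* assumed value represented by a coin  */
  total  = resp;                     /* save the value before resp is reset  */
  ncoins = max(1, ceil(total / step));
  do i = 1 to ncoins;
    val  = min(i * step, total);     /* height of this coin in the stack     */
    resp = ifn(i = ncoins, total, .);/* keep the label on the top coin only  */
    output;
  end;
run;
```

Because the observations are written with increasing val, the data rendering order draws each coin over the one below it.  A smaller step produces more coins per stack, which suits a thinner coin image.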

We use the SYMBOLIMAGE statement to define the symbol.  One must use a "transparent" image, where the pixels outside the coin part are transparent.  We also use the JITTER option so each coin is shifted along the x-axis a bit to simulate a "real" stack.  Else, the stack will be too straight.  This jitter option works best when the x-axis values are numeric.

SGPLOT code for Coin Stack Graph:

title 'Revenues (Millions) by Year';
proc sgplot data=coins noborder noautolegend;
  symbolimage name=Coin image="&Coin";
  scatter x=cat y=val / markerattrs=(symbol=Coin size=70) jitter jitterwidth=0.03;
  text x=cat y=resp text=resp / textattrs=(size=14 weight=bold color=white)
         strip position=top backlight=0.75;
  xaxis display=(noticks noline nolabel) integer offsetmin=0.15 offsetmax=0.15;
  yaxis display=none offsetmin=0.2 offsetmax=0.2;
run;

The value of the stack is displayed on the top of the coins.  Note use of "Backlight" option to generate a darker outline around the text, so it is visible on top of the light colored coins.  Any nice image of a coin can be used.  The number of coins drawn should depend on the "thickness" of the coin in the image.  The 2-Annas coin is thin, so we need more coins.  The graph on the right is without jitter, which creates even stacks.

The Somalian Silver coin is "thicker", so we need fewer coins, as shown on the right.

Full SAS9.40M3 code:  CoinGraph


Good Graph: Magnitude Comparisons

At the 2013 SAS Global Forum, I presented a paper titled "Make a Good Graph" which reviewed some of the features that make for a good graph.  This paper presents an aggregation of ideas from various sources, including some recommendations from thought leaders in the graphics arena such as Edward Tufte, William Cleveland and Naomi Robbins.

Recently, a question was posted on the SAS Communities site asking how to create the graph shown on the right using SAS.  This graph is showing sales figures by company (and peer) by region using a bubble plot.

There are two issues here:

  • How to make such a graph?
  • Should you make such a graph?

The answer to the first one is simple.  The SAS SGPLOT procedure supports the BUBBLE statement that can create a graph like the one shown above.

On the right is one I created for the simple data in the plot using SGPLOT procedure.  It is relatively easy to make and the generated visual is mostly like the one above, with a few differences.  A more exact match can be created, but I stopped here. Here is the code.

Bubble Plot Code:

ods graphics / reset width=4in height=1.75in noborder imagename='Sales_Bubble';
title 'Sales by Region';
proc sgplot data=sales noborder noautolegend;
bubble y=Group x=Category size=value / bradiusmax=25 bradiusmin=12
group=category dataskin=pressed datalabel=value
datalabelpos=center datalabelattrs=(color=white size=8 weight=bold);
yaxis display=(noline noticks nolabel) fitpolicy=split valueattrs=(size=8);
xaxis display=(nolabel noticks noline) valueattrs=(size=8);
run;

Assuming the purpose of the graph is to better understand a company's sales vis-a-vis a peer, the second question becomes relevant.  Using the bubble plot, it is relatively hard to make accurate magnitude comparisons of sales figures between the company and its peers without the help of the numbers in the bubble.

The visual shown above would not be the best one to facilitate accurate magnitude comparisons.  It has been shown by studies on the subject that using areas for comparison of magnitude is not very effective.  A better way for such a goal would be usage of linear line segments from a common baseline.  Also, it helps  to bring the items to be compared close to each other.

The clustered bar chart on the right provides a better visual for magnitude comparisons of sales by region between company and its peer.  Putting the company and peer values adjacent allows for better comparisons which are clearly visible even without the numbers on the bars.

Bar Chart Code:

ods graphics / reset width=4in height=2.5in noborder imagename='Sales_Bar';
title 'Sales by Region';
proc sgplot data=sales noborder;
styleattrs datacolors=(darkgreen gold);
vbarparm category=Category response=value / group=Group
groupdisplay=cluster dataskin=pressed datalabel
datalabelattrs=(color=black size=8 weight=bold);
keylegend / title='';
yaxis display=(noline noticks nolabel) grid;
xaxis display=(nolabel noticks);
run;

Linear distance from common baseline along with proximity of items to be compared create a better graph.  I am thinking it would be a good idea to have a thread for topics on how to create a "Good Graph".  A bit close to "Good Grief", made famous by Peanuts.  🙂

Full SAS 9.40M3 code:  Magnitude


CTSPedia Clinical Graphs - Subgrouped Forest Plot

The advent of the AXISTABLE statement with SAS 9.4 has made it considerably easier to create graphs that include statistics aligned with x-axis values (Survival Plot) or with the y-axis (Forest Plot).  This statement was specifically designed to address such needs, and includes the options needed to control the text attributes of the data and also any indentations that may be needed.

In previous posts, I have described the use of these new statements, but it seems I did not provide a full program for the "Subgrouped Forest Plot", one of many popular clinical graphs.  Here we can use the YAXISTABLE statement available in SGPLOT for this graph.

Here is the graph I created using the SGPLOT procedure.  The details for the graph are as follows:

  • A Hazard Ratio plot in the middle.
  • Study names on the far left.  The study names are subgrouped, with label and values.  The labels have bolder font and the values are indented.
  • Number of patients with % on the left.
  • Event rates for PCI Group, Therapy Group and p-value on the right.
  • Note the use of Unicode arrow characters for the annotations on the axis created using the TEXT plot statement.  This is done using the ability to add Unicode values to a User Defined Format in SAS 9.4M3.
  • SG Annotation code is NOT used in this graph.

SAS 9.40M3 code:

title j=r h=7pt '4-Yr Cumulative Event Rate';
ods graphics / reset width=5in height=3in imagename='Subgroup_Forest_SG_94';
proc sgplot data=forest_subgroup_2 nowall noborder nocycleattrs dattrmap=attrmap noautolegend;
  format text $txt.;
  styleattrs axisextent=data;
  refline ref / lineattrs=(thickness=13 color=cxf0f0f7);
  highlow y=obsid low=low high=high;
  scatter y=obsid x=mean / markerattrs=(symbol=squarefilled);
  scatter y=obsid x=mean / markerattrs=(size=0) x2axis;
  refline 1 / axis=x;
  text x=xl y=obsid text=text / position=bottom contributeoffsets=none strip;
  yaxistable subgroup / location=inside position=left textgroup=id labelattrs=(size=7)
                      textgroupid=text indentweight=indentWt;
  yaxistable countpct / location=inside position=left labelattrs=(size=7) valueattrs=(size=7);
  yaxistable PCIGroup group pvalue / location=inside position=right pad=(right=15px)
                      labelattrs=(size=7) valueattrs=(size=7);
  yaxis reverse display=none colorbands=odd colorbandsattrs=(transparency=1) offsetmin=0.0;
  xaxis display=(nolabel) values=(0.0 0.5 1.0 1.5 2.0 2.5);
  x2axis label='Hazard Ratio' display=(noline noticks novalues) labelattrs=(size=8);
run;

I have also added this code to the CTSPedia page for Subgrouped Forest Plot.

Full SAS 9.40M3 code:  Subgrouped_Forest_Plot_SG_94

CTSPedia Clinical Graphs - Volcano Plot

A Volcano Plot is a type of scatter-plot that is used to quickly identify changes in large data sets composed of replicate data.  In the clinical domain, a Volcano Plot is used to view Risk difference (RD) of AE occurrence (%) between drug and control by preferred term.

One example of a volcano plot, P-risk Odds Ratio of Treatment Emergent Adverse Events, is contributed by Qi Jiang and is included in the list of Clinical Graphs on the CTSPedia web site.  The graph is used for safety signal screening and AE data display.  It allows investigators to evaluate AE risks using both estimates of risk difference and p-values.  Optional reference lines are added so that AEs with large RD and small p-values can be identified in the upper right corner of the plot.

I took the data from the example on the CTSPedia site and used the SGPLOT procedure to create the graph shown on the right.

The graph plots the log of the p-values by the log of the Odds Ratio by AESOC.  For this graph, I have labeled the AE terms for p-values > 0.05.  The scatter plot is grouped by AESOC, and I have set a format to display 10 characters in the legend, which is placed on the right of the plot.

title 'P-risk (Odds Ratio) Plot of Treatment Emergent Adverse Events at PT Level';
proc sgplot data=sample2;
  format aesoc $10. text $txt.;
  label p_rr='Fisher Exact p-value';
  label rr='Odds Ratio';
  scatter x=rr y=p_rr / group=aesoc datalabel=label name='a';
  refline 1 / axis=x lineattrs=(pattern=shortdash);
  refline 0.05 / axis=y lineattrs=(pattern=shortdash);
  inset ("Placebo:" = "n/N(%)=&inset1"
              "Treatment:" = "n/N(%)=&inset2") / noborder position=topleft;
  text x=xlbl y=ylbl text=text / position=bottom contributeoffsets=(ymax);
  yaxis reverse type=log values=(1.0 0.1 0.05 0.01 0.001) offsetmin=0.1;
  xaxis type=log values=(0.1 1 2 5 10) valueshint;
  keylegend 'a' / across=1 position=right valueattrs=(size=6);
run;

It was not obvious to me how the inset values were computed.  These would be computed in a data step and inserted into macro variables for display in the graph.  So I just assigned the values into macro variables and used those in the INSET statement.

Also note the use of the Unicode characters for the left and right arrow in the "Favors" labels.  We do this by making these texts as "T" and "P" in the data, but use a user defined format that includes Unicode values in the text.
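A user defined format of this kind might look like the sketch below.  The codes "T" and "P" and the format name $txt come from the code for this graph; the label text and which arrow goes with which code are assumptions, not the values from the linked program.

```sas
/* Sketch only: label text and arrow directions are assumptions */
proc format;
  value $txt
    'T' = "Favors Treatment (*ESC*){unicode '2192'x}"   /* right arrow */
    'P' = "(*ESC*){unicode '2190'x} Favors Placebo";    /* left arrow  */
run;
```

The TEXT plot then shows the formatted strings because the FORMAT statement in the procedure associates $txt with the text variable.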

If you compare with the code on the CTSPedia site, you will see the SGPLOT code is very concise and does not require use of Annotation.  This makes the graph code more robust and usable with other data.

Full SAS 9.40M3 SGPLOT code:  VolcanoPlot


CTSPedia Clinical Graphs - Heatmap of Benefit

Let us continue our review of the Clinical Graphs included in the CTSPedia repository.  Today, I noticed this Heatmap of Benefits and Risks over Time for Subjects in a study by Treatment, submitted by Max Cherny using "R" code.  I thought it would be a good exercise to see how to build this graph using SAS.  You may notice, I have already added the SAS version to the CTSPedia Heat Map page.  This graph was intentionally created to mimic the graph by Max to avoid any variability.

Here is the same graph created using the SAS SGPANEL procedure.  In this example I have softened the colors used by Max.  The graph is pretty much the same.

The full code for generating the data and rendering the graph is linked below.  You will see that most of the code is needed to generate the data set as I was unable to run Max's R code to do so.

The SGPANEL code necessary to create the graph itself is very concise, under 10 lines.

Here is the SGPANEL code:

proc sgpanel data=HeatMap ;
  format value benefit.;
  styleattrs datacolors=(cx3faf3f yellow lightgray lightred gray);
  panelby trt / novarname spacing=10 headerbackcolor=lightgray;
  heatmapparm x=week y=subject colorgroup=value / name='a';
  colaxis integer offsetmin=0 offsetmax=0 display=(noline);
  rowaxis values=(10 to 100 by 10) min=1 valueshint
                 offsetmin=0 offsetmax=0 display=(noline);
  keylegend 'a' / valueattrs=(size=7) noborder;
run;

I have used some options to set the size of various fonts to mimic the "R" look for comparison.

Full SAS9.4 SGPANEL code:  HeatMap2


Dial KPI using SGPLOT

Last week I was at PharmaSUG 2016, where I presented a 1/2 day seminar on creating Clinical Graphs using SAS.  I was gratified to have an enthusiastic audience of about 28 attendees and we had a great interactive session.  I also presented a paper on Clinical Graphs Using SAS.  More on PharmaSUG 2016 soon.

While at PharmaSUG, I read a post by a user interested in creating KPI dials of the type that can be found on the SAS Support Page.  The user wanted to know if such KPI Dials can be created using GTL.  Interestingly, a couple of attendees at the conference also expressed interest in it.  Also, Kirk Lafler demonstrated a way to create Dashboards using Base SAS software.  I figured it would be an interesting exercise to create one using the SGPLOT procedure, and that it may actually be useful for multiple applications.

The result is shown on the right.  I wanted to create something that would look like a real dial, and not with a "flatland" rendered look.  The dial on the right has three zones with ticks and values, a needle showing the current value, and the value is also displayed at the bottom of the dial.

What distinguishes this from the ordinary "flatland" graphic is the nice shiny outer chrome ring, and reflective highlights on the surface of the dial simulating a glass cover.  It may be nice to add a small window frame for the value.

The chrome ring is displayed using an image of a ring with transparent outer regions.  The inner regions need not be transparent as it will be covered with the dial details.

A second version is shown on the right.  Here I have used a pastel shade for the dial background and set the value to 23.  The intensity of the reflection map is reduced by increasing its transparency.

The full program for computing the data and the SGPLOT program is attached below.  It uses two image files, one for the dial background and one for the dial reflection map.  If you want to run the program, be sure to use images that will provide similar results.

It should be possible to write a macro to create such Dial KPIs on the fly, where you can define the number of ranges and their values and the value for the dial.  Should be doable by converting the code attached into a macro.

SAS 9.4 SGPLOT Code:  KPI_Dial_2

CTSPedia Graphs - Dot Plot of Primary SOC

CTSPedia is a valuable resource for clinical research "... initiated to form an information resource created by researchers for researchers in clinical and translational science to share valuable knowledge amongst local researchers".

This site includes a section on statistical graphs where you can find valuable information and a library of standardized graphs for the Clinical Trials industry.    The library includes many graphs, some having "R" code and some "SAS".  Many SAS graphs are now a bit dated, using SAS 9.2 features, and could be refreshed using the newer SAS 9.4 features.

One graph of interest is the Dot Plot of Primary SOC by Matt Soukup created using R.  I thought it would be a good exercise to do the same graph using SAS 9.4.

I was not able to run Matt's R program to get the data, but it was easy enough to enter the pct values by hand.  The resulting graph is shown on the right and the SAS code is below.

proc sgplot data=dot_sort noborder;
  dropline y=classification x=pct / dropto=y;
  scatter y=classification x=pct /
               markerattrs=(symbol=circlefilled);
  yaxistable classification pct / location=outside
              position=left pad=10 valuejustify=right ;
  xaxis min=0 grid offsetmin=0 label='Relative Frequency of an Event';
  yaxis fitpolicy=none valueattrs=(size=7) reverse display=none;
run;

My goal is to get as close as possible in appearance to the graph in the CTSPedia library.  A user can always further customize it.  The graph above is almost identical to the graph shown in the CTSPedia link, and easy to create using the YAxisTable.

I am thinking it would be useful to write up a thread of such "CTSPedia" graphs which can be easily searched in the blog using the keyword "CTSPedia".

I will be at PharmaSUG 2016 next week in Denver.  I look forward to seeing you there.   I am presenting a 1/2 day seminar on Saturday on "Clinical Graphs using SAS", a paper and a SuperDemo on the same topic.  Stop by and say "Hello" at the "Meet the presenters" booth if you have time.

Full SAS 9.4 SGPLOT code:  Dot_Plot_SAS
