During SGF 2012, I had conversations with many SAS users who wanted to create Forest Plots. However, there was one new twist. The study names were subgrouped by categories like 'Age', 'Sex', etc., with multiple entries under each subgroup. The name of each study within a subgroup was indented to indicate the grouping.
This also came up in a recent discussion with the folks at CTSPedia, who wanted to create a subgrouped forest plot similar to the one shown below:
The graph itself can be easily created using GTL, but the main issue was the indentations needed in the subgrouped study names. In GTL and SG Procedures, leading and trailing blanks are removed from the axis tick values and markercharacter strings. So, how do we include the indentations?
Earlier, I discussed using a non-breaking space (nbsp) for a simple Forest Plot using the SGPLOT procedure. In that article I also provided a sneak preview of this graph. It looks like this nbsp is becoming my good friend.
Here we are using nbsp in place of a regular space for both the leading and trailing blanks. To help get the indentations right in the data set, I first used a dot for all leading and trailing blanks in the study names. These are easier to see. Then, I simply replaced all dots with an nbsp ('A0'x) using the translate() function. Remember, we have to use a non-proportional font to ensure that all characters have the same width.
Here is the graph I created using the SAS 9.2 release. Note: I did not receive the actual data set for the original graph, so the values for Mean, LCL, and UCL are eyeballed from the graph above and may not be very accurate. The focus of the exercise is making the graph, given the data.
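The dot-to-nbsp step might look like the minimal sketch below. The data set name `forest`, the variable `study`, and the sample labels are made up for illustration and are not taken from the original program. Note that `'A0'x` is the nbsp in single-byte Western encodings such as wlatin1; a UTF-8 session would likely need a different byte sequence.

```sas
/* Hypothetical sample data: each dot marks a blank we want to keep. */
/* Trailing dots pad every label to the same width (14 characters). */
data forest;
  length study $14;
  input study $char14.;
  /* translate(source, to, from): swap every '.' for an nbsp ('A0'x) */
  study = translate(study, 'A0'x, '.');
  datalines;
Age...........
..<=.65.Years.
..>.65.Years..
Sex...........
..Male........
..Female......
;
run;
```

One thing to watch with this approach: translate() replaces every dot, so a label that legitimately contains a period (an abbreviation, say) would lose it as well.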
The code will be familiar to the GTL programmer. Here are the basic steps for the template:
- Use a LAYOUT LATTICE with four columns for the main graph.
- Weights for the columns are (0.23 0.07 0.4 0.3).
- The study names are displayed in the first column using SCATTERPLOT with the MARKERCHARACTER option. A non-proportional font is used to display these strings.
- The number of patients and % are shown in the second column also using the scatter plot with marker character option. In this case, a non-proportional font is not necessary.
- The Hazard Ratio plot is shown in the third column with custom x axis tick values and label.
- PCI, Group and p-values are shown in the last column.
- For arranging the headers correctly, I used another 2x4 lattice, with slightly different weights and populated each cell with the string needed.
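The steps above can be sketched roughly as the template below. This is not the program linked at the end of the article: the template name and all column names (`obsid`, `zero`, `study`, `countpct`, `mean`, `lcl`, `ucl`, `pvalue`) are assumptions, the tick values are placeholders, the header lattice is left out, and only one of the three text columns in the last cell is shown.

```sas
/* Sketch only: all variable names here are assumed, not the original ones. */
proc template;
  define statgraph ForestSketch;
    begingraph;
      /* four columns with the weights from the article; */
      /* rowdatarange=union keeps the rows aligned across cells */
      layout lattice / columns=4 columnweights=(0.23 0.07 0.4 0.3)
                       rowdatarange=union;

        /* Column 1: indented study names in a non-proportional font */
        layout overlay / walldisplay=none xaxisopts=(display=none)
                         yaxisopts=(reverse=true display=none);
          scatterplot y=obsid x=zero / markercharacter=study
                      markercharacterattrs=(family='Courier New');
        endlayout;

        /* Column 2: number of patients and %, any font will do */
        layout overlay / walldisplay=none xaxisopts=(display=none)
                         yaxisopts=(reverse=true display=none);
          scatterplot y=obsid x=zero / markercharacter=countpct;
        endlayout;

        /* Column 3: hazard ratio with custom x axis ticks and label */
        layout overlay / yaxisopts=(reverse=true display=none)
                         xaxisopts=(label='Hazard Ratio'
                           linearopts=(tickvaluelist=(0.0 0.5 1.0 1.5 2.0 2.5)));
          scatterplot y=obsid x=mean / xerrorlower=lcl xerrorupper=ucl
                      markerattrs=(symbol=squarefilled);
          referenceline x=1;
        endlayout;

        /* Column 4: p-values; PCI and Group would be added the same way */
        layout overlay / walldisplay=none xaxisopts=(display=none)
                         yaxisopts=(reverse=true display=none);
          scatterplot y=obsid x=zero / markercharacter=pvalue;
        endlayout;

      endlayout;
    endgraph;
  end;
run;
```

The template would then be rendered with PROC SGRENDER against a data set that holds those columns, with `zero` set to a constant so that every text string is drawn at the same x position.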
In such cases where the plot and data are aligned horizontally across a wide graph, it is helpful to provide a guide to the eye to keep things lined up across the page. Something similar to the old 132-character line printer page with the green bands is helpful. So, I used a trick to draw wide grid lines behind alternate blocks of observations. I also added a background color for the headers. Here is the graph:
In the graph above, the bands help the eyes track the data across the wide page. I used the scatter plot with the marker character option to do all the textual columns, including the study names on the left. This allowed me to put the shaded bands behind the full width of the graph.
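One way to get such bands is to draw very thick reference lines at the rows that should be shaded. The sketch below is a guess at the idea rather than the original code: the data set `bands`, the column `ref`, the block size of three rows, the thickness, and the color are all assumptions, and it relies on REFERENCELINE accepting a column of values in your release.

```sas
/* Hypothetical data: obsid is the row position; ref is non-missing */
/* only for the rows that should sit on a shaded band. */
data bands;
  do obsid = 1 to 12;
    /* shade alternate blocks of three rows: 1-3 off, 4-6 on, and so on */
    if mod(floor((obsid - 1) / 3), 2) = 1 then ref = obsid;
    else ref = .;
    output;
  end;
run;

proc template;
  define statgraph BandSketch;
    begingraph;
      layout overlay / yaxisopts=(reverse=true display=none);
        /* drawn first so the thick lines sit behind the markers; */
        /* the thickness must be tuned to the row height of the graph */
        referenceline y=ref / lineattrs=(thickness=20 color=cxf0f0f0);
        scatterplot y=obsid x=obsid;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=bands template=BandSketch;
run;
```

Because the thickness is given in absolute units, the bands only line up for one graph height; changing the number of rows or the output size means adjusting it again.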
An earlier version of this same Forest Plot is posted on the CTSPedia page. In this graph, the study names are Y axis tick values. The bands (using reference lines) cannot extend under the Y axis tick values.
Using SAS 9.3, the serifs for the error bars can be eliminated by using a HIGHLOWPLOT statement to plot the confidence interval.
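A minimal sketch of the hazard ratio cell using HIGHLOWPLOT might look like this. It requires SAS 9.3 or later, and the template name and column names (`obsid`, `mean`, `lcl`, `ucl`) are assumptions rather than the names in the linked program.

```sas
/* SAS 9.3 or later; variable names are assumed for illustration. */
proc template;
  define statgraph HazardSketch;
    begingraph;
      layout overlay / yaxisopts=(reverse=true display=none)
                       xaxisopts=(label='Hazard Ratio');
        /* horizontal interval from lcl to ucl, drawn without end caps */
        highlowplot y=obsid low=lcl high=ucl;
        /* the point estimate goes on top of the interval */
        scatterplot y=obsid x=mean / markerattrs=(symbol=squarefilled);
        referenceline x=1;
      endlayout;
    endgraph;
  end;
run;
```

Here the interval and the marker are two separate statements, so the error bar options on SCATTERPLOT are not needed at all.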
Full SAS 9.2 program: ForestPlot_92
Full SAS 9.3 program: ForestPlot_93