# Compact Scatter Plot Matrix

The scatter plot matrix is a great tool that gives a quick visual overview of potential associations between variables. This can give the analyst some hints on how to proceed with the analysis.

Matrices of lab values for liver function tests are commonly used in clinical research. The SGSCATTER procedure provides an easy way to create matrix graphs, as shown below.

3x3 Matrix view of lab values:

Proc SGSCATTER Code:

```
title '3x3 Scatter Plot Matrix';
ods graphics / reset width=4in height=4.5in imagename='Matrix_3x3';

proc sgscatter data=safety;
  matrix asat alat alkph;
run;
```

4x4 Matrix view of lab values with distribution plots:

Proc SGSCATTER Code:

```
title '4x4 Scatter Plot Matrix with Diagonals';
ods graphics / reset width=5in height=5.5in imagename='Matrix_4x4_Diag';

proc sgscatter data=safety;
  matrix asat alat alkph biltot / diagonal=(histogram normal);
run;
```

There are a few issues with these graphs from a clinical perspective.

1. The matrix statement does not provide any way to customize the axes.
2. There is no way to indicate the clinical concern levels in the graphs.
3. The upper triangle of the matrix is a mirror image of the lower triangle, and hence wastes space.

A closer examination of the graph indicates that we can eliminate the top row and right column of the matrix, to get a smaller 3x3 arrangement of the 4 variables, and still have all the pairwise scatterplot combinations for all the variables.  In fact, this arrangement is popular in the clinical domain as shown below in what can be called the "Compact Matrix".

Compact Matrix for 4 variables:

This matrix has the following features:

• All six pairwise combinations of the four variables are included in the graph.
• The matrix occupies only a 3x3 grid for 4 variables, and hence uses the space more efficiently.
• Drawing of the upper triangle is eliminated resulting in a cleaner, uncluttered graph.
• Axes are customized.
• Clinical concern levels are indicated by test case.

Given that this arrangement is very popular, we will likely include an option to draw compact matrices in the next release based on the work shown here.  But how do we do this now?

In SAS 9.2 and SAS 9.3, the scatter plot matrix in both GTL and SGSCATTER is already built with the LAYOUT LATTICE in GTL. So, it is possible to write a macro that draws a "CompactMatrix" using the lattice layout, axis options and reference lines in GTL.
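To make the idea concrete, here is a minimal hand-written sketch of a 4-variable compact matrix in GTL. It is not the attached macro. The data set `safety_demo` is simulated for illustration, the helper macros `%cell` and `%blank`, the template name `compact4_sketch` and the 0 to 4 axis range are all assumptions, and the reference lines use the clinical concern levels quoted in the footnotes (2 ULN, and 1.5 ULN for BILTOT).

```sas
/* Simulated data for illustration only: lab values as multiples of ULN */
data safety_demo;
  call streaminit(2012);
  do subject = 1 to 100;
    asat   = exp(rand('normal', -0.3, 0.5));
    alat   = exp(rand('normal', -0.3, 0.5));
    alkph  = exp(rand('normal', -0.3, 0.4));
    biltot = exp(rand('normal', -0.5, 0.4));
    output;
  end;
run;

/* Hypothetical helper: one populated cell with CCL reference lines */
%macro cell(x, y, xccl, yccl);
  layout overlay;
    scatterplot x=&x y=&y;
    referenceline x=&xccl / lineattrs=(pattern=dash);
    referenceline y=&yccl / lineattrs=(pattern=dash);
  endlayout;
%mend cell;

/* Hypothetical helper: one empty cell for the unused upper triangle */
%macro blank;
  layout overlay; entry ' '; endlayout;
%mend blank;

/* Same axis options for every row and column (assumed range: 0 to 4 ULN) */
%let axopts = griddisplay=on
              linearopts=(viewmin=0 viewmax=4
                          tickvaluesequence=(start=0 end=4 increment=1));

proc template;
  define statgraph compact4_sketch;
    begingraph;
      entrytitle 'Compact 4 Variable Scatter Plot Matrix (sketch)';
      layout lattice / rows=3 columns=3 rowgutter=5 columngutter=5
                       rowdatarange=union columndatarange=union;
        /* External axes: one per row and one per column */
        rowaxes;
          rowaxis / &axopts;
          rowaxis / &axopts;
          rowaxis / &axopts;
        endrowaxes;
        columnaxes;
          columnaxis / &axopts;
          columnaxis / &axopts;
          columnaxis / &axopts;
        endcolumnaxes;

        /* Row 1: ALAT on the Y axis */
        %cell(asat, alat, 2, 2) %blank %blank
        /* Row 2: ALKPH on the Y axis */
        %cell(asat, alkph, 2, 2) %cell(alat, alkph, 2, 2) %blank
        /* Row 3: BILTOT on the Y axis */
        %cell(asat, biltot, 2, 1.5) %cell(alat, biltot, 2, 1.5) %cell(alkph, biltot, 2, 1.5)
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=safety_demo template=compact4_sketch;
run;
```

The cells fill the lattice row by row, so each row shares one Y variable and each column shares one X variable. That is what lets the external row and column axes stand in for the per-cell axes.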

Macro invocation for 4-variable Compact Matrix:

```
%CompactMatrixMacro(data=safety,
                    var1=asat, var2=alat, var3=alkph, var4=biltot,
                    title=Compact 4 Variable Scatter Plot Matrix,
                    footnote=For ASAT ALAT and ALKPH the clinical concern level (CCL) is 2 ULN,
                    footnote2=For BILTOT the clinical concern level (CCL) is 1.5 ULN,
                    footnote3=Where ULN is the upper level of normal range,
                    titlefontsize=10, footnotefontsize=7, axisvalueincr=1);
```

The macro is written to illustrate the technique. It handles only 3, 4 or 5 variables, but can easily be extended to handle more. The code is likely far from bulletproof. Here are some more output examples with code.

Compact Matrix for 3 variables:

Macro invocation for 3-variable Compact Matrix:

```
%CompactMatrixMacro(data=safety,
                    var1=asat, var2=alat, var3=alkph,
                    title=Compact 3 Variable Scatter Plot Matrix,
                    footnote=For ASAT ALAT and ALKPH the clinical concern level (CCL) is 2 ULN,
                    footnote2=For BILTOT the clinical concern level (CCL) is 1.5 ULN,
                    footnote3=Where ULN is the upper level of normal range,
                    titlefontsize=9, footnotefontsize=6, axisvalueincr=1);
```

Compact Matrix for 5 variables:

Macro invocation for 5-variable Compact Matrix:

```
%CompactMatrixMacro(data=safety,
                    var1=asat, var2=alat, var3=alkph, var4=biltot, var5=lab5,
                    title=Compact 5 Variable Scatter Plot Matrix,
                    footnote=For ASAT ALAT and ALKPH the clinical concern level (CCL) is 2 ULN,
                    footnote2=For BILTOT the clinical concern level (CCL) is 1.5 ULN,
                    footnote3=Where ULN is the upper level of normal range,
                    footnotefontsize=8, axisvalueincr=1);
```

The CompactMatrixMacro has the following features:

• The macro accepts 3, 4 or 5 variables.
• You can provide the upper CCL levels for each variable.
• The lower CCL level is set to 1.0.
• You can set axis range (same for all variables).
• You can set two titles and three footnotes, each with its own text font size.

Caveat Emptor: The macro is for illustration purposes only; it is not bulletproof and has not been tested.

Macro program and invocation code is attached:  CompactMatrixMacro_Code

1. Max
Posted August 27, 2012 at 10:37 am

Is it possible (or easy anyway) to modify this to include the histograms back in? I really like having them and the ability to control the axes, even if it's not any more compact than the original.

• Sanjay Matange
Posted August 28, 2012 at 11:57 am

I would think you can do that. Here are the steps for the process:
1. Build the full N x N matrix.
2. Populate only the lower triangle with scatter plots.
3. Add Histograms to the diagonal elements.
4. Set external axes with the appropriate axis ranges.
5. Turn off the display of axis for the top row and right column.
6. Put a Layout Overlay around each histogram to decouple its axis from the common external axis.

2. Martin
Posted February 20, 2014 at 3:23 pm

How can I add a title above each of the individual graphs?

• Martin
Posted February 20, 2014 at 3:45 pm

Note that I was able to use a "layout gridded" step to get what I needed.

3. Martin
Posted February 21, 2014 at 8:26 am

Layout lattice code that allows one to enter any number of variables (> 2):

```
layout lattice / columns=%eval(&numvars - 1) rows=%eval(&numvars - 1) rowgutter=5 columngutter=5
                 rowdatarange=union columndatarange=union;

  * set common row options;
  rowaxes;
  %do k = 1 %to %eval(&numvars - 1);
    rowaxis / tickvalueattrs=(size=7pt) labelattrs=(size=10pt) griddisplay=on
              linearopts=(tickvaluesequence=(start=&axismin end=&axismax increment=&axisincr)
                          tickvaluepriority=true);
  %end;
  endrowaxes;

  * set common column options;
  columnaxes;
  %do l = 1 %to %eval(&numvars - 1);
    columnaxis / tickvalueattrs=(size=7pt) labelattrs=(size=10pt) griddisplay=on
                 linearopts=(tickvaluesequence=(start=&axismin end=&axismax increment=&axisincr)
                             tickvaluepriority=true);
  %end;
  endcolumnaxes;

  %do n = 2 %to &numvars;
    %do m = 1 %to %eval(&numvars - 1);
      %if &m < &n %then %do;
        * draw individual scatter plots;
        layout overlay;
          scatterplot y=&&var&n x=&&var&m / datalabel=&labelvar datalabelposition=center markerattrs=(size=0);
        endlayout;
        %if &n = %eval(&m + 1) %then %do o = 1 %to %eval(&numvars - &n);
          layout overlay; entry ''; endlayout;
        %end;
      %end;
    %end;
  %end;

endlayout;
```
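The fragment above uses `%do` loops, so it has to sit inside a macro definition, and it expects the macro variables `numvars`, `var1` to `varN`, `axismin`, `axismax`, `axisincr` and `labelvar` to exist already. One possible way to set them up is sketched below. The helper name `%set_matrix_vars`, the axis values and the `subject` label column are assumptions for illustration, not part of the posted code.

```sas
/* Hypothetical helper: split a space-separated variable list into
   the global macro variables numvars and var1..varN */
%macro set_matrix_vars(varlist);
  %local i;
  %global numvars;
  %let numvars = %sysfunc(countw(&varlist, %str( )));
  %do i = 1 %to &numvars;
    %global var&i;
    %let var&i = %scan(&varlist, &i, %str( ));
  %end;
%mend set_matrix_vars;

%set_matrix_vars(asat alat alkph biltot);

/* Assumed axis settings and label column, adjust to your data */
%let axismin  = 0;
%let axismax  = 4;
%let axisincr = 1;
%let labelvar = subject;

/* Write the parsed values to the log as a quick check */
%put NOTE: numvars=&numvars var1=&var1 var4=&var4;
```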

4. Caro
Posted November 10, 2014 at 11:37 pm

I have managed to expand this to a compact graph of 6 variables, which is great.
However, my variables have quite different ranges of results so I would like to let each row have a different range of the y axis and each column to have a different range on the x axis. Then I don't want the values printed on the plots other than on the outside of the matrix. Any suggestions?

The other thing I am trying to do is put the estimated correlation for each panel somewhere on the corresponding plot. I thought I might be able to do it with annotate but can't see how to apply the annotated dataset to the template.

Welcome to Graphically Speaking, a blog by Sanjay Matange focused on the usage of ODS Graphics for data visualization in SAS. The blog will cover topics related to the Statistical Graphics procedures, the Graph Template Language and the ODS Graphics Designer. Sanjay and Dan are the co-authors of Statistical Graphics Procedures by Example: Effective Graphs Using SAS
Sanjay is the author of: Getting Started with the Graph Template Language in SAS