A heat map is a graphical representation of a matrix that uses colors to represent values in the matrix cells. Heat maps often reveal the structure of a matrix. There are three common applications of visualizing matrices with heat maps:
- Visualizing a correlation or covariance matrix reveals relationships between variables. Chris Hemedinger has written an article that describes how to visualize correlation matrices by using a heat map.
- Visualizing a data matrix reveals outliers, missingness patterns, and more. I will discuss this application in a future blog post.
- Visualizing a matrix that contains a small number of discrete values. The first two applications usually use a color ramp with a continuous color gradient, but when the matrix contains only a few distinct values, a discrete palette of colors is preferable. Heat maps with discrete color palettes are useful for visualizing structured covariance matrices and the nonzero pattern of sparse matrices.
This article describes how to use a heat map to visualize matrices that contain a small number of discrete values. (EDIT: As of SAS 9.4m1, there is an easier way to create heat maps of matrices in SAS/IML. See the articles about continuous heat maps and discrete heat maps.)
A structured covariance matrix
In my book Simulating Data with SAS, I simulate data from a repeated-measures model that has a block-diagonal covariance structure. The following SAS/IML statements create a 45 x 45 matrix that consists of nine 5 x 5 blocks:
proc iml;
k = 5;                        /* number of repeated measurements */
s = 9;                        /* number of individuals */
B = 1.4*j(k,k,1) + 2*I(k);    /* compound symmetric matrix */
R = I(s) @ B;                 /* block-diagonal matrix */
print R;
This matrix is too large to view easily in printed form, but you can create a heat map that visualizes the matrix by assigning a color to each of its three distinct values (0, 1.4, and 3.4).
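Before choosing a discrete palette, it can help to confirm how many distinct values the matrix contains. The following is a minimal sketch that rebuilds the matrix and uses the SAS/IML UNIQUE function; the names u and nVals are only illustrative.

```sas
proc iml;
k = 5;  s = 9;
B = 1.4*j(k,k,1) + 2*I(k);    /* compound symmetric block */
R = I(s) @ B;                 /* 45 x 45 block-diagonal matrix */

u = unique(R);                /* row vector of the sorted distinct values in R */
nVals = ncol(u);              /* number of distinct values */
print nVals, u;
quit;
```

If the count is small, as it should be here, a discrete legend with one color per value is practical. A matrix with dozens of distinct values is usually better served by a continuous color ramp.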
Create a data set for the matrix in "long form"
I need to write the SAS/IML matrix to a data set so that it can be read by PROC SGRENDER, which creates the heat map by using a custom GTL template. The HEATMAPPARM statement in GTL requires that the data set represent the matrix in "long form," which I have discussed in a previous blog post: each observation contains a row index, a column index, and the value of the corresponding cell. For specific details, you can download the SAS program that generates the plots in this article.
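One way to produce the long-form data set is sketched below. This is not necessarily the code in the downloadable program. The matrix names rowIdx, colIdx, and M are illustrative assumptions, and the data set variables are named row, col, and X only so that they match the DYNAMIC values passed to PROC SGRENDER later in this article.

```sas
proc iml;
k = 5;  s = 9;
B = 1.4*j(k,k,1) + 2*I(k);        /* compound symmetric block */
R = I(s) @ B;                     /* 45 x 45 block-diagonal matrix */

n = nrow(R);  p = ncol(R);
rowIdx = repeat(T(1:n), 1, p);    /* n x p matrix; element (i,j) equals i */
colIdx = repeat(1:p, n, 1);       /* n x p matrix; element (i,j) equals j */

/* COLVEC stacks each matrix in row-major order, so the three columns stay aligned */
M = colvec(rowIdx) || colvec(colIdx) || colvec(R);

create BlockDiag from M[colname={"row" "col" "X"}];
append from M;
close BlockDiag;
quit;
```

The resulting data set has n*p observations (2,025 for the 45 x 45 matrix), one per cell. The same pattern should work for any numeric matrix if you change the matrix and the data set name.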
A template for visualizing a matrix with a small number of unique values
The template to visualize the heat map is straightforward. It contains the following noteworthy features:
- The DYNAMIC statement enables you to specify the names of the data set variables at run time. I like to use this statement so that I can re-use my templates, but you are welcome to hard-code the names of the variables into the template if you prefer.
- The LAYOUT OVERLAY statement specifies three things:
  - It specifies the aspect ratio of the plot so that square matrices look square. The aspect ratio interacts with the height and width of the graph as set by the ODS GRAPHICS statement.
  - It specifies that the axes are discrete, rather than continuous.
  - It specifies the features of the axes to display. For matrices with fewer than 100 rows or columns, I like to display tick marks and values. For larger matrices, I don't.
- The HEATMAPPARM statement creates the heat map from the data.
- The DISCRETELEGEND statement creates a legend that shows the association between the matrix values and the colors.
proc template;
define statgraph HeatmapDisc;
dynamic _X _Y _Z;
begingraph;
   /* NOTE: Use the TYPE=DISCRETE statements if your version of SAS is before SAS 9.4m3 */
   layout overlay / aspectratio=1           /* optional: for square matrices */
      xaxisopts=( /* type=discrete */
                  discreteopts=(tickvaluefitpolicy=THIN)
                  display=(line ticks tickvalues))
      yaxisopts=( /* type=discrete */
                  discreteopts=(tickvaluefitpolicy=THIN)
                  display=(line ticks tickvalues)
                  reverse=true);
      heatmapparm x=_X y=_Y colorgroup=_Z /
                  xbinaxis=false ybinaxis=false
                  name="heatmap" primary=true display=ALL;
      discretelegend "heatmap";
   endlayout;
endgraph;
end;
run;

proc sgrender data=BlockDiag template=HeatmapDisc;
   dynamic _X="col" _Y="row" _Z="X";
run;
Clearly, the heat map has an advantage over the printed output. The display is smaller, and the global structure of the matrix is readily apparent. At a glance you can see that the matrix is composed of 5 x 5 blocks that contain a large value on the diagonal and smaller values on the off-diagonal. The remaining matrix values are zero.
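The template fixes the aspect ratio of the plot area, but the overall size of the graph still comes from the ODS GRAPHICS statement. The following sketch shows one way to set that size explicitly. It assumes that the HeatmapDisc template and the BlockDiag data set already exist, and the 500-pixel dimensions are an arbitrary choice rather than a recommendation.

```sas
/* assumes the HeatmapDisc template and the BlockDiag data set already exist */
ods graphics / width=500px height=500px;   /* overall size of the graph */

proc sgrender data=BlockDiag template=HeatmapDisc;
   dynamic _X="col" _Y="row" _Z="X";
run;

ods graphics / reset;                      /* restore the default settings */
```

Because the legend and the axes also need room inside the graph, the square plot area will typically be smaller than the requested dimensions.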
Visualizing a sparse or binary matrix
Another common application of visualizing matrices is using a heat map to show the structure of a sparse matrix (zero and nonzero cells) or matrices that occur in experimental designs. For example, Hadamard matrices are used to make orthogonal array experimental designs for two-level factors. The following SAS/IML statement creates a 64 x 64 matrix that contains the values 1 and –1:
X = hadamard(64); /* 64 x 64 Hadamard matrix */
If you write that matrix (in "long form") to a SAS data set, you can visualize it by using the same GTL template:
proc sgrender data=Hadamard template=HeatmapDisc;
   dynamic _X="col" _Y="row" _Z="X";
run;
Again, the heat map makes the global structure of the matrix apparent. At a glance you can see that the matrix is composed of two values in a pattern that has many symmetries. Closer inspection reveals that the matrix is symmetric (X = X`) and that every row and column except the first, which consists entirely of 1s, has an equal number of positive and negative values. You can also pick out a "self-similar" structure in the sense that the matrix is composed of four 32 x 32 Hadamard blocks, which are themselves composed of four 16 x 16 Hadamard blocks, and so on, recursively.
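The properties that the heat map suggests can also be checked numerically. The following is a minimal sketch, not part of the original program, and the indicator names are illustrative. A row or column whose sum is 0 contains as many 1s as -1s, so the sketch counts how many rows and columns are balanced in that sense.

```sas
proc iml;
X = hadamard(64);                       /* 64 x 64 Hadamard matrix */

isSymmetric = all(X = X`);              /* 1 if X equals its transpose */
isHadamard  = all(X*X` = 64*I(64));     /* 1 if the rows are mutually orthogonal */

rowSum = X[ ,+];                        /* 64 x 1 vector of row sums */
colSum = X[+, ];                        /* 1 x 64 vector of column sums */
nBalancedRows = sum(rowSum = 0);        /* rows with equally many 1s and -1s */
nBalancedCols = sum(colSum = 0);        /* columns with equally many 1s and -1s */

A = X[1:32, 1:32];                      /* upper-left 32 x 32 block */
blockIsHadamard = all(A*A` = 32*I(32)); /* 1 if the block is itself a Hadamard matrix */

print isSymmetric isHadamard blockIsHadamard,
      nBalancedRows nBalancedCols;
quit;
```

All of the arithmetic involves small integers, so the exact comparisons are safe here. For matrices with noninteger entries, a tolerance-based comparison would be more appropriate.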
In this article, I let the SGRENDER procedure pick default colors for the heat maps. The colors come from the current ODS style, which you can change. Alternatively, you can specify colors in your template, which I will demonstrate in a future blog post.