Automate the creation of a range attribute map in SAS

In SAS, range attribute maps enable you to specify the range of values that determine the colors used for graphical elements. There are various examples that use the GTL to define a range attribute map, but fewer examples that show how to use a range attribute map with PROC SGPLOT. The SAS ODS Graphics: Procedures Guide contains two examples.

Unfortunately, the examples are not typical. Each example assigns a single color to a range of values. In practice, a range attribute map is often used to assign a gradient color ramp so that each value in a range is assigned a unique color.

This article shows two simpler examples of using a range attribute map in PROC SGPLOT. The first example is a scatter plot. The markers are assigned a color according to the value of a response variable within a specified range. The second example is a heat map of a correlation matrix. The cells are assigned colors according to a color ramp defined on the interval [-1, 1]. Both examples use a linear mapping from a specified data range to the color model.

For a more advanced example, see Kuhfeld's article (2017, "Advanced ODS graphics: Range attribute maps"), which uses a range attribute map to display a heat map for a frequency table, along with marginal distributions for the row and column sums.

A simple color ramp without a range attribute map

Let me be clear: You do not have to use a range attribute map if you want to use a color model that is defined on the range of the data. By default, PROC SGPLOT will display colors that vary in the range [min, max], where min is the minimum value of a variable and max is the maximum value. For example, the following program is taken from an article that shows how to color-code markers in a scatter plot according to a response variable:

/* example from https://blogs.sas.com/content/iml/2016/07/18/color-markers-third-variable-sas.html */
title "Markers Colored by Age";
title2 "No Range Attribute Map";
proc sgplot data=sashelp.class;
scatter x=height y=weight / colorresponse=age
   colormodel=(CX3288BD CX99D594 CXE6F598 CXFEE08B CXFC8D59 CXD53E4F)
   markerattrs=(symbol=CircleFilled size=14) filledoutlinedmarkers;
xaxis grid; 
yaxis grid;
run;

This plot shows the heights and weights for 19 students. The COLORRESPONSE= option specifies that the markers be assigned colors according to the value of the AGE variable. The ages of the students in this data set range from 11 to 16. The markers are assigned colors according to the AGE value and the color model. In this example, I used the COLORMODEL= option to define a custom color ramp, but you can skip that option to use a default color model. This example shows that you do not need to use a range attribute map if you want the colors to map to the data range [min(AGE), max(AGE)].

A range attribute map for a scatter plot

So, when is a range attribute map useful? In the previous section, there was one sample of 19 students. But suppose you obtain two more samples of students. In one sample, the ages of the students range from 10 to 15. In another sample, the ages of the students range from 12 to 18. If you want to create scatter plots for the three samples, it would be helpful if they all used a common color scale that goes from 10 to 18, which represents the range of all ages. When you use a common color scale, a yellow marker in one plot represents the same age as a yellow marker in another plot.

A range attribute map is a SAS data set that has special variables and values. A range attribute map enables you to specify the range that is used to assign attributes. The documentation explains the names of the variables and their values, but it is important to realize that there are two different types of graphical elements in SAS ODS graphics:

  • The ALTCOLOR variable assigns a color to the lines, markers, and text in a range. Similarly, the ALTCOLORMODEL1 – ALTCOLORMODELk variables create a linear gradient of colors across a range.
  • The COLOR variable assigns a fill color to the bars, polygons, and other "area" elements. Similarly, the COLORMODEL1 – COLORMODELk variables create a linear gradient of colors across a range.

The scatter plot example in the previous section assigns colors to markers. A range attribute map for markers requires using the AltColorModel1 – AltColorModelk variables. Let's hard-code the color ramp values into a range attribute map, as follows:

/* First method: manually construct a range attribute map that has a custom color model. 
   Markers use the ALTColorModeln variables */
data AgeRangeAttrMap;
length ID $20;
length min max $12;    /* Note: Using CHARACTER vars for min/max */
array AltColorModel[6] $32;  /* use ALTcolormodel array for MARKERS */
input ID min max AltColorModel1-AltColorModel6;
datalines;
AgeID 10.0 18.0 CX3288BD CX99D594 CXE6F598 CXFEE08B CXFC8D59 CXD53E4F
;
 
proc print noobs; run;

The AgeRangeAttrMap data set contains a map named AgeID. When you use this map, colors are assigned by linear interpolation over the range [10, 18], using six colors whose hexadecimal values are specified. (The mapping also accepts color names. For example, you could use a five-color model with values DarkBlue, LightBlue, WhiteSmoke, LightRed, and DarkRed.) In this example, you could use numerical variables for MIN and MAX. However, I used character variables because the MIN and MAX variables accept certain text keywords. For the details, see the documentation.
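
As a minimal sketch of the named-color alternative mentioned above, the following DATA step builds a five-color map for the same range. The data set name NamedColorAttrMap and the map name AgeNamedID are hypothetical names chosen only for illustration; the color names are the ones listed in the previous paragraph.

```sas
/* Sketch: same range [10, 18], but a five-color model that uses color names.
   NamedColorAttrMap and AgeNamedID are hypothetical names for this example. */
data NamedColorAttrMap;
length ID $20;
length min max $12;          /* CHARACTER vars for min/max, as before */
array AltColorModel[5] $32;  /* AltColorModel variables are used for MARKERS */
input ID min max AltColorModel1-AltColorModel5;
datalines;
AgeNamedID 10.0 18.0 DarkBlue LightBlue WhiteSmoke LightRed DarkRed
;

proc print noobs; run;
```

To use this map, you would specify RATTRMAP=NamedColorAttrMap on the PROC SGPLOT statement and RATTRID=AgeNamedID on the plot statement.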

You can use this range attribute map in a plot by making small modifications to the previous call to PROC SGPLOT:

  • Add the RATTRMAP= option to the PROC SGPLOT statement and specify the name of the data set that contains the map.
  • Add the RATTRID= option to the SCATTER statement and specify the name of the map.
  • Remove the COLORMODEL= option from the SCATTER statement because the color model is now specified in the map.

title "Scatter Plot with Colored Markers";
title2 "Range Attribute Map for [10, 18]";
proc sgplot data=sashelp.class rattrmap=AgeRangeAttrMap;    /* <== add HERE */
scatter x=height y=weight / colorresponse=age rattrID=AgeID /* <== add HERE */
   markerattrs=(symbol=CircleFilled size=14) filledoutlinedmarkers;
xaxis grid; 
yaxis grid;
run;

The graph now uses a color model that assigns colors based on the range 10 to 18. For these data, the ages are in the interval [11, 16]. Accordingly, there are no dark blue or dark red markers. The colors range from green to orange. If you use the AgeID map to plot other data samples, they will all use the same color scheme regardless of the ages that are in the data.
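
To illustrate the common color scale, here is a small sketch that reuses the AgeID map for a different sample. Because I do not have the other samples at hand, the sketch uses a stand-in: the students in Sashelp.Class who are 13 or older. The subset and the titles are assumptions made only for this illustration.

```sas
/* Sketch: plot a second "sample" (a stand-in subset of Sashelp.Class)
   with the SAME range attribute map, so colors still span [10, 18] */
title "Students Age 13 and Older";
title2 "Same Range Attribute Map for [10, 18]";
proc sgplot data=sashelp.class(where=(age>=13)) rattrmap=AgeRangeAttrMap;
scatter x=height y=weight / colorresponse=age rattrID=AgeID
   markerattrs=(symbol=CircleFilled size=14) filledoutlinedmarkers;
xaxis grid;
yaxis grid;
run;
```

Although the ages in this subset range only from 13 to 16, a marker for a 14-year-old should have the same color as in the plot of the full data, because the map (not the data) determines the color range.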

Before leaving this example, notice that sometimes the highest or lowest tick mark is not drawn on the gradient legend. You can force the extreme tick marks to display by extending the range by a tiny amount. For example, if you use 9.999 and 18.001 as the minimum and maximum values of the range, then the extreme tick marks are shown. The heat map example later in this article uses this trick.
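
The following sketch applies that trick to the scatter plot. It is the same hard-coded map as before, except that the range is padded by 0.001 on each side. The names AgeRangeAttrMap2 and AgeID2 are hypothetical and are used only so that the original map is not overwritten.

```sas
/* Sketch: pad the range slightly so that the ticks at 10 and 18 appear
   on the gradient legend. AgeRangeAttrMap2 and AgeID2 are hypothetical names. */
data AgeRangeAttrMap2;
length ID $20;
length min max $12;
array AltColorModel[6] $32;
input ID min max AltColorModel1-AltColorModel6;
datalines;
AgeID2 9.999 18.001 CX3288BD CX99D594 CXE6F598 CXFEE08B CXFC8D59 CXD53E4F
;

title "Scatter Plot with Colored Markers";
title2 "Range Padded to [9.999, 18.001]";
proc sgplot data=sashelp.class rattrmap=AgeRangeAttrMap2;
scatter x=height y=weight / colorresponse=age rattrID=AgeID2
   markerattrs=(symbol=CircleFilled size=14) filledoutlinedmarkers;
xaxis grid;
yaxis grid;
run;
```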

Automating the creation of a range attribute map

The manual specification of the range attribute map is straightforward, but we can add some SAS macro magic to make it more flexible and reusable. First, note that you can use the COUNTW function to count the number of colors in a space-separated list of colors, so you do not need to manually specify the length of the AltColorModel array. Furthermore, you can use the SCAN function to extract each color in a list, so you don't need the DATALINES statement. The following DATA step creates the same range attribute map as the previous section, but generates it from a space-separated list of colors:

/* Second method: create a range attr map that assigns colors in [10,18] 
   Assume the colors are space-separated.
*/ 
%let ColorRamp = CX3288BD CX99D594 CXE6F598 CXFEE08B CXFC8D59 CXD53E4F;
%let NumColors = %sysfunc(countw(&ColorRamp));
data AgeRangeAttrMap;
length ID $20;
length min max $12;    /* Note: Using CHARACTER vars for min/max */
ID = "AgeID";          /* name of the map; use this value for the RATTRID= option */
min = "10.0";
max = "18.0";
array AltColorModel[&NumColors] $32;  /* use AltColorModel array for MARKERS */
do _i = 1 to &NumColors;
   AltColorModel[_i] = scan("&ColorRamp", _i);
end;
drop _i;
run;
 
proc print noobs; run;

The output is not shown but is identical to the previous hard-coded map.

With a little effort, you can write a SAS macro that generates a range attribute map from a space-separated list of colors, a minimum and maximum value, and the names of the data set and map. So that the map can be used for both markers and areas, you can define an AltColorModel array and a ColorModel array, as follows:

/* Third and most flexible method: Create a range attribute map from the following parameters:
   ColorRamp : a space-separated list of colors, such as 
               CX3288BD CX99D594 CXE6F598
               or
               Red White Blue
   DSName : The name of the data set that contains the map. Use this value for RATTRMAP= option.
   MapName : The name of the range attribute map. Use this value for RATTRID= option.
   minRange: The minimum value of the range or a valid keyword.
   maxRange: The maximum value of the range or a valid keyword.
*/
%macro MakeRangeAttrMap(ColorRamp, DSName, MapName, minRange, maxRange);
   %let NumColors = %sysfunc(countw(&ColorRamp));
   data &DSName;
   length ID $20;
   length min max $12;    /* Note: Using CHARACTER vars for min/max */
   retain ID "&MapName";
   min = "&minRange";
   max = "&maxRange";
   array AltColorModel[&NumColors] $32;
   array    ColorModel[&NumColors] $32;
   do _i = 1 to &NumColors;
      AltcolorModel[_i] = scan("&ColorRamp", _i);  /* used for markers */
         ColorModel[_i] = AltcolorModel[_i];       /* used for areas */
   end;
   drop _i;
   run;
%mend;
 
%MakeRangeAttrMap(&ColorRamp, AgeRangeAttrMap, AgeID, 10, 18);

The AgeID map in the AgeRangeAttrMap data set has the same values as before for the AltColorModeln variables, but the new data set also includes ColorModeln variables, which you can use to assign colors for bars, heat maps, polygons, and so forth.
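
As a quick sketch of the ColorModeln variables in action, the following call uses the same AgeID map to color bars instead of markers. This assumes that the VBAR statement in your release of SAS supports the COLORRESPONSE=, COLORSTAT=, and RATTRID= options; check the documentation for your version before relying on it. The choice of a bar chart of heights is only for illustration.

```sas
/* Sketch: bars are "area" elements, so their fill colors come from the
   ColorModel1-ColorModel6 variables in the AgeID map.
   Assumes VBAR supports COLORRESPONSE=, COLORSTAT=, and RATTRID=. */
title "Heights of Students";
title2 "Bars Colored by Age on the Range [10, 18]";
proc sgplot data=sashelp.class rattrmap=AgeRangeAttrMap;
vbar name / response=height
            colorresponse=age colorstat=mean   /* one student per bar, so mean = age */
            rattrID=AgeID;
yaxis grid;
run;
```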

Range attribute map for area elements

A good example that requires a range attribute map is the visualization of a correlation matrix. If you do not use a range attribute map, then colors are assigned based on the sample correlations in the data. In many cases, it is better to set the range of colors to be [-1, 1] with a neutral color (white or gray) at 0. That way, it is easy to see at a glance which pairs of variables have negative correlation, approximately zero correlation, or positive correlation.

To demonstrate, the following call to PROC CORR estimates the pairwise correlations for 10 numeric variables in the Sashelp.cars data set. As shown in a previous article, you can use the FISHER option to output the pairwise correlations, as follows:

/* create data set that contains pairwise correlations in long form. See
   https://blogs.sas.com/content/iml/2022/09/26/correlations-to-list.html */
ods select none;
proc corr data=Sashelp.Cars nomiss noprob  FISHER;  /* FISHER ==> list of Pearson correlations */
   var _numeric_;
   ods output FisherPearsonCorr=CorrList(
              keep=Var WithVar Corr
              rename=(Var=Var1 WithVar=Var2));       /* Optional: Put the correlations in a data set */
run;
ods select all;

The output is not shown, but it is in the correct format to draw a heat map. The following statements specify a Brown-to-BlueGreen color model and create a range attribute map in which the color range is set to [-1, 1]. Actually, I defined the color range to be slightly WIDER so that the tick marks at -1 and +1 are displayed:

/* define Brown-to-BlueGreen color model and define range attribute map for range [-1, 1] */
%let BrBgRamp = CX8C510A CXD8B365 CXF6E8C3 CXF5F5F5 CXC7EAE5 CX5AB4AC CX01665E ;
%MakeRangeAttrMap(&BrBgRamp, CorrRangeAttrMap, CorrID, -1.001, 1.001);
 
title "Heat Map of Correlation Matrix";
title2 "Set Range of Color Ramp to [-1, 1]";
proc sgplot data=CorrList aspect=1                rattrmap=CorrRangeAttrMap;
   heatmapparm x=Var1 y=Var2 colorresponse=Corr / rattrID=CorrID
               outline outlineattrs=(color=grey);
   yaxis reverse display=(nolabel);
   xaxis display=(nolabel);
run;

For this graph, the colors for the cells in the heat map are controlled by using the ColorModeln variables. Notice that I display only the lower triangular portion of the correlation matrix because the matrix is symmetric.

Summary

This article shows how to define a range attribute map in SAS. By default, colors are mapped to the range of the data that you are graphing. However, it can be useful to map colors to a range that is independent of the data. Two examples are shown. One is a scatter plot that maps the colors of markers to an interval that is independent of the data. The other is a heat map of a correlation matrix. The colors are mapped to the interval [-1, 1], which ensures that a consistent set of colors is used for very negative correlations, nearly zero correlations, and very positive correlations regardless of the statistics being displayed.

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
