For the past month or so, I have been in conversation with SAS user James Marcus about creating some new displays for the visual communication of uncertainty. These include displays of densities using a "Violin" plot, "Density Strips" and more. With his permission, I will share some of the results over the next few articles.
At the outset, I should state that my interest is from a graphical perspective: I want to see how these graphs can be created using ODS Graphics. I do not claim knowledge of the statistical aspects. The code to determine the density values by category was provided by James Marcus.
In this article, I will cover creating a Violin Plot (Hintze and Nelson, 1998). We used the sashelp.heart data set to create violin plots of the cholesterol densities by death cause. The density values are computed using proc KDE.
Here is the graph created using the SGPANEL procedure.
The key steps to create this graph are as follows:
- The cholesterol densities are computed by death cause using proc KDE.
- A violin plot essentially "mirrors" the data to create a closed shape.
- Since the classification is discrete, we used the SGPANEL procedure.
- We used various procedure options to get the look above.
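The first two steps can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the program James Marcus wrote (his full code is attached at the end of the article). The intermediate data set names heart and chol_den are hypothetical; the output name chol_den_2 and the columns cholesterol, density and mirror are chosen to match the SGPANEL code in this article. Defining mirror as the negative of density is an assumption about how the closed shape is formed.

```sas
/* Sort by death cause so PROC KDE can process each cause as a BY group. */
/* Subjects with no recorded death cause or cholesterol are dropped.     */
proc sort data=sashelp.heart out=heart;
  where deathcause is not missing and cholesterol is not missing;
  by deathcause;
run;

/* Univariate kernel density estimate of cholesterol for each cause.     */
/* The OUT= data set holds the grid points in VALUE and the estimates    */
/* in DENSITY.                                                           */
proc kde data=heart;
  by deathcause;
  univar cholesterol / out=chol_den;
run;

/* Rename the grid variable and add the mirrored (negative) density      */
/* so a band plot can draw a closed, symmetric shape.                    */
data chol_den_2;
  set chol_den(rename=(value=cholesterol));
  mirror = -density;   /* assumption: mirror is the reflected density */
  keep deathcause cholesterol density mirror;
run;
```

With this layout, each observation supplies one cholesterol grid point plus the upper and lower limits of the band at that point.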
SAS 9.2 SGPANEL Code:
title 'Violin Plot of Cholesterol Densities by Death Cause';
proc sgpanel data=chol_den_2 nocycleattrs;
  panelby deathcause / layout=columnlattice onepanel novarname noborder colheaderpos=bottom;
  band y=cholesterol upper=density lower=mirror / fill outline;
  rowaxis label='Cholesterol' grid;
  colaxis display=none;
run;
Key features of proc SGPANEL used are:
- PanelBy Death cause with LAYOUT=COLUMNLATTICE to create a lattice of columns.
- Suppress display of panel variable name in each cell header (NOVARNAME).
- Suppress display of cell borders (NOBORDER).
- Place column headers at the bottom (COLHEADERPOS).
- A band plot is used to draw the violin shape in each cell.
- Axis display is reduced to focus on the shape of the data.
- Default uniform scaling allows comparison across the panels.
Horizontal violin plots can also be created using Layout=ROWLATTICE:
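One way the horizontal version might look is sketched below; it is an illustration rather than the attached program. It assumes the same chol_den_2 data set with the columns cholesterol, density and mirror, and swaps the roles of the axes: cholesterol goes on X, and the band limits become Y values. The title text is hypothetical.

```sas
title 'Horizontal Violin Plot of Cholesterol Densities by Death Cause';
proc sgpanel data=chol_den_2 nocycleattrs;
  /* One row per death cause instead of one column */
  panelby deathcause / layout=rowlattice onepanel novarname noborder;
  /* With X= specified, UPPER and LOWER are limits along the Y axis */
  band x=cholesterol upper=density lower=mirror / fill outline;
  colaxis label='Cholesterol' grid;
  /* The density scale itself is not of interest, so hide the row axis */
  rowaxis display=none;
run;
```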
Here is a version using HighLow plots to show the data as histogram bins:
While the closed shape of the violin provides a satisfactory visual by the Gestalt principles, it does use up double the space. A "Half-Violin" graph (essentially a band plot or HighLow plot with a zero value on one side) can use the space more efficiently:
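A half-violin can be sketched by replacing the mirrored lower limit with a constant zero. The example below is a minimal illustration under the same assumptions as before (a chol_den_2 data set with cholesterol and density columns), not the attached program; the title text is hypothetical.

```sas
title 'Half-Violin Plot of Cholesterol Densities by Death Cause';
proc sgpanel data=chol_den_2 nocycleattrs;
  panelby deathcause / layout=columnlattice onepanel novarname noborder colheaderpos=bottom;
  /* LOWER=0 draws the band from the baseline to the density,   */
  /* so only one side of the violin is shown in each cell.      */
  band y=cholesterol upper=density lower=0 / fill outline;
  rowaxis label='Cholesterol' grid;
  colaxis display=none;
run;
```

Because the mirrored half is gone, each cell needs only half the width to show the same density at the same scale.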
The full code for the graphs above is attached below.
SAS 9.2 Program for Violin Plot: Full SAS Code_92
James has further enhanced the graph to include quantile ranges and mean or median markers as shown below:
Full SAS 9.2 Code from James Marcus: Violin_Overlay_92