Compatible plot types in SAS

When the SAS statistical graphics (SG) procedures were designed in the early 2000s, a goal was to create a comprehensive Graph Template Language (GTL) and leverage the GTL by using SG procedures that perform common tasks easily without having to write any GTL. This project was hugely successful, and "ODS graphics" and "SG procedures" quickly became the primary way to create graphs in SAS. Among the SG procedures, the workhorse is PROC SGPLOT, which enables you to overlay different graphical elements (markers, lines, legends, insets, reference lines, text, polygons, ...) on the same graph. At times, it seems like PROC SGPLOT can create any graph you can think of. Creative SAS programmers have used the SGPLOT procedure to create both serious and not-so-serious graphs, including whimsical graphics to celebrate holidays such as Valentine's Day, Easter, Thanksgiving, Christmas, and more.

However, experienced SAS programmers know that PROC SGPLOT is not omnipotent. It can overlay only statements that are compatible with each other. To simplify the syntax of PROC SGPLOT, graphs are categorized into four plot types: basic plots, fit and confidence plots, distribution plots, and categorization plots. Each SGPLOT statement is "compatible" with certain other statements, and only compatible features can be overlaid in a single plot. For example, the DOT statement is a categorization statement and can be combined only with HBAR, HBARBASIC, and HLINE statements. In contrast, the SCATTER statement is a basic plot and can be combined with most plots (except for categorization plots). The REFLINE statement is super-compatible and can be overlaid on any plot (but you might need to specify the formatted values of a discrete axis).
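To make the compatibility idea concrete, here is a minimal sketch of an overlay that stays within the rules. It assumes the Sashelp.Class sample data set, which is not mentioned above, and the reference value of 100 is arbitrary. The SCATTER statement is a basic plot, the REG statement is a fit plot, and the REFLINE statement can be added to either.

```sas
/* Sketch only: Sashelp.Class and the value 100 are illustrative choices */
proc sgplot data=sashelp.class;
   scatter x=Height y=Weight;                /* basic plot */
   reg x=Height y=Weight / nomarkers clm;    /* fit and confidence plot */
   refline 100 / axis=y label="100 lbs";     /* reference line overlays on any plot */
run;
```

The NOMARKERS option on the REG statement suppresses its own markers so that the points are not drawn twice.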

In most situations, these rules do not present an impediment. The rules prevent you from doing something that (usually) doesn't make sense. For example, it doesn't make mathematical sense to overlay a regression line on a bar chart. A regression line assumes that X and Y are continuous variables, whereas a bar chart is a visual summary of the counts for a categorical variable. Thus, the REG and VBAR statements are not compatible. If you attempt to specify the REG and VBAR statements in the same call to PROC SGPLOT, an error is displayed in the log:

ERROR: Attempting to overlay incompatible plot or chart types.
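As a minimal sketch of how the message arises, the following call mixes a categorization statement (VBAR) with a fit statement (REG). The Sashelp.Class data set and the choice of variables are assumptions made for illustration; the step is expected to stop with the error instead of producing a graph.

```sas
/* Expected to fail: VBAR (categorization plot) and REG (fit plot) are incompatible */
proc sgplot data=sashelp.class;
   vbar Age;                  /* bar chart of counts for each age */
   reg x=Age y=Weight;        /* regression fit: not allowed with VBAR */
run;
```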

Some people claim that rules are meant to be broken. I don't necessarily agree with that adage, but there have been occasions when PROC SGPLOT would not let me overlay certain plots, and I had to work around the restriction. The following links provide workarounds for some situations in which you want to combine certain statements in PROC SGPLOT:

  • Overlay a custom density on a histogram. The HISTOGRAM statement and the SERIES statement cannot coexist in PROC SGPLOT. However, you can write a GTL template that overlays the histogram and a custom density estimate. Note that if you want to overlay a normal curve or a kernel density estimate, you can use the DENSITY statement in PROC SGPLOT, which does not require GTL.
  • Overlay a curve and a histogram. Recently, KSharp posted a clever technique on the SAS Support Communities that enables you to overlay a curve and a histogram. His technique does not require using GTL. I will blog about KSharp's method in a future article.
  • Overlay a continuous curve on a bar chart. In most situations, it doesn't make sense to overlay a continuous curve on a discrete bar chart. However, there is a canonical example in elementary statistics that combines continuous and discrete data: the normal approximation to the binomial distribution. In this and similar situations, you can use a simple workaround: use the VBARBASIC or HBARBASIC statement to overlay a curve (or another "basic" plot) on a bar chart.
  • Overlay other basic plots on a bar chart. By using the same technique, you can overlay other graphical elements, such as a high-low plot.
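The bar-chart workaround in the list above can be sketched for the normal approximation to the binomial distribution. The parameters (n=20, p=0.5), the data set names, and the variable names are all hypothetical choices for this example. The VBARBASIC statement is available only in more recent releases of SAS 9.4 (9.4M3 and later, to the best of my knowledge), so check your release before relying on it.

```sas
/* Sketch only: n=20 and p=0.5 are arbitrary illustrative parameters */
data Discrete;                       /* binomial probabilities at the integers */
   do j = 0 to 20;
      Binom = pdf("Binomial", j, 0.5, 20);
      output;
   end;
run;

data Continuous;                     /* normal density with matching mean and std dev */
   do x = 0 to 20 by 0.1;
      Normal = pdf("Normal", x, 20*0.5, sqrt(20*0.5*0.5));
      output;
   end;
run;

data All;                            /* concatenate; each plot reads its own variables */
   set Discrete Continuous;
run;

proc sgplot data=All;
   vbarbasic j / response=Binom;     /* bar chart that is compatible with basic plots */
   series x=x y=Normal;              /* basic plot overlaid on the bars */
   xaxis type=linear integer label="x";   /* request a linear axis; may not be needed */
   yaxis label="Probability";
run;
```

Replacing VBARBASIC with VBAR in this step should reproduce the "incompatible plot or chart types" error, which is the difference the workaround relies on.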

As you can see from this list, the situations in which I encounter "incompatible plot or chart types" are related to overlaying basic plots on bar charts and histograms. By using the VBARBASIC or HBARBASIC statements, you can overcome the issue for bar charts. Histograms are trickier, and I usually use GTL for full control. However, it is possible to work around the issue by replacing the HISTOGRAM statement with a HIGHLOW statement, as shown in the next article.
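For the simplest histogram case noted in the list above (a normal or kernel density overlay), neither GTL nor a workaround is needed, because the HISTOGRAM and DENSITY statements are both distribution plots. A minimal sketch, assuming the Sashelp.Cars sample data set and its MPG_City variable:

```sas
/* Sketch only: Sashelp.Cars and MPG_City are illustrative choices */
proc sgplot data=sashelp.cars;
   histogram MPG_City;                   /* distribution plot */
   density MPG_City / type=normal;       /* fitted normal curve */
   density MPG_City / type=kernel;       /* kernel density estimate */
   keylegend / location=inside position=topright;
run;
```

A custom curve, by contrast, would be drawn with a SERIES statement, which is the combination that requires GTL or the HIGHLOW replacement.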

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
