At a recent conference, I talked with a SAS customer who told me that he was using an R package to create a three-panel visualization of a distribution. Unfortunately, he couldn't remember the name of the package, and he has not returned my e-mails, so the purpose of today's article

## Tag: **Statistical Graphics**

I've previously described how to overlay two or more density curves on a single plot. I've also written about how to use PROC SGPLOT to overlay custom curves on a graph. This article describes how to overlay a density curve on a histogram. For common distributions, you can overlay a

I recently showed someone a trick to create a graph, and he was extremely pleased to learn it. The trick is well known to many SAS users, but I hope that this article will introduce it to even more SAS users. At issue is how to use the SGPLOT procedure

Did you know that your ODS style might result in changing the color ramp for contour plots and heat maps? For example, the default style in SAS 9.3 is HTMLBlue. Let's create a contour plot in the HTML destination by running an example adapted from the documentation for the RSREG

It is easy to use the SGPLOT procedure in SAS to plot the graph of a well-behaved continuous function: just create a data set of the (x,y) values on some domain and use the SERIES statement to connect the points. However, to plot the graph of a discontinuous function correctly

A SAS user asked an interesting question on the SAS/GRAPH and ODS Graphics Support Forum. The question is: Does PROC SGPLOT support a way to display the slope of the regression line that is computed by the REG statement? Recall that the REG statement in PROC SGPLOT fits and displays

When a categorical variable has dozens or hundreds of categories, it is often impractical and undesirable to create a bar chart that shows the counts for all categories. Two alternatives are popular: Display only the Top 10 or Top 20 categories. As I showed last week, to do this in

Sometimes a categorical variable has many levels, but you are only interested in displaying the levels that occur most frequently. For example, if you are interested in the number of times that a song was purchased on iTunes during the past week, you probably don't want a bar chart with

It seemed like an easy task. A SAS user asked me how to use the SGPLOT procedure to create a bar chart where the vertical axis shows percentages instead of counts. I assumed that there was some simple option that would change the scale of the vertical axis from counts

What's in a name? As Shakespeare's Juliet said, "That which we call a rose / By any other name would smell as sweet." A similar statement holds true for the names of colors in SAS: "Rose" by any other name would look as red! SAS enables you to specify a

Sometimes a graph is more interpretable if you assign specific colors to categories. For example, if you are graphing the number of Olympic medals won by various countries at the 2012 London Olympics, you might want to assign the colors gold, silver, and bronze to represent first-, second-, and third-place

The New York Times has an excellent staff that produces visually interesting graphics for the general public. However, because their graphs need to be understood by all Times readers, the staff sometimes creates a complicated infographic when a simpler statistical graph would show the data in a clearer manner. A

With the US presidential election looming, all eyes are on the Electoral College. In the presidential election, each state gets as many votes in the Electoral College as it has representatives in both congressional houses. (The District of Columbia also gets three electors.) Because every state has two senators, it

Robert Allison posted a map that shows the average commute times for major US cities, along with the proportion of the commute that is attributed to traffic jams and other congestion. The data are from a CEOs for Cities report (Driven Apart, 2010, p. 45). Robert used SAS/GRAPH software to

The other day I was using PROC SGPLOT to create a box plot and I ran a program that was similar to the following: proc sgplot data=sashelp.cars; title "Box Plot: Category = Origin"; vbox Horsepower / category=origin; run; An hour or so later I had a need for another box

A comment to last week's article on "How to get data values out of ODS graphics" indicated that the technique would be useful for changing the title on an ODS graph "without messing around with GTL." You can certainly use the technique for that purpose, but if you want to

Many SAS procedures can produce ODS statistical graphics as naturally as they produce tables. Did you know that it is possible to obtain the numbers underlying an ODS statistical graph? This post shows how. Suppose that a SAS procedure creates a graph that displays a curve and that you want

When you are working with probability distributions (normal, Poisson, exponential, and so forth), there are four essential functions that a statistical programmer needs. As I've written before, for common univariate distributions, SAS provides the following functions: the PDF function, which returns the probability density at a given point; the CDF

I've been working on a new book about Simulating Data with SAS. In researching the chapter on simulation of multivariate data, I've noticed that the probability density function (PDF) of multivariate distributions is often specified in a matrix form. Consequently, the multivariate density can usually be computed by using the

When I need to graph a function of two variables, I often choose to use a contour plot. A surface plot is probably easier for many people to understand, but it has several disadvantages when compared to a contour plot. For example, the following statements in SAS/IML Studio display a

Last week I discussed how to fit a Poisson distribution to data. The technique, which involves using the GENMOD procedure, produces a table of some goodness-of-fit statistics, but I find it useful to also produce a graph that indicates the goodness of fit. For continuous distributions, the quantile-quantile (Q-Q) plot

Over at the SAS and R blog, Ken Kleinman discussed using polar coordinates to plot time series data for multiple years. The time series plot was reproduced in SAS by my colleague Robert Allison. The idea of plotting periodic data on a circle is not new. In fact it goes

Some SAS products such as SAS/IML Studio (which is included FREE as part of SAS/IML software) have interactive graphics. This makes it easy to interrogate a graph to determine values of "hidden" variables that might not appear in the graph. For example, in a scatter plot in SAS/IML Studio, you

Recently the "SAS Sample of the Day" was a Knowledge Base article with an impressively long title: Sample 42165: Using a stored process to eliminate duplicate values caused by multiple group memberships when creating a group-based, identity-driven filter in SAS® Information Map Studio. "Wow," I thought. "This is the longest

I have previously written about how to create funnel plots in SAS software. A funnel plot is a way to compare the aggregated performance of many groups without ranking them. The groups can be states, counties, schools, hospitals, doctors, airlines, and so forth. A funnel plot graphs a performance metric

Sometimes you want to label only certain observations in a plot. This is useful in many ways, but one use is to label outliers on a scatter plot. In the SGPLOT procedure, the DATALABEL= option enables you to specify the name of a variable that is used to label observations.

I've noticed that a lot of people want to be able to draw bar charts with confidence intervals. This topic is a frequent posting on the SAS/GRAPH and ODS Graphics Discussion Forum and on the SAS-L mailing list. Consequently, this post describes how to add error bars to a bar

Do you know someone who has a birthday in mid-September? Odds are that you do: the middle of September is when most US babies are born, according to data obtained from the National Center for Health Statistics (NCHS) Web site (see Table 1-16). There's an easy way to remember this

My elderly mother enjoys playing Scrabble®. The only problem is that my father and most of my siblings won't play with her because she beats them all the time! Consequently, my mother is always excited when I visit because I'll play a few Scrabble games with her. During a recent

Exploring correlation between variables is an important part of exploratory data analysis. Before you start to model data, it is a good idea to visualize how variables are related to one another. Zach Mayer, on his Modern Toolmaking blog, posted code that shows how to display and visualize correlations in R.