## Tag: Statistical Graphics

0
3 ways to visualize prediction regions for classification problems

An important problem in machine learning is the "classification problem." In this supervised learning problem, you build a statistical model that predicts a set of categorical outcomes (responses) based on a set of input features (explanatory variables). You do this by training the model on data for which the outcomes

Programming Tips
0
On the SMOOTHCONNECT option in the SERIES statement

By default, when you use the SERIES statement in PROC SGPLOT to create a line plot, the observations are connected (in order) by straight line segments. However, SAS 9.4m1 introduced the SMOOTHCONNECT option which, as the name implies, uses a smooth curve to connect the observations. In Sanjay Matange's blog,

Data Visualization
0
Perceptions of probability

If a financial analyst says it is "likely" that a company will be profitable next year, what probability would you ascribe to that statement? If an intelligence report claims that there is "little chance" of a terrorist attack against an embassy, should the ambassador interpret this as a one-in-a-hundred chance,

0
Visualize a design matrix

Most SAS regression procedures support a CLASS statement which internally generates dummy variables for categorical variables. I have previously described what dummy variables are and how are they used. I have also written about how to create design matrices that contain dummy variables in SAS, and in particular how to

0
Visualize an ANOVA with two-way interactions

There are several ways to visualize data in a two-way ANOVA model. Most visualizations show a statistical summary of the response variable for each category. However, for small data sets, it can be useful to overlay the raw data. This article shows a simple trick that you can use to

0
Visualize the 68-95-99.7 rule in SAS

A reader commented on last week's article about constructing symmetric intervals. He wanted to know if I created it in SAS. Yes, the graph, which illustrates the so-called 68-95-99.7 rule for the normal distribution, was created by using several statements in the SGPLOT procedure in Base SAS The SERIES statement

Programming Tips
0
What colors does PROC SGPLOT use for markers?

Suppose you create a scatter plot in SAS with PROC SGPLOT. What color does PROC SGPLOT use for the markers? If you specify the GROUP= option so that markers are colored by a grouping variable, what colors are used to represent the various groups? The following scatter plot shows the

0
Automate the creation of a discrete attribute map

If you are a SAS programmer and use the GROUP= option in PROC SGPLOT, you might have encountered a thorny issue: if you use a WHERE clause to omit certain observations, then the marker colors for groups might change from one plot to another. This happens because the marker colors

0
Is "La Quinta" Spanish for "Next to Denny's"?

“La Quinta” is Spanish for “next to Denny’s.”      -- Mitch Hedberg, comedian Mitch Hedberg's joke resonates with travelers who drive on the US interstate system because many highway exits feature both a La Quinta Inn™ and a Denny's® restaurant within a short distance of each other. But does a

0
Append data to add markers to SAS graphs

Do you want to create customized SAS graphs by using PROC SGPLOT and the other ODS graphics procedures? An essential skill that you need to learn is how to merge, join, append, and concatenate SAS data sets that come from different sources. The SAS statistical graphics procedures (SG procedures) enable

0
Highlight forecast regions in graphs

A SAS customer asked how to use background colors and a dashed line to emphasize the forecast region for a graph that shows a time series model. The task requires the following steps: Use the ATTRPRIORITY=NONE option on the ODS GRAPHICS statement to make sure that the current ODS style

0
Visualize the ages of US presidents

Who was the oldest person elected president of the United States? How about the youngest? Who was the oldest when he left office? Let's look at some data. Wikipedia has a page that presents a table of the presidents of the US by age. It lists the dates for which

0
Visualize a torus in SAS

This article uses graphical techniques to visualize one of my favorite geometric objects: the surface of a three-dimensional torus. Along the way, this article demonstrates techniques that are useful for visualizing more mundane 3-D point clouds that arise in statistical data analysis. Define points on a torus A torus is

0
Rotation matrices and 3-D data

Rotation matrices are used in computer graphics and in statistical analyses. A rotation matrix is especially easy to implement in a matrix language such as the SAS Interactive Matrix Language (SAS/IML). This article shows how to implement three-dimensional rotation matrices and use them to rotate a 3-D point cloud. Define

0
Ahh, that's smooth! Anti-aliasing in SAS statistical graphics

I've written several articles about scatter plot smoothers: nonparametric regression curves that reveal small- and large-scale features of a response variable as a function of an explanatory variable. However, there is another kind of "smoothness" that you might care about, and that is the apparent smoothness of curves and markers

0
Let PROC FREQ create graphs of your two-way tables

The recent releases of SAS 9.4 have featured major enhancements to the ODS statistical graphics procedures such as PROC SGPLOT. In fact, PROC SGPLOT (and the underlying Graph Template Language (GTL)) are so versatile and powerful that you might forget to consider whether you can create a graph automatically by

0
Overlay a curve on a bar chart in SAS

One of the strengths of the SGPLOT procedure in SAS is the ease with which you can overlay multiple plots on the same graph. For example, you can easily combine the SCATTER and SERIES statements to add a curve to a scatter plot. However, if you try to overlay incompatible

0
Graph a step function in SAS

Last week I wrote about how to compute sample quantiles and weighted quantiles in SAS. As part of that article, I needed to draw some step functions. Recall that a step function is a piecewise constant function that jumps by a certain amount at a finite number of points. Graph

0
Create an animation with the BY statement in PROC SGPLOT

It is easy to use PROC SGPLOT and BY-group processing to create an animated graph in SAS 9.4. Sanjay Matange previously discussed how to create an animated plot in SAS 9.4, but he used a macro loop to call PROC SGPLOT many times. It is often easier to use the

0
Female world leaders by year of election

This week Hillary Clinton became the first woman to be nominated for president of the US by a major political party. Although this is a first for the US, many other countries have already passed this milestone. In fact, 60 countries have already elected women as presidents and prime ministers.

0
Color markers in a scatter plot by a third variable in SAS

One of my favorite new features in PROC SGPLOT in SAS 9.4m2 is addition of the COLORRESPONSE= and COLORMODEL= options to the SCATTER statement. By using these options, it is easy to color markers in a scatter plot so that the colors indicate the values of a continuous third variable.

0
In praise of simple graphics

'Tis a gift to be simple. -- Shaker hymn In June 2015 I published a short article for Significance, a magazine that features statistical and data-related articles that are of general interest to a wide a range of scientists. The title of my article is "In Praise of Simple Graphics."

0
Overlay plots on a box plot in SAS: Continuous X axis

I have previously shown how to overlay basic plots on box plots when all plots share a common discrete X axis. It is interesting to note that box plots can also be overlaid on a continuous (interval) axis. You often need to bin the data before you create the plot.

0
Overlay plots on a box plot in SAS: Discrete X axis

Box plots summarize the distribution of a continuous variable. You can display multiple box plots in a single graph by specifying a categorical variable. The resulting graph shows the distribution of subpopulations, such as different experimental groups. In the SGPLOT procedure, you can use the CATEGORY= option on the VBOX

0
Lasagna plots in SAS: When spaghetti plots don't suffice

Last week I discussed how to create spaghetti plots in SAS. A spaghetti plot is a type of line plot that contains many lines. Spaghetti plots are used in longitudinal studies to show trends among individual subjects, which can be patients, hospitals, companies, states, or countries. I showed ways to

0
Create spaghetti plots in SAS

What is a spaghetti plot? Spaghetti plots are line plots that involve many overlapping lines. Like spaghetti on your plate, they can be hard to unravel, yet for many analysts they are a delicious staple of data visualization. This article presents the good, the bad, and the messy about spaghetti

0
How much do New Yorkers tip taxi drivers?

When I read Robert Allison's article about the cost of a taxi ride in New York City, I was struck by the scatter plot (shown at right; click to enlarge) that plots the tip amount against the total bill for 12 million taxi rides. The graph clearly reveals diagonal and

0
Set attributes of markers in PROC SGPLOT by using ODS style elements

The SG procedures in SAS use aesthetically pleasing default colors, shapes, and styles, but sometimes it is necessary to override the default attributes. The MARKERATTRS= option enables you to override the default colors, symbols, and sizes of markers in scatter plots and other graphs. Similarly, the LINEATTRS= option enables you

0
Comparative histograms: Panel and overlay histograms in SAS

You can use histograms to visualize the distribution of data. A comparative histogram enables you to compare two or more distributions, which usually represent subpopulations in the data. Common subpopulations include males versus females or a control group versus an experimental group. There are two common ways to construct a

1 2 3 5