Aborting a SAS/IML program upon encountering an error

A colleague sent me an interesting question: What is the best way to abort a SAS/IML program? For example, you might want to abort a program if the data is singular or does not contain a sufficient number of observations or variables. As a first attempt would be to try […]
Post a Comment

How to automatically select a smooth curve for a scatter plot in SAS

My last blog post described three ways to add a smoothing spline to a scatter plot in SAS. I ended the post with a cautionary note: From a statistical point of view, the smoothing spline is less than ideal because the smoothing parameter must be chosen manually by the user. […]
Post a Comment

13 popular articles from 2013

In 2013 I published 110 blog posts. Some of these articles were more popular than others, often because they were linked to from a SAS newsletter such as the SAS Statistics and Operations Research News. In no particular order, here are some of my most popular posts from 2013, organized […]
Post a Comment

How to specify mosaic plot colors in SAS

The mosaic plot is a graphical visualization of a frequency table. In a previous post, I showed how to use the FREQ procedure to create a mosaic plot. This article shows how to create a mosaic plot by using the MOSAICPARM statement in the graph template language (GTL). (The MOSAICPARM […]
Post a Comment

Create mosaic plots in SAS by using PROC FREQ

Mosaic plots (Hartigan and Kleiner, 1981; Friendly, 1994, JASA) are used for exploratory data analysis of categorical data. Mosaic plots have been available for decades in SAS products such as JMP, SAS/INSIGHT, and SAS/IML Studio. However, not all SAS customers have access to these specialized products, so I am pleased […]
Post a Comment

How to order categories in a two-way table with PROC FREQ

If you've ever tried to use PROC FREQ to create a frequency table of two character variables, you know that by default the categories for each variable are displayed in alphabetical order. A different order is sometimes more useful. For example, consider the following two-way table for the smoking status […]
Post a Comment

Output percentiles of multiple variables in a tabular format

A challenge for statistical programmers is getting data into the right form for analysis. For graphing or analyzing data, sometimes the "wide format" (each subject is represented by one row and many variables) is required, but other times the "long format" (observations for each subject span multiple rows) is more […]
Post a Comment

Claiming diversity? Compare with the population!

On Kaiser Fung's Junk Charts blog, he showed a bar chart that was "published by Teach for America, touting its diversity." Kaiser objected to the chart because the bar lengths did not accurately depict the proportions of the Teach for America corps members. The chart bothers me for another reason: […]
Post a Comment

Two applications of the "runs test"

In my last blog post I described how to implement a "runs test" in the SAS/IML language. The runs test determines whether a sequence of two values (for example, heads and tails) is likely to have been generated by random chance. This article describes two applications of the runs test. […]
Post a Comment

How to tell whether a sequence of heads and tails is random

While walking in the woods, a statistician named Goldilocks wanders into a cottage and discovers three bears. The bears, being hungry, threaten to eat the young lady, but Goldilocks begs them to give her a chance to win her freedom. The bears agree. While Mama Bear and Papa Bear block […]
Post a Comment