## Tag: Statistical Graphics

0
Truncate response surfaces

An analyst was using SAS to analyze some data from an experiment. He noticed that the response variable is always positive (such as volume, size, or weight), but his statistical model predicts some negative responses. He posted the data and asked if it is possible to modify the graph so

0
Use PROC BOXPLOT to display hundreds of box plots

A previous article shows how to use a scatter plot to visualize the average SAT scores for all high schools in North Carolina. The schools are grouped by school districts and ranked according to the median value of the schools in the district. For the school districts that have many

0
Visualize SAT scores in North Carolina

Standardized tests like the SAT and ACT can cause stress for both high school students and their parents, but according to a Wall Street Journal article, the SAT and ACT "provide an invaluable measure of how students are likely to perform in college and beyond." Naturally, students wonder how their

0
3 ways to create nested box plots in SAS

Box plots are a great way to compare the distributions of several subpopulations of your data. For example, box plots are often used in clinical studies to visualize the response of patients in various cohorts. This article describes three techniques to visualize responses when the cohorts have a nested or

0
How to align the Y and Y2 axes in PROC SGPLOT

When you overlay two series in PROC SGPLOT, you can either plot both series on the same axis or you can assign one series to the main axis (Y) and another to a secondary axis (Y2). If you use the Y and Y2 axes, they are scaled independently by default,

0
10 posts from 2018 that deserve a second look

Numbers don't lie, but sometimes they don't reveal the full story. Last week I wrote about the most popular articles from The DO Loop in 2018. The popular articles are inevitably about elementary topics in SAS programming or statistics because those topics have broad appeal. However, I also write about

0
Top posts from The DO Loop in 2018

Last year, I wrote more than 100 posts for The DO Loop blog. Of these, the most popular articles were about data visualization, SAS programming tips, and statistical data analysis. Here are the most popular articles from 2018 in each category. Data Visualization Visualize repetition in song lyrics: In one

0
Visualize a mixed model that has repeated measures or random coefficients

I regularly see questions on a SAS discussion forum about how to visualize the predicted values for a mixed model that has at least one continuous variable, a categorical variable, and possibly an interaction term. SAS procedures such as GLM, GENMOD, and LOGISTIC can automatically produce plots of the predicted

0
5 tips for customizing legends in PROC SGPLOT in SAS

When a graph includes several markers or line styles, it is often useful to create a legend that explains the relationship between the data and the symbols, color, and line styles in the graph. The SGPLOT procedure does a good job of automatically creating and placing a legend for most

0
A funnel plot for immunization rates

Last week my colleague, Robert Allison, visualized data regarding immunization rates for kindergarten classes in North Carolina. One of his graphs was a scatter plot that displayed the proportion of unimmunized students versus the size of the class for 1,885 kindergarten classes in NC. This scatter plot is the basis

0
Define custom color ramps by using the RANGEATTRMAP statement in GTL

The SGPLOT procedure enables you to use the value of a response variable to color markers or areas in a graph. For example, you can use the COLORRESPONSE= option to define a variable whose values will be used to color markers in a scatter plot or cells in a heat

Programming Tips
0
Visualize the feasible region for a constrained optimization

When solving optimization problems, it is harder to specify a constrained optimization than an unconstrained one. A constrained optimization requires that you specify multiple constraints. One little typo or a missing minus sign can result in an infeasible problem or a solution that is unrelated to the true problem. This

0
A trick to plot groups in PROC SGPLOT

A useful feature in PROC SGPLOT is the ability to easily visualize subgroups of data. Most statements in the SGPLOT procedure support a GROUP= option that enables you to overlay plots of subgroups. When you use the GROUP= option, observations are assigned attributes (colors, line patterns, symbols, ...) that indicate

Programming Tips
0
Chi-square tests for proportions in one-way tables

Programmers on a SAS discussion forum recently asked about the chi-square test for proportions as implemented in PROC FREQ in SAS. One person asked the basic question, "how do I test the null hypothesis that the observed proportions are equal to a set of known proportions?" Another person said that

Programming Tips
0
Distances on rectangular grids

Given a rectangular grid with unit spacing, what is the expected distance between two random vertices, where distance is measured in the L1 metric? (Here "random" means "uniformly at random.") I recently needed this answer for some small grids, such as the one to the right, which is a 7 x 6

0
Plot curves for levels of two categorical variables in SAS

The SGPLOT procedure in SAS makes it easy to create graphs that overlay various groups in the data. Many statements support the GROUP= option, which specifies that the graph should overlay group information. For example, you can create side-by-side bar charts and box plots, and you can overlay multiple scatter

0
How to score and graph a quantile regression model in SAS

This article shows how to score (evaluate) a quantile regression model on new data. SAS supports several procedures for quantile regression, including the QUANTREG, QUANTSELECT, and HPQUANTSELECT procedures. The first two procedures do not support any of the modern methods for scoring regression models, so you must use the "missing

Data Visualization
0
Ranking US presidents

Which president of the United States is ranked the greatest by presidential historians? This article visualizes the results of the 2018 Presidential Greatness Survey, which was created and administered by B. Rottinghaus and J. Vaughn. They analyzed 166 responses from experts in political science who ranked the 44 US presidents

0
Attrs, attrs, everywhere: The interaction between ATTRPRIORITY, CYCLEATTRS, and STYLEATTRS in ODS graphics

If you use PROC SGPLOT to create ODS graphics, "ATTRS" are everywhere. ATTRS is an abbreviation of "attributes." Most options that change the attributes of a graphical element end with the ATTRS suffix. For example, the MARKERATTRS option modifies attributes of markers, the LINEATTRS option modifies attributes of lines, and

0
Use a fringe plot to visualize binary data in logistic models

A previous article showed how to use a calibration plot to visualize the goodness-of-fit for a logistic regression model. It is common to overlay a scatter plot of the binary response on a predicted probability plot (below, left) and on a calibration plot (below, right): The SAS program that creates

Data Visualization
0
A butterfly plot for comparing distributions

This article shows how to construct a butterfly plot in SAS. A butterfly plot (also called a butterfly chart) is a comparative bar chart or histogram that displays the distribution of a variable for two subpopulations. A butterfly plot for the cholesterol readings of 5,057 patients in a medical study

0
Decile calibration plots in SAS

In my article about how to construct calibration plots for logistic regression models in SAS, I mentioned that there are several popular variations of the calibration plot. The previous article showed how to construct a loess-based calibration curve. Austin and Steyerberg (2013) recommend the loess-based curve on the basis of

0
Calibration plots in SAS

A logistic regression model is a way to predict the probability of a binary response based on values of explanatory variables. It is important to be able to assess the accuracy of a predictive model. This article shows how to construct a calibration plot in SAS. A calibration plot is

0
Order variables in a heat map or scatter plot matrix

Order matters. When you create a graph that has a categorical axis (such as a bar chart), it is important to consider the order in which the categories appear. Most software defaults to alphabetical order, which typically gives no insight into how the categories relate to each other. Alphabetical order

Data Visualization
0
The 80-20 rule for blogs

You've probably heard about the "80-20 Rule," which describes many natural and manmade phenomena. This rule is sometimes called the "Pareto Principle" because it was discovered by Vilfredo Pareto (1848–1923) who used it to describe the unequal distribution of wealth. Specifically, in his study, 80% of the wealth was held

0
A zipper plot for visualizing coverage probability in simulation studies

Simulation studies are used for many purposes, one of which is to examine how distributional assumptions affect the coverage probability of a confidence interval. This article describes the "zipper plot," which enables you to compare the coverage probability of a confidence interval when the data do or do not follow

Data Visualization
0
Merged legends: Overlay a symbol and line in a legend item

Did you know that SAS can combine or "merge" a symbol and a line pattern into a single legend item, as shown below? This kind of legend is useful when you are overlaying a group of curves onto a scatter plot. It enables the reader to quickly associate values of

Data Visualization
0
Create a stacked band plot in SAS

This article shows how to construct a "stacked band plot" in SAS, as shown to the right. (Click to enlarge.) You are probably familiar with a stacked bar chart in which the cumulative amount of some quantity is displayed by stacking the contributions of several groups. A canonical example is

Data Visualization
0
How much does it cost to give birth in the US?

Money magazine (Jan/Feb 2018) contains an article about how much it costs to give birth in the US. The costs, which are based on insurance data, include prenatal care and hospital delivery but exclude infant care. The data are compiled for each state (including Washington, DC) and by type of

0
10 posts from 2017 that deserve a second look

Last week I wrote about the 10 most popular articles from The DO Loop in 2017. My most popular articles tend to be about elementary statistics or SAS programming tips. Less popular are the articles about advanced statistical and programming techniques. However, these technical articles fill an important niche. Not

1 2 3 6