Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Analytics | Data Visualization

Rick Wicklin | February 7, 2018

The distribution of shared birthdays in the Birthday Problem

If N random people are in a room, the classical birthday problem provides the probability that at least two people share a birthday. The birthday problem does not consider how many birthdays are in common. However, a generalization (sometimes called the Multiple-Birthday Problem) examines the distribution of the number of

Programming Tips

Rick Wicklin | February 5, 2018

Simulate the birthday-matching problem

This article simulates the birthday-matching problem in SAS. The birthday-matching problem (also called the birthday problem or birthday paradox) answers the following question: "if there are N people in a room, what is the probability that at least two people share a birthday?" The birthday problem is famous because the
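As a rough illustration of the question posed above, the following Python sketch (not the SAS program from the article) estimates the probability of at least one shared birthday by simulation and compares it with the exact value. It assumes 365 equally likely birthdays and ignores leap years; the function names, trial count, and seed are hypothetical choices made for this example.

```python
import random

def exact_match_probability(n, days=365):
    """P(at least two of n people share a birthday), assuming uniform birthdays."""
    p_no_match = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_no_match *= (days - k) / days
    return 1.0 - p_no_match

def simulate_match_probability(n, trials=20000, days=365, seed=1):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        if len(set(birthdays)) < n:  # a duplicate means a shared birthday
            hits += 1
    return hits / trials

if __name__ == "__main__":
    n = 23
    print(n, exact_match_probability(n), simulate_match_probability(n))
```

With these assumptions, the exact probability first exceeds one half at N = 23, and the simulated estimate should land close to the exact value.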

Data Visualization

A stacked band plot, created in SAS by using PROC SGPLOT

Rick Wicklin | January 31, 2018

Create a stacked band plot in SAS

This article shows how to construct a "stacked band plot" in SAS, as shown to the right. (Click to enlarge.) You are probably familiar with a stacked bar chart in which the cumulative amount of some quantity is displayed by stacking the contributions of several groups. A canonical example is

Learn SAS | Programming Tips

How to generate random numbers in SAS

Rick Wicklin | January 29, 2018

How to use the new random-number generators in SAS

What is a random number generator? What are the random-number generators in SAS, and how can you use them to generate random numbers from probability distributions? In SAS 9.4M5, you can use the STREAMINIT function to select from eight random-number generators (RNGs), including five new RNGs. After choosing an RNG,

Programming Tips

Rick Wicklin | January 24, 2018

Use lists to pass parameters to SAS/IML functions

A popular way to use lists in the SAS/IML language is to pack together several related matrices into a single data structure that can be passed to a function. Imagine that you have written an algorithm that requires a dozen different parameters. Historically, you would have to pass those parameters

Programming Tips

Rick Wicklin | January 22, 2018

Create lists by using a natural syntax in SAS/IML

SAS/IML 14.3 (SAS 9.4M5) introduced a new syntax for creating lists and for assigning and extracting items in a list. Lists (introduced in SAS/IML 14.2) are data structures that are convenient for holding heterogeneous data. A single list can hold character matrices, numeric matrices, scalar values, and other lists, as

Data Visualization

Cost of vaginal delivery and C-section for US states (2015-2016)

Rick Wicklin | January 17, 2018

How much does it cost to give birth in the US?

Money magazine (Jan/Feb 2018) contains an article about how much it costs to give birth in the US. The costs, which are based on insurance data, include prenatal care and hospital delivery but exclude infant care. The data are compiled for each state (including Washington, DC) and by type of

Analytics | Programming Tips

Rick Wicklin | January 15, 2018

Data unavailable? Use the "eyeball distribution" to simulate

Last week I got the following message: Dear Rick: How can I create a normal distribution within a specified range (min and max)? I need to simulate a normal distribution that fits within a specified range. I realize that a normal distribution is by definition infinite... Are there any alternatives,

Learn SAS | Programming Tips

Rick Wicklin | January 10, 2018

10 posts from 2017 that deserve a second look

Last week I wrote about the 10 most popular articles from The DO Loop in 2017. My most popular articles tend to be about elementary statistics or SAS programming tips. Less popular are the articles about advanced statistical and programming techniques. However, these technical articles fill an important niche. Not

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin | January 8, 2018

Label multiple regression lines in SAS

A SAS programmer asked how to label multiple regression lines that are overlaid on a single scatter plot. Specifically, he asked to label the curves that are produced by using the REG statement with the GROUP= option in PROC SGPLOT. He wanted the labels to be the slope and intercept

Learn SAS | Programming Tips

Rick Wicklin | January 3, 2018

The top 10 posts from The DO Loop in 2017

I wrote more than 100 posts for The DO Loop blog in 2017. The most popular articles were about SAS programming tips, statistical data analysis, and simulation and bootstrap methods. Here are the most popular articles from 2017 in each category. General SAS programming techniques INTCK and INTNX: Do you

Analytics | Data Visualization | Learn SAS

Rick Wicklin | December 20, 2017

How to create a sliced fit plot in SAS

I previously showed an easy way to visualize a regression model that has several continuous explanatory variables: use the SLICEFIT option in the EFFECTPLOT statement in SAS to create a sliced fit plot. The EFFECTPLOT statement is directly supported by the syntax of the GENMOD, LOGISTIC, and ORTHOREG procedures in

Analytics | Data Visualization | Learn SAS

Visualize multivariate regression model by slicing the continuous variables. Graph created by using the EFFECTPLOT SLICEFIT statement in SAS.

Rick Wicklin | December 18, 2017

Visualize multivariate regression models by slicing continuous variables

Slice, slice, baby! You've got to slice, slice, baby! When you fit a regression model that has multiple explanatory variables, it is a challenge to effectively visualize the predicted values. This article describes how to visualize the regression model by slicing the explanatory variables. In SAS, you can use the

Learn SAS | Programming Tips

Rick Wicklin | December 13, 2017

How to get the current TITLE in SAS

The SAS language is large. Even after 20+ years of using SAS, there are many features that I have never used. Recently it became necessary for me to learn about DICTIONARY tables in PROC SQL (and the associated SASHELP views) because I needed to programmatically obtain the text for the

Data Visualization | Learn SAS

Self-similar Christmas tree created in SAS

Rick Wicklin | December 11, 2017

A self-similar Christmas tree

Happy holidays to all my readers! My greeting-card to you is an image of a self-similar Christmas tree. The image (click to enlarge) was created in SAS by using two features that I blog about regularly: matrix computations and ODS statistical graphics. Self-similarity in Kronecker products I have previously shown

Analytics

Bias in regression for mean-imputed explanatory variables

Rick Wicklin | December 6, 2017

3 problems with mean imputation

In a previous article, I showed how to use SAS to perform mean imputation. However, there are three problems with using mean-imputed variables in statistical analyses: Mean imputation reduces the variance of the imputed variables. Mean imputation shrinks standard errors, which invalidates most hypothesis tests and the calculation of confidence

Programming Tips

Rick Wicklin | December 4, 2017

Mean imputation in SAS

Imputing missing data is the act of replacing missing data by nonmissing values. Mean imputation replaces missing data in a numerical variable by the mean value of the nonmissing values. This article shows how to perform mean imputation in SAS. It also presents three statistical drawbacks of mean imputation. How
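The definition of mean imputation above can be sketched in a few lines. This is a minimal Python illustration, not the SAS approach from the article; it assumes missing values are represented by `None`, and the function name `mean_impute` and the sample data are hypothetical.

```python
from statistics import mean, variance

def mean_impute(values):
    """Replace None entries by the mean of the non-missing entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: every value is missing")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical data with two missing values.
data = [1.0, 2.0, 3.0, None, None]
imputed = mean_impute(data)

# Compare the sample variance before and after imputation.
var_observed = variance([v for v in data if v is not None])
var_imputed = variance(imputed)

if __name__ == "__main__":
    print(imputed, var_observed, var_imputed)
```

Because every imputed value sits exactly at the mean, the sample variance of the imputed variable is smaller than the variance of the observed values, which is one of the drawbacks usually attributed to this method.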

Data Visualization | Programming Tips

Rick Wicklin | November 29, 2017

Visualize patterns of missing values

Missing values present challenges for the statistical analyst and data scientist. Many modeling techniques (such as regression) exclude observations that contain missing values, which can reduce the sample size and reduce the power of a statistical analysis. Before you try to deal with missing values in an analysis (for example,

Programming Tips

Histogram of data overlaid with a beta density curve, fitted by maximum likelihood estimation

Rick Wicklin | November 27, 2017

The method of moments: A smart way to choose initial parameters for MLE

When you run an optimization, it is often not clear how to provide the optimization algorithm with an initial guess for the parameters. A good guess converges quickly to the optimal solution whereas a bad guess might diverge or require many iterations to converge. Many people use a default value

Programming Tips

Beta-binomial cumulative distribution

Rick Wicklin | November 22, 2017

Compute the CDF and quantiles of discrete distributions

A statistical programmer read my article about the beta-binomial distribution and wanted to know how to compute the cumulative distribution (CDF) and the quantile function for this distribution. In general, if you know the PDF for a discrete distribution, you can also compute the CDF and quantile functions. This article
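The general idea stated above, that the CDF and quantile function of a discrete distribution follow from its PDF (probability mass function), can be sketched as follows. This Python example is an assumption-laden illustration, not the article's SAS code: it uses an ordinary binomial distribution instead of the beta-binomial, and all function names are hypothetical.

```python
from itertools import accumulate
from math import comb

def binomial_pmf(n, p):
    """Probabilities P(X = k) for k = 0..n of a Binomial(n, p) variable."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def cdf_from_pmf(pmf):
    """CDF values P(X <= k), obtained as the cumulative sum of the PMF."""
    return list(accumulate(pmf))

def quantile_from_cdf(cdf, prob):
    """Smallest k such that P(X <= k) >= prob."""
    for k, c in enumerate(cdf):
        if c >= prob:
            return k
    return len(cdf) - 1  # guard against round-off when prob is close to 1

if __name__ == "__main__":
    cdf = cdf_from_pmf(binomial_pmf(4, 0.5))
    print(cdf)
    print(quantile_from_cdf(cdf, 0.5))
```

The same two helper functions would apply to any discrete distribution on 0..n once its probabilities are available as a list.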

Analytics | Programming Tips

Beta-binomial distribution and expected values in SAS

Rick Wicklin | November 20, 2017

Simulate data from the beta-binomial distribution in SAS

This article shows how to simulate beta-binomial data in SAS and how to compute the density function (PDF). The beta-binomial distribution is a discrete compound distribution. The "binomial" part of the name means that the discrete random variable X follows a binomial distribution with parameters N (number of trials) and

Learn SAS | Programming Tips

Rick Wicklin | November 15, 2017

Catch run-time errors in SAS/IML programs

Did you know that a SAS/IML function can recover from a run-time error? You can specify how to handle run-time errors by using a programming technique that is similar to the modern "try-catch" technique, although the SAS/IML technique is an older implementation. Preventing errors versus handling errors In general, SAS/IML

Programming Tips

The PAUSE statement as a debugging tool in SAS/IML Studio

Rick Wicklin | November 13, 2017

A tip for debugging SAS/IML modules: The PAUSE statement

Debugging is the bane of every programmer. SAS supports a DATA step debugger, but that debugger can't be used for debugging SAS/IML programs. In lieu of a formal debugger, many SAS/IML programmers resort to inserting multiple PRINT statements into a function definition. However, there is an easier way to query

Learn SAS | Programming Tips

Rick Wicklin | November 8, 2017

How to format rows of a table in SAS

A SAS programmer wanted to display a table in which the rows have different formats. An example is shown below. The programmer wanted columns that represent statistics and rows that represent variables. She wanted to display formats (such as DOLLAR) for some variables—but only for certain statistics. For example, the

Programming Tips

Rick Wicklin | November 6, 2017

What is a factoid in SAS?

Have you ever seen the "Fit Summary" table from PROC LOESS, as shown to the right? Or maybe you've seen the "Model Information" table that is displayed by some SAS analytical procedures? These tables provide brief interesting facts about a statistical procedure, hence they are called factoids. In SAS, a

Programming Tips

Rick Wicklin | November 1, 2017

Evaluate a function by using the function name in SAS/IML

A SAS/IML programmer asked whether you can pass the name of a function as an argument to a SAS/IML module and have the module call the function that is passed in. The answer is "yes." The basic idea is to create a string that represents the function call and then

Data Visualization | Programming Tips

Rick Wicklin | October 30, 2017

A SAS programming technique to modify ODS templates

This article demonstrates a SAS programming technique that I call Kuhfeld's template modification technique. The technique enables you to dynamically modify an ODS template and immediately call the modified template to produce a new graph or table. By following the five steps in this article, you can implement the technique

Analytics

Principal component regression in SAS: Loadings plot

Rick Wicklin | October 25, 2017

Should you use principal component regression?

This article describes the advantages and disadvantages of principal component regression (PCR). This article also presents alternative techniques to PCR. In a previous article, I showed how to compute a principal component regression in SAS. Recall that principal component regression is a technique for handling near collinearities among the regression

Analytics | Learn SAS

Principal component regression in SAS: Loadings plot

Rick Wicklin | October 23, 2017

Principal component regression in SAS

A common question on discussion forums is how to compute a principal component regression in SAS. One reason people give for wanting to run a principal component regression is that the explanatory variables in the model are highly correlated with each other, a condition known as multicollinearity. Although principal component

Analytics | Learn SAS

Rick Wicklin | October 18, 2017

The diffogram and other graphs for multiple comparisons of means

In a previous article, I discussed the lines plot for multiple comparisons of means. Another graph that is frequently used for multiple comparisons is the diffogram, which indicates whether the pairwise differences between means of groups are statistically significant. This article discusses how to interpret a diffogram. Two related plots
