Blogs

Author

Rick Wicklin
Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Learn SAS | Programming Tips

Rick Wicklin, May 29, 2024

Find the label of a variable in SAS

Sometimes labels for variables get "dropped" during data preparation and cleaning. One example is when data are transposed from "wide form" to "long form." For example, suppose a data set has three variables, X, Y, and Z, each with labels. If you transpose the data to long form, the new

Data Visualization | Programming Tips

Rick Wicklin, May 22, 2024

Create filled density plots in SAS

A SAS programmer wanted to visualize density estimates for some univariate data. The data had several groups, so he wanted to create a panel of density estimates, which you can easily do by using PROC SGPANEL in SAS. However, the programmer's boss wanted to see filled density estimates, such as

Analytics | Learn SAS

Rick Wicklin, May 20, 2024

On the correctness of a discrete simulation

After writing a program that simulates data, it is important to check that the statistical properties of the simulated (synthetic) data match the properties of the model. As a first step, you can generate a large random sample from the model distribution and compare the sample statistics to the expected

Learn SAS | Programming Tips

Rick Wicklin, May 15, 2024

Rank, order, and sorting

A SAS programmer was trying to implement an algorithm in PROC IML in SAS based on some R code he had seen on the internet. The R code used the rank() and order() functions. This led the programmer to ask, "What is the difference between the rank and the order?

Analytics | Programming Tips

Rick Wicklin, May 13, 2024

The distribution of p-values under the null hypothesis

A SAS statistical programmer recently asked a theoretical question about statistics. "I've read that 'p-values are uniformly distributed under the null hypothesis,'" he began, "but what does that mean in practice? Is it important?" I think data simulation is a great way to discuss the conditions for which p-values are

Learn SAS | Programming Tips

Rick Wicklin, May 8, 2024

Dice and the correctness of a simulation

At a recent conference in Las Vegas, a presenter simulated the sum of two dice and used it to simulate the game of craps. I write a lot of simulations, so I'd like to discuss two related topics: How to simulate the sum of two dice in SAS. This is

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin, May 6, 2024

Visualize patterns of missing values

Years ago, I wrote an article that showed how to visualize patterns of missing data. During a recent data visualization talk, I discussed the program, which used a small number of SAS IML statements. An audience member asked whether it is possible to construct the same visualization by using only

Analytics | Learn SAS

Rick Wicklin, May 1, 2024

Estimate a proportion and a confidence interval in SAS

A SAS programmer wanted to estimate a proportion and a confidence interval (CI), but didn't know which SAS procedure to call. He knows a formula for the CI from an elementary statistics textbook. If x is the observed count of events in a random sample of size n, then the

Analytics | Programming Tips

Rick Wicklin, April 29, 2024

Bimodal and unimodal beta distributions

In a recent article, I graphed the PDF of a few Beta distributions that had a variety of skewness and kurtosis values. I thought that I had chosen the parameter values to represent a wide variety of Beta shapes. However, I was surprised to see that the distributions were all

Analytics | Learn SAS | Programming Tips

Rick Wicklin, April 22, 2024

Use the moment-ratio diagram to visualize the sampling distribution of skewness and kurtosis

The moment-ratio diagram is a tool that is useful when choosing a distribution that models a sample of univariate data. As I show in my book (Simulating Data with SAS, Wicklin, 2013), you first plot the skewness and kurtosis of the sample on the moment-ratio diagram to see what common

Analytics | Learn SAS | Programming Tips

Rick Wicklin, April 15, 2024

Distributions with specified skewness and kurtosis

A SAS programmer wanted to simulate samples from a family of Beta(a,b) distributions for a simulation study. (Recall that a Beta random variable is bounded with values in the range [0,1].) She wanted to choose the parameters such that the skewness and kurtosis of the distributions varied over a range of

Analytics | Learn SAS | Programming Tips

Rick Wicklin, April 8, 2024

Improve the Federal Reserve's dot plot

A dot plot is a standard statistical graphic that displays a statistic (often a mean) and the uncertainty of the statistic for one or more groups. Statisticians and data scientists use it in the analysis of group data. In late 2023, I started noticing headlines about "dot plots" in the

Data Visualization | Programming Tips

Rick Wicklin, April 1, 2024

Add a second axis to a SAS graph

Recently, I saw a scatter plot that displayed the ticks, values, and labels for a vertical axis on the right side of a graph. In the SGPLOT procedure in SAS, you can use the Y2AXIS option to move an axis to the right side of a graph. Similarly, you can

Analytics | Learn SAS

Rick Wicklin, March 27, 2024

The likelihood ratio test for linear regression in SAS

A recent article describes how to estimate coefficients in a simple linear regression model by using maximum likelihood estimation (MLE). One of the nice properties of an MLE formulation is that you can compare a large model with a nested submodel in a natural way. For example, if you can

Analytics | Learn SAS | Programming Tips

Rick Wicklin, March 20, 2024

Maximum likelihood estimates for linear regression

A statistical analyst used the GENMOD procedure in SAS to fit a linear regression model. He noticed that the table of parameter estimates has an extra row (labeled "Scale") that is not a regression coefficient. The "scale parameter" is not part of the parameter estimates table produced by PROC REG

Learn SAS | Programming Tips

Rick Wicklin, March 11, 2024

Pizza pi

Happy Pi Day! Every year on March 14th (written 3/14 in the US), people in the mathematical sciences celebrate all things pi-related because 3.14 is the three-digit approximation to π ≈ 3.14159265358979.... Pi is a mathematical constant defined as the ratio of a circle's circumference (C) to its diameter (D).

Learn SAS | Programming Tips

Rick Wicklin, March 6, 2024

A generalized Number-Word Game

I recently wrote about the Number-Word Game, which is an iterative algorithm that generates a sequence of natural numbers by using the lengths of the words for the numbers. In English, the words are "one", "two", "three", and so on. You can play the Number-Word Game in any alphabetic language

Learn SAS | Programming Tips

Rick Wicklin, March 4, 2024

The Number-Word Game

Have you heard about the Number-Word Game? This is a simple game that has the following rules:
1. Start with any positive integer.
2. Write down the English word for the integer.
3. Count the number of letters in the word. This gives a new positive integer.
4. Go to (2). Repeat until a

Data Visualization | Learn SAS

Rick Wicklin, February 28, 2024

Using colors to visualize groups in a bar chart in SAS

I sometimes see analysts overuse colors in statistical graphics. My rule of thumb is that you do not need to use color to represent a variable that is already represented in a graph. For example, it is redundant to use a continuous color ramp to represent the lengths of bars

Analytics | Learn SAS

Rick Wicklin, February 26, 2024

On using flexible distributions to fit data

"With four parameters I can fit an elephant. With five I can make his trunk wiggle." (John von Neumann)

Ever since the dawn of statistics, researchers have searched for the Holy Grail of statistical modeling. Namely, a flexible distribution that can model any continuous univariate data. As the quote

Analytics | Learn SAS | Programming Tips

Rick Wicklin, February 21, 2024

On using the range to estimate the variability of small samples

In statistical quality control, practitioners often estimate the variability of products that are being produced in a manufacturing plant. It is important to estimate the variability as soon as possible, which means trying to obtain an estimate from a small sample. Samples of size five or less are not uncommon

Learn SAS | Programming Tips

Rick Wicklin, February 19, 2024

The linear distribution on an interval

In a recent Monte Carlo project, I needed to simulate numbers on an interval by using a continuous linear probability density function (PDF). An example is shown to the right. In this example, the linear density function is decreasing on the interval, but the function could also be constant or

Analytics | Learn SAS | Programming Tips

Rick Wicklin, February 12, 2024

An exact formula for the sampling distribution of the correlation coefficient

I read a journal article in which a researcher used a formula for the probability density function (PDF) of the sample correlation coefficient. The formula was rather complicated, and presented with no citation, so I was curious to learn more. I found the distribution for the correlation coefficient in the

Learn SAS

Rick Wicklin, February 7, 2024

The elliptical heart

Some hearts are famous. For example, there is the "Heart of Gold" (Neil Young), the "Heart of Glass" (Blondie), and the Heart of Darkness (Joseph Conrad). But have you heard of the "Heart of Ellipses"? No? Well, in 2023, Ted Conway published an amusingly titled article, "Total Ellipse of the

Analytics | Learn SAS

Rick Wicklin, February 5, 2024

Peeling a convex hull

This article looks at a geometric method for estimating the center of a multivariate point cloud. The method is known as convex-hull peeling. In two dimensions, you can perform convex-hull peeling in SAS 9 by using the CVEXHULL function in SAS IML software. For higher dimensions, you can use the CONVEXHULL

Learn SAS | Programming Tips

Rick Wicklin, January 31, 2024

The name of the variable that contains the largest value in each row

A SAS programmer wanted to find the name of the variable for each row that contains the largest value. This task is useful for wide data sets in which each observation has several variables that are measured on the same scale. For example, each observation in the data might represent

Analytics

Rick Wicklin, January 29, 2024

The geometry of Jacobi's method

A colleague remarked that my recent article about using Jacobi's iterative method for solving a linear system of equations "seems like magic." Specifically, it seems like magic that you can solve a certain class of linear systems by using only matrix multiplication. For any initial guess, the iteration converges to

Learn SAS | Programming Tips

Rick Wicklin, January 24, 2024

Implement Jacobi's method in SAS

In a first course in numerical analysis, students often encounter a simple iterative method for solving a linear system of equations, known as Jacobi's method (or Jacobi's iterative method). Although Jacobi's method is not used much in practice, it is introduced because it is easy to explain, easy to implement,

Analytics | Learn SAS

Rick Wicklin, January 22, 2024

Angles vs slopes: The statistics of steepness

There are two popular ways to express the steepness of a line or ray. The most-often used mathematical definition is from high-school math where the slope is defined as "rise over run." A second way is to report the angle of inclination to the horizontal, as introduced in basic trigonometry.

Learn SAS | Programming Tips

Rick Wicklin, January 15, 2024

Simulate correlated continuous and discrete variables

Statistical software provides methods to simulate independent random variates from continuous and discrete distributions. For example, in the SAS DATA step, you can use the RAND function to simulate variates from continuous distributions (such as the normal or lognormal distributions) or from discrete distributions (such as the Bernoulli or Poisson).
