# Author Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

0
Compute the Hoeffding D statistic in SAS

There are many statistics that measure whether two continuous random variables are independent or whether they are related to each other in some way. The most well-known statistic is Pearson's correlation, which is a parametric measure of the linear relationship between two variables. A related measure is Spearman's rank correlation,

0
On passing a list to a SAS/IML module

SAS/IML programmers often create and call user-defined modules. Recall that a module is a user-defined subroutine or function. A function returns a value; a subroutine can change one or more of its input arguments. I have written a complete guide to understanding SAS/IML modules, which contains many tips for working

0
Compute bivariate ranks

Ranking is a fundamental concept in statistics. Ranks of univariate data are used by statisticians to estimate statistics such as percentiles (quantiles) and empirical distributions. A more advanced use is to compute various rank-based measures of correlation or association between pairs of variables. For example, ranks are used to compute

0
Compute tied ranks

The ranks of a set of data values are used in many nonparametric statistics and statistical tests. When you request a statistic or nonparametric test in SAS, the procedure will automatically compute the ranks that are needed. However, sometimes it is useful to know how to compute the ranks yourself.

0
Overlay other graphs on a bar chart with PROC SGPLOT

It can be frustrating to receive an error message from statistical software. In the early days of the SAS statistical graphics (SG) procedures, an error message that I dreaded was ERROR: Attempting to overlay incompatible plot or chart types. This error message appears when you attempt to use PROC SGPLOT

0
3 reasons to prefer a horizontal bar chart

Most introductory statistics courses introduce the bar chart as a way to visualize the frequency (counts) for a categorical variable. A vertical bar chart places the categories along the horizontal (X) axis and shows the counts (or percentages) on the vertical (Y) axis. The vertical bar chart is a precursor

0
Double integrals by using Monte Carlo methods

As mentioned in my article about Monte Carlo estimate of (one-dimensional) integrals, one of the advantages of Monte Carlo integration is that you can perform multivariate integrals on complicated regions. This article demonstrates how to use SAS to obtain a Monte Carlo estimate of a double integral over rectangular and

0
Sample size for the Monte Carlo estimate of an integral

A previous article shows how to use Monte Carlo simulation to estimate a one-dimensional integral on a finite interval. A larger random sample will (on average) result in an estimate that is closer to the true value of the integral than a smaller sample. This article shows how you can

0
Estimate an integral by using Monte Carlo simulation

Numerical integration is important in many areas of applied mathematics and statistics. For one-dimensional integrals on the interval (a, b), SAS software provides two important tools for numerical integration: For common univariate probability distributions, you can use the CDF function to integrate the density, thus obtaining the probability that a

0
Identify influential observations in regression models

A previous article discusses how to interpret regression diagnostic plots that are produced by SAS regression procedures such as PROC REG. In that article, two of the plots indicate influential observations and outliers. Intuitively, an observation is influential if its presence changes the parameter estimates for the regression by "more

0
An overview of regression diagnostic plots in SAS

When you fit a regression model, it is useful to check diagnostic plots to assess the quality of the fit. SAS, like most statistical software, makes it easy to generate regression diagnostics plots. Most SAS regression procedures support the PLOTS= option, which you can use to generate a panel of

0
Control the fill and outline colors of scatter plot markers in SAS

This article shows how to use PROC SGPLOT in SAS to create the scatter plot shown to the right. The scatter plot has the following features: The colors of markers are determined by the value of a third variable. The outline of each marker is the same color (such as

Analytics
0
The Farey sequence

Here is an interesting math question: How many reduced fractions in the interval (0, 1) have a denominator less than 100? The question is difficult is because of the word "reduced." If we only care about the total number of fractions in (0,1) whose denominator is less than 100, we

0
The generalized gamma distribution

A SAS customer wanted to compute the cumulative distribution function (CDF) of the generalized gamma distribution. For any continuous distribution, the CDF is the integral of the probability density function (PDF), which usually has an explicit formula. Accordingly, he wanted to compute the CDF by using the QUAD function in

Programming Tips
0
Pi and products

This is my Pi Day post for 2021. Every year on March 14th (written 3/14 in the US), geeky mathematicians and their friends celebrate "all things pi-related" because 3.14 is the three-decimal approximation to pi. Most years I write about lower-case pi (π), which is the ratio of a circle's

0
The conditional distribution of a response variable

I recently learned about a new feature in PROC QUANTREG that was added in SAS/STAT 15.1 (part of SAS 9.4M6). Recall that PROC QUANTREG enables you to perform quantile regression in SAS. (If you are not familiar with quantile regression, see an earlier article that describes quantile regression and provides

0
Convert decimals to fractions in SAS

I have previously shown how you can use the FRACTw. format in SAS to display numbers as fractions. But did you know that you can also use the format to obtain the numerator and denominator of the fraction as numbers in a program? All you need to do is to

0
Create a wind chill chart in SAS

I recently wrote about a simple statistical formula that approximates the wind chill temperature, which is the cumulative effect of air temperature and wind on the human body. The formula uses two independent variables (air temperature and wind speed) to predict the wind chill temperature. This article describes how to

Programming Tips
0
The wind chill chart

In cold and blustery conditions, the weather forecast often includes two temperatures: the actual air temperature and the wind chill temperature. The wind chill temperature conveys the cumulative effect of air temperature and wind on the human body. The goal of the wind-chill scale is to communicate the effect of

0
How to use the #BYVAR and #BYVAL keywords to customize graph titles in SAS

A previous article describes how to use the SGPANEL procedure to visualize subgroups of data. It focuses on using headers to display information about each graph. In the example, the data are time series for the price of several stocks, and the headers include information about whether the stock price

0
Data-driven titles for graphs

Many characteristics of a graph are determined by the underlying data at run time. A familiar example is when you use colors to indicate different groups in the data. If the data have three groups, you see three colors. If the data have four groups, you see four colors. The

0
Generate all quadratic interactions in a regression model

I've previously written about how to generate all pairwise interactions for a regression model in SAS. For a model that contains continuous effects, the easiest way is to use the EFFECT statement in PROC GLMSELECT to generate second-degree "polynomial effects." However, a SAS programmer was running a simulation study and

0
Print the top rows of your SAS data

One of the first things I learned in SAS was how to use PROC PRINT to display parts of a data set. I usually do not need to see all the data, so my favorite way to use PROC PRINT is to use the OBS= data set option to display

Analytics
0
A matrix is singular if its rows are arithmetic sequences

Look at the following matrices. Do you notice anything that these matrices have in common? If you noticed that the rows of each matrix are arithmetic progressions, good for you. For each row, there is a constant difference (also called the "increment") between adjacent elements. For these examples: In the

0
Generate random points on a sphere

In a previous article, I showed how to generate random points uniformly inside a d-dimensional sphere. In that article, I stated the following fact: If Y is drawn from the uncorrelated multivariate normal distribution, then S = Y / ||Y|| has the uniform distribution on the unit sphere. I was

Programming Tips
0
Gaussian random walks and Levy flights

Imagine an animal that is searching for food in a vast environment where food is scarce. If no prey is nearby, the animal's senses (such as smell and sight) are useless. In that case, a reasonable search strategy is a random walk. The animal can choose a random direction, walk/swim/fly

0
The inverse gamma distribution in SAS

The inverse gamma distribution is a continuous probability distribution that is used in Bayesian analysis and in some statistical models. The inverse gamma distribution is closely related to the gamma distribution. For any probability distribution, it is essential to know how to compute four functions: the PDF function, which returns

Programming Tips
0
How to compute the incomplete gamma function in SAS

Years ago, I wrote about how to compute the incomplete beta function in SAS. Recently, a SAS programmer asked about a similar function, called the incomplete gamma function. The incomplete gamma function is a "special function" that arises in applied math, physics, and statistics. You should not confuse the gamma

0
The stationary block bootstrap in SAS

This is the third and last introductory article about how to bootstrap time series in SAS. In the first article, I presented the simple block bootstrap and discussed why bootstrapping a time series is more complicated than for regression models that assume independent errors. Briefly, when you perform residual resampling

0
The DOLIST syntax: Specify a list of numerical values in SAS

Have you ever heard of the DOLIST syntax? You might know the syntax even if you are not familiar with the name. The DOLIST syntax is a way to specify a list of numerical values to an option in a SAS procedure. Applications include: Specify the end points for bins