# Author: Rick Wicklin, Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of PROC IML and SAS/IML Studio. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is the author of the books *Statistical Programming with SAS/IML Software* and *Simulating Data with SAS*.

Fast simulation of multivariate normal data with an AR(1) correlation structure

It is sometimes necessary for researchers to simulate data with thousands of variables. It is easy to simulate thousands of uncorrelated variables, but more difficult to simulate thousands of correlated variables. For that, you can generate a correlation matrix that has special properties, such as a Toeplitz matrix or a
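As a rough illustration of the AR(1) structure named in the title, here is a minimal sketch in Python rather than SAS/IML. It assumes the usual definition in which the correlation between variables i and j is rho^|i-j|, which gives a Toeplitz matrix. The function names `ar1_corr` and `simulate_ar1_row` are hypothetical, and the recursion shown is one common way to draw such data without forming the matrix; it is not necessarily the method the article uses.

```python
import math
import random

def ar1_corr(p, rho):
    """Return the p x p AR(1) correlation matrix with R[i][j] = rho**|i-j|."""
    return [[rho ** abs(i - j) for j in range(p)] for i in range(p)]

def simulate_ar1_row(p, rho, rng):
    """Draw one observation of p unit-variance normal variables whose
    correlation is rho**|i-j|, using the recursion
    x[k] = rho*x[k-1] + sqrt(1 - rho^2)*z[k] with z[k] ~ N(0,1)."""
    x = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - rho * rho)
    for _ in range(1, p):
        x.append(rho * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x

R = ar1_corr(4, 0.5)                      # small matrix for inspection
rng = random.Random(1)                    # seeded for reproducibility
row = simulate_ar1_row(1000, 0.5, rng)    # one observation, 1000 variables
```

The recursion costs O(p) per observation, which is why it scales to thousands of variables.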

Programming Tips
Chi-square tests for proportions in one-way tables

Programmers on a SAS discussion forum recently asked about the chi-square test for proportions as implemented in PROC FREQ in SAS. One person asked the basic question, "how do I test the null hypothesis that the observed proportions are equal to a set of known proportions?" Another person said that
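For readers who want to see the arithmetic behind a test of specified proportions, the following is a minimal sketch in Python rather than SAS. It computes the Pearson chi-square statistic, the sum of (observed - expected)^2 / expected over the categories. The function name `chisq_stat` and the counts are hypothetical; the p-value step is omitted.

```python
def chisq_stat(observed, null_props):
    """Pearson chi-square statistic for a one-way table.

    observed   : list of category counts
    null_props : hypothesized proportions (assumed to sum to 1)
    Returns (statistic, degrees of freedom).
    """
    n = sum(observed)
    expected = [n * p for p in null_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df

# Hypothetical counts for three categories, tested against 25% / 50% / 25%
stat, df = chisq_stat([30, 50, 20], [0.25, 0.50, 0.25])
```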

Programming Tips
Radial basis functions and Gaussian kernels in SAS

A radial basis function is a scalar function that depends on the distance to some point, called the center point, c. One popular radial basis function is the Gaussian kernel φ(x; c) = exp(-||x – c||² / (2σ²)), which uses the squared distance from a vector x to the
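The Gaussian kernel formula above can be evaluated directly. This is a minimal sketch in Python rather than SAS/IML; the function name `gaussian_kernel` and the default bandwidth are assumptions made for illustration.

```python
import math

def gaussian_kernel(x, c, sigma=1.0):
    """Evaluate exp(-||x - c||^2 / (2*sigma^2)) for vectors x and c."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

at_center = gaussian_kernel([1.0, 2.0], [1.0, 2.0])           # distance 0
one_sigma = gaussian_kernel([3.0, 4.0], [0.0, 0.0], sigma=5)  # distance 5 = sigma
```

The kernel equals 1 at the center and decays with distance; at a distance of one σ it equals exp(-1/2).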

Programming Tips
How many perfect riffle shuffles are required to restore a deck to its initial order?

Last week I compared the overhand shuffle to the riffle shuffle. I used random operations to simulate both kinds of shuffles and then compared how well they mix cards. The article caused one of my colleagues and fellow bloggers, Rob Pratt, to ask if I was familiar with a bit of
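The question in the title can be explored by brute force. The sketch below is in Python rather than SAS/IML, and it assumes a perfect "out-shuffle": the deck is cut exactly in half and the halves are interleaved so that the original top card stays on top. The function names are hypothetical.

```python
def out_shuffle(deck):
    """Perfect riffle (out-shuffle): cut the deck in half and interleave,
    keeping the original top card on top. Assumes an even number of cards."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    shuffled = []
    for a, b in zip(top, bottom):
        shuffled.extend((a, b))
    return shuffled

def shuffles_to_restore(n_cards=52):
    """Count the perfect out-shuffles needed to return to the initial order."""
    original = list(range(n_cards))
    deck = out_shuffle(original)
    count = 1
    while deck != original:
        deck = out_shuffle(deck)
        count += 1
    return count

n_shuffles = shuffles_to_restore(52)
```

Under this assumption, a 52-card deck returns to its initial order after 8 shuffles. The count differs for an "in-shuffle", in which the original top card moves to the second position.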

Shuffling smackdown: Overhand shuffle versus riffle shuffle

Every day I’m shufflin'. Shufflin', shufflin'. -- "Party Rock Anthem," LMFAO

The most popular way to mix a deck of cards is the riffle shuffle, which separates the deck into two pieces and interleaves the cards from each piece. Besides being popular with card players, the riffle shuffle is

Programming Tips
Linearly spaced vectors in SAS

The SAS/IML language and the MATLAB language are similar. Both provide a natural syntax for performing high-level computations on vectors and matrices, including basic linear algebra subroutines. Sometimes a SAS programmer will convert an algorithm from MATLAB into SAS/IML. Because the languages are not identical, I am sometimes asked, "what
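To make "linearly spaced vector" concrete, here is a minimal sketch in Python rather than SAS/IML or MATLAB. The function name `linspace` is used only for illustration, and the sketch requires at least two points so that the spacing is well defined.

```python
def linspace(a, b, n):
    """Return n evenly spaced points on the closed interval [a, b]."""
    if n < 2:
        raise ValueError("n must be at least 2 in this sketch")
    step = (b - a) / (n - 1)
    points = [a + i * step for i in range(n)]
    points[-1] = float(b)   # pin the endpoint to avoid round-off drift
    return points

grid = linspace(0, 1, 5)
```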

Analytics
Two interfaces for typing text by using a TV remote control

Have you ever tried to type a movie title by using a TV remote control? Both Netflix and Amazon Video provide an interface (a virtual keyboard) that enables you to use the four arrow keys of a standard remote control to type letters. The letters are arranged in a regular

Programming Tips
Distances on rectangular grids

Given a rectangular grid with unit spacing, what is the expected distance between two random vertices, where distance is measured in the L1 metric? (Here "random" means "uniformly at random.") I recently needed this answer for some small grids, such as the one to the right, which is a 7 x 6
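For small grids, the expected L1 distance can be found by enumerating every pair of vertices. The sketch below is in Python rather than SAS/IML and makes two assumptions: the grid is described by its number of vertices in each direction, and the two vertices are drawn independently, so they may coincide. The function names are hypothetical. A closed form that follows from the one-dimensional case is included as a check.

```python
from fractions import Fraction
from itertools import product

def expected_l1_distance(n_cols, n_rows):
    """Exact expected L1 distance between two independent, uniformly
    chosen vertices of an n_cols x n_rows grid of vertices."""
    vertices = list(product(range(n_cols), range(n_rows)))
    total = sum(abs(x1 - x2) + abs(y1 - y2)
                for (x1, y1), (x2, y2) in product(vertices, repeat=2))
    return Fraction(total, len(vertices) ** 2)

def expected_l1_closed_form(n_cols, n_rows):
    """Closed form: E|X1 - X2| = (n^2 - 1)/(3n) for X uniform on n points,
    and the L1 distance is the sum of the two coordinate distances."""
    return (Fraction(n_cols ** 2 - 1, 3 * n_cols)
            + Fraction(n_rows ** 2 - 1, 3 * n_rows))

small = expected_l1_distance(3, 3)
```

If the two vertices are required to be distinct, divide the same total by N(N-1) instead of N², where N is the number of vertices.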

Programming Tips
The continued fraction representation of a rational number

Continued fractions show up in surprising places. They are used in the numerical approximations of certain functions, including the evaluation of the normal cumulative distribution function (normal CDF) for large values of x (El-bolkiny, 1995, p. 75-77) and in approximating the Lambert W function, which has applications in the modeling
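The continued fraction of a rational number comes from repeated integer division, which is the Euclidean algorithm. This is a minimal sketch in Python rather than SAS/IML; the function name `continued_fraction` is hypothetical and a positive denominator is assumed.

```python
def continued_fraction(num, den):
    """Return the terms [a0; a1, a2, ...] of the continued fraction
    of num/den, computed with the Euclidean algorithm (den > 0)."""
    terms = []
    while den != 0:
        q, r = divmod(num, den)   # num = q*den + r
        terms.append(q)
        num, den = den, r         # continue with the reciprocal of r/den
    return terms

terms = continued_fraction(415, 93)   # 415/93 = 4 + 1/(2 + 1/(6 + 1/7))
```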

Programming Tips
Store vectors of different lengths in a matrix

In the SAS/IML language, you can only concatenate vectors that have conforming dimensions. For example, to horizontally concatenate two vectors X and Y, the symbols X and Y must have the same number of rows. If not, the statement Z = X || Y will produce an error: ERROR: Matrices

Analytics
Kernel regression in SAS

A SAS programmer recently asked me how to compute a kernel regression in SAS. He had read my blog posts "What is loess regression" and "Loess regression in SAS/IML" and was trying to implement a kernel regression in SAS/IML as part of a larger analysis. This article explains how to

Programming Tips
On the assumptions (and misconceptions) of linear regression

A frequent topic on SAS discussion forums is how to check the assumptions of an ordinary least squares linear regression model. Some posts indicate misconceptions about the assumptions of linear regression. In particular, I see incorrect statements such as the following: Help! A histogram of my variables shows that they

Standardized regression coefficients

A SAS programmer recently asked how to interpret the "standardized regression coefficients" as computed by the STB option on the MODEL statement in PROC REG and other SAS regression procedures. The SAS documentation for the STB option states, "a standardized regression coefficient is computed by dividing a parameter estimate by

Programming Tips
Calculators killed the standard statistical table

Video killed the radio star.... We can't rewind, we've gone too far. -- The Buggles (1979)

"You kids have it easy," my father used to tell me. "When I was a kid, I didn't have all the conveniences you have today." He's right, and I could say the same

Optimization with nonlinear constraints in SAS

This article shows how to perform an optimization in SAS when the parameters are restricted by nonlinear constraints. In particular, it solves an optimization problem where the parameters are constrained to lie in the annular region between two circles. The end of the article shows the path of partial solutions

A quantile regression analysis of chess ratings by age

My colleague, Robert Allison, recently published an interesting visualization of the relationship between chess ratings and age. His post was inspired by the article "Age vs Elo — Your battle against time," which was published on the chess.com website. ("Elo" is one of the rating systems in chess.) Robert Allison's

Plot curves for levels of two categorical variables in SAS

The SGPLOT procedure in SAS makes it easy to create graphs that overlay various groups in the data. Many statements support the GROUP= option, which specifies that the graph should overlay group information. For example, you can create side-by-side bar charts and box plots, and you can overlay multiple scatter

How to score and graph a quantile regression model in SAS

This article shows how to score (evaluate) a quantile regression model on new data. SAS supports several procedures for quantile regression, including the QUANTREG, QUANTSELECT, and HPQUANTSELECT procedures. The first two procedures do not support any of the modern methods for scoring regression models, so you must use the "missing

Which variables are in the final selected model?

When you use a regression procedure in SAS that supports variable selection (GLMSELECT or QUANTSELECT), did you know that the procedures automatically produce a macro variable that contains the names of the selected variables? This article provides examples and details. A previous article provides an overview of the 'SELECT' procedures

Meaningful names for columns of a design matrix

A programmer recently asked a question on a SAS discussion forum about design matrices for categorical variables. He had generated a design matrix by using PROC GLMMOD and wanted to use the design columns in a subsequent procedure. However, the columns were named COL1, COL2, COL3,..., so he couldn't tell

Programming Tips
Color cells in a mosaic plot by deviation from independence

Back in SAS 9.3M2 (SAS/STAT 12.1), PROC FREQ introduced mosaic plots to visualize the joint frequencies in a contingency table. By default, the cells in a mosaic plot are colored according to levels of one of the categorical variables in the analysis. However, in 2013 I showed how you can

How to use the %BOOT and %BOOTCI macros in SAS

Since the late 1990s, SAS has supplied macros for basic bootstrap and jackknife analyses. This article provides an example that shows how to use the %BOOT and %BOOTCI macros. The %BOOT macro generates a bootstrap distribution and computes basic statistics about the bootstrap distribution, including estimates of bias, standard error,

Programming Tips
Balanced bootstrap resampling in SAS

This article shows how to implement balanced bootstrap sampling in SAS. The basic bootstrap samples with replacement from the original data (N observations) to obtain B new samples. This is called "uniform" resampling because each observation has a uniform probability of 1/N of being selected at each step of the
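In contrast to uniform resampling, a balanced bootstrap makes each observation appear exactly B times across the B samples. One standard way to get this is to concatenate B copies of the data, permute the result, and cut it into B blocks of N. The sketch below is in Python rather than SAS and is not necessarily how the article implements it; the function name `balanced_bootstrap` and the data are hypothetical.

```python
import random

def balanced_bootstrap(data, n_samples, rng):
    """Return n_samples bootstrap samples of size len(data) such that each
    observation appears exactly n_samples times across all samples."""
    pool = list(data) * n_samples   # B copies of the N observations
    rng.shuffle(pool)               # random permutation of the N*B values
    n = len(data)
    return [pool[i * n:(i + 1) * n] for i in range(n_samples)]

data = [1, 2, 3, 4, 5]              # hypothetical data with distinct values
samples = balanced_bootstrap(data, 10, random.Random(0))
```

Individual samples can still contain repeated values; only the overall frequencies are balanced.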

Offset regions: Find all points within a specified distance from a polygon

My colleague Robert Allison recently blogged about using the diameter of Texas as a unit of measurement. The largest distance across Texas is about 801 miles, so Robert wanted to find the set of all points such that the distance from the point to Texas is less than or equal

Programming Tips
The probability that two random chords of a circle intersect

In a previous article, I showed how to find the intersection (if it exists) between two line segments in the plane. There are some fun problems in probability theory that involve intersections of line segments. One is "What is the probability that two randomly chosen chords of a circle intersect?"
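The answer depends on what "randomly chosen" means. The Monte Carlo sketch below, in Python rather than SAS/IML, assumes one specific model: each chord is defined by two endpoints chosen independently and uniformly on the circle. The function names are hypothetical. Under this model, two chords cross exactly when one endpoint of the second chord lies on the arc between the endpoints of the first and the other endpoint does not.

```python
import math
import random

def chords_intersect(a, b, c, d):
    """Chords with endpoint angles (a, b) and (c, d) cross if and only if
    exactly one of c and d lies on the arc between a and b."""
    lo, hi = min(a, b), max(a, b)
    return (lo < c < hi) != (lo < d < hi)

def estimate_intersection_prob(n_trials, rng):
    """Monte Carlo estimate of P(two random chords intersect), assuming
    chord endpoints are independent and uniform on the circle."""
    hits = 0
    for _ in range(n_trials):
        a, b, c, d = (rng.uniform(0.0, 2.0 * math.pi) for _ in range(4))
        hits += chords_intersect(a, b, c, d)
    return hits / n_trials

estimate = estimate_intersection_prob(100_000, random.Random(12345))
```

For this endpoint model the exact probability is 1/3, because of the three ways to pair four points on a circle into two chords, only one produces crossing chords. Other ways of choosing a random chord can give different answers.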

The intersection of two line segments

Back in high school, you probably learned to find the intersection of two lines in the plane. The intersection requires solving a system of two linear equations. There are three cases: (1) the lines intersect in a unique point, (2) the lines are parallel and do not intersect, or (3)
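For line segments, the same 2 x 2 linear system appears in parametric form: solve p1 + s(p2 - p1) = q1 + t(q2 - q1) and accept the solution only if both s and t lie in [0, 1]. The following is a minimal sketch in Python rather than SAS/IML. The function name `segment_intersection` is hypothetical, and for simplicity the sketch returns `None` for parallel segments, including collinear segments that overlap.

```python
def segment_intersection(p1, p2, q1, q2, eps=1e-12):
    """Return the intersection point of segments p1-p2 and q1-q2,
    or None if they do not meet in a unique point."""
    d1 = (p2[0] - p1[0], p2[1] - p1[1])   # direction of first segment
    d2 = (q2[0] - q1[0], q2[1] - q1[1])   # direction of second segment
    r = (q1[0] - p1[0], q1[1] - p1[1])

    det = d1[0] * d2[1] - d1[1] * d2[0]   # zero when the segments are parallel
    if abs(det) < eps:
        return None

    # Cramer's rule for p1 + s*d1 = q1 + t*d2
    s = (r[0] * d2[1] - r[1] * d2[0]) / det
    t = (r[0] * d1[1] - r[1] * d1[0]) / det
    if 0.0 <= s <= 1.0 and 0.0 <= t <= 1.0:
        return (p1[0] + s * d1[0], p1[1] + s * d1[1])
    return None

crossing = segment_intersection((0, 0), (1, 1), (0, 1), (1, 0))
```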

Compute derivatives for nonparametric regression models

SAS enables you to evaluate a regression model at any location within the range of the data. However, sometimes you might be interested in how the predicted response is increasing or decreasing at specified locations. You can use finite differences to compute the slope (first derivative) of a regression model.
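The finite-difference idea can be sketched as follows, in Python rather than SAS. The function `predict` is a hypothetical stand-in for scoring a fitted model at a location x, and the step size h is an assumption that should be tuned to the scale of the data.

```python
def central_difference(f, x, h=1e-5):
    """Approximate the slope f'(x) with the central difference
    (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def predict(x):
    """Hypothetical stand-in for evaluating a fitted regression model."""
    return x ** 2

slope = central_difference(predict, 3.0)   # approximately 6 for this stand-in
```

A positive value means the predicted response is increasing at that location, and a negative value means it is decreasing.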

Data Visualization
Ranking US presidents

Which president of the United States is ranked the greatest by presidential historians? This article visualizes the results of the 2018 Presidential Greatness Survey, which was created and administered by B. Rottinghaus and J. Vaughn. They analyzed 166 responses from experts in political science who ranked the 44 US presidents

Reduced models: A way to choose initial parameters for a mixed model

This article describes how to obtain an initial guess for nonlinear regression models, especially nonlinear mixed models. The technique is to first fit a simpler fixed-effects model by replacing the random effects with their expected values. The parameter estimates for the fixed-effects model are often good initial guesses for the

Analytics
Use a grid search to find initial parameter values for regression models in SAS

When you fit nonlinear fixed-effect or mixed models, it is difficult to guess the model parameters that fit the data. Yet, most nonlinear regression procedures (such as PROC NLIN and PROC NLMIXED in SAS) require that you provide a good guess! If your guess is not good, the fitting algorithm,