Blogs

Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Rick Wicklin | April 13, 2015

DO loop = 1 TO 600;

Today is my 600th blog post for The DO Loop. I have written about many topics that are related to statistical programming, math, statistics, simulation, numerical analysis, matrix computations, and more. The right sidebar of my blog contains a tag cloud that links to many topics. What topics do you,


Rick Wicklin | April 8, 2015

Compute the rank of a matrix in SAS

A common question from statistical programmers is how to compute the rank of a matrix in SAS. Recall that the rank of a matrix is defined as the number of linearly independent columns in the matrix. (Equivalently, the number of linearly independent rows.) This article describes how to compute the


Rick Wicklin | April 6, 2015

Let's talk at SAS Global Forum 2015

The 2015 SAS Global Forum is in Dallas, TX, and I'll be there. There are many talks to see and people to meet, so thank goodness for the agenda builder, which enables you to create a schedule in advance. I always enjoy talking with SAS customers about statistics, simulations, matrix


Rick Wicklin | April 1, 2015

Simulate the Monty Hall Problem in SAS

The Monty Hall Problem is one of the most famous problems in elementary probability. It is famous because the correct solution is counter-intuitive and because it caused an uproar when it appeared in the "Ask Marilyn" column in Parade magazine in 1990. Discussing the problem has been known to create


Rick Wicklin | March 30, 2015

Visualizing the causes of airline crashes

There has been a spate of recent high-profile airline crashes (Malaysia Airlines, TransAsia Airways, Germanwings,...) so I was surprised when I saw a time series plot of the number of airline crashes by year, which indicates that the annual number of airline crashes has been decreasing since 1993. The data


Rick Wicklin | March 25, 2015

On the number of permutations supported in SAS software

There's "big," and then there is "factorial big." If you have k items, the number of permutations is "k factorial," which is written as k!. The factorial function gets big fast. For example, the value of k! for several values of k is shown in the following table. You can


Advanced Analytics

Rick Wicklin | March 23, 2015

Vectors that have a fractional number of elements

The title of this article makes no sense. How can the number of elements (in fact, the number of anything!) not be a whole number? In fact, it can't. However, the title refers to the fact that you might compute a quantity that ought to be an integer, but is


Rick Wicklin | March 18, 2015

Finding observations that match a target value

Imagine that you have one million rows of numerical data and you want to determine if a particular "target" value occurs. How might you find where the value occurs? For univariate data, this is an easy problem. In the SAS DATA step you can use a WHERE clause or a


Rick Wicklin | March 16, 2015

How to pass parameters to a SAS program

This article shows how to run a SAS program in batch mode and send parameters into the program by specifying the parameters when you run SAS from a command line interface. This technique has many uses, one of which is to split a long-running SAS computation into a series of


Rick Wicklin | March 12, 2015

Analyzing the first 10 million digits of pi: Randomness within structure

Saturday, March 14, 2015, is Pi Day, and this year is a super-special Pi Day! This is your once-in-a-lifetime chance to celebrate the first 10 digits of pi (π) by doing something special on 3/14/15 at 9:26:53. Apologies to my European friends, but Pi Day requires that you represent dates


Advanced Analytics

Rick Wicklin | March 11, 2015

Matrix multiplication with missing values in SAS

Sometimes I get contacted by SAS/IML programmers who discover that the SAS/IML language does not provide built-in support for multiplication of matrices that have missing values. (SAS/IML does support elementwise operations with missing values.) I usually respond by asking what they are trying to accomplish, because mathematically matrix multiplication with


Learn SAS

Rick Wicklin | March 9, 2015

Writing data in chunks: Does the chunk size matter?

I often blog about the usefulness of vectorization in the SAS/IML language. A one-sentence summary of vectorization is "execute a small number of statements that each analyze a lot of data." In general, for matrix languages (SAS/IML, MATLAB, R, ...) vectorization is more efficient than the alternative, which is to


Rick Wicklin | March 6, 2015

Create a custom PDF and CDF in SAS

In my previous post, I showed how to approximate a cumulative distribution function (CDF) by evaluating only the probability density function (PDF). The technique uses the trapezoidal rule of integration to approximate the CDF from the PDF. For common probability distributions, you can use the CDF function in Base SAS to


Rick Wicklin | March 4, 2015

An easy way to approximate a cumulative distribution function

Evaluating a cumulative distribution function (CDF) can be an expensive operation. Each time you evaluate the CDF for a continuous probability distribution, the software has to perform a numerical integration. (Recall that the CDF at a point x is the integral under the probability density function (PDF) where x is


Learn SAS

Rick Wicklin | March 2, 2015

Avoid loops, avoid the APPLY function, vectorize!

Last week I received a message from SAS Technical Support saying that a customer's IML program was running slowly. Could I look at it to see whether it could be improved? What I discovered is a good reminder about the importance of vectorizing user-defined modules. The program in this blog


Rick Wicklin | February 27, 2015

Plotting multiple time series in SAS/IML (Wide to Long, Part 2)

I recently wrote about how to overlay multiple curves on a single graph by reshaping wide data (with many variables) into long data (with a grouping variable). The implementation used PROC TRANSPOSE, which is a procedure in Base SAS. When you program in the SAS/IML language, you might encounter data


Rick Wicklin | February 25, 2015

Plotting multiple series: Transforming data from wide to long

Data. To a statistician, data are the observed values. To a SAS programmer, analyzing data requires knowledge of the values and how the data are arranged in a data set. Sometimes the data are in a "wide form" in which there are many variables. However, to perform a certain analysis


Advanced Analytics

Rick Wicklin | February 23, 2015

Complete cases: How to perform listwise deletion in SAS

SAS procedures usually handle missing values automatically. Univariate procedures such as PROC MEANS automatically delete missing values when computing basic descriptive statistics. Many multivariate procedures such as PROC REG delete an entire observation if any variable in the analysis has a missing value. This is called listwise deletion or using


Rick Wicklin | February 18, 2015

Monitor the progress of a long-running SAS/IML program

When you have a long-running SAS/IML program, it is sometimes useful to be able to monitor the progress of the program. For example, suppose you need to compute statistics for 1,000 different data sets and each computation takes between 5 and 30 seconds. You might want to output a message


Rick Wicklin | February 16, 2015

Friends don't let friends concatenate results inside a loop

Friends have to look out for each other. Sometimes this can be slightly embarrassing. At lunch you might need to tell a friend that he has some tomato sauce on his chin. Or that she has a little spinach stuck between her teeth. Or you might need to tell your


Rick Wicklin | February 11, 2015

Binary heart in SAS

The xkcd comic often makes me think and laugh. The comic features physics, math, and statistics among its topics. Many years ago, the comic showed a "binary heart": a grid of binary (0/1) numbers with certain numbers colored red so that they formed a heart. Some years later, I


Advanced Analytics

Rick Wicklin | February 9, 2015

Create an array of matrices in SAS

The SAS DATA step supports multidimensional arrays. However, matrices in SAS/IML are like mathematical matrices: they are always two dimensional. In simulation studies you might need to generate and store thousands of matrices for a later statistical analysis of their properties. How can you accomplish that unless you can create


Advanced Analytics

Rick Wicklin | February 4, 2015

Specify the order of variables at run time in SAS

In SAS, the order of variables in a data set is usually unimportant. However, occasionally SAS programmers need to reorder the variables in order to make a special graph or to simplify a computation. Reordering variables in the DATA step is slightly tricky. There are Knowledge Base articles about how


Advanced Analytics | Learn SAS

Rick Wicklin | February 2, 2015

Detect empty parameters that are passed to a SAS/IML module

A SAS/IML programmer asked a question on a discussion forum, which I paraphrase below: I've written a SAS/IML function that takes several arguments. Some of the arguments have default values. When the module is called, I want to compute some quantity, but I only want to compute it for the


Learn SAS

Rick Wicklin | January 28, 2015

The relationship between skewness and kurtosis

In my book Simulating Data with SAS, I discuss a relationship between the skewness and kurtosis of probability distributions that might not be familiar to some statistical programmers. Namely, the skewness and kurtosis of a probability distribution are not independent. If κ is the full kurtosis of a distribution and


Learn SAS

Rick Wicklin | January 26, 2015

IF-THEN logic with matrix expressions

In the SAS DATA step, all variables are scalar quantities. Consequently, an IF-THEN/ELSE statement that evaluates a logical expression is unambiguous. For example, the following DATA step statements print "c=5 is TRUE" to the log if the variable c is equal to 5: if c=5 then put "c=5 is TRUE";


Learn SAS

Rick Wicklin | January 22, 2015

What is an empty matrix?

At the beginning of my book Statistical Programming with SAS/IML Software I give the following programming tip (p. 25): Do not confuse an empty matrix with a matrix that contains missing values or with a zero matrix. An empty matrix has no rows and no columns. A matrix that contains


Learn SAS

Rick Wicklin | January 20, 2015

Five tips from Simulating Data with SAS

Data simulation is a fundamental technique in statistical programming and research. My book Simulating Data with SAS is an accessible how-to book that describes the most useful algorithms and the best programming techniques for efficient data simulation in SAS. Here are five lessons you can learn by reading it: Learn strategies


Learn SAS

Rick Wicklin | January 20, 2015

Finding matrix elements that satisfy a logical expression

A common task in SAS/IML programming is finding elements of a SAS/IML matrix that satisfy a logical expression. For example, you might need to know which matrix elements are missing, are negative, or are divisible by 2. In the DATA step, you can use the WHERE clause to subset data.


Rick Wicklin | January 14, 2015

Calling a global statement inside a loop

The other day I was creating some histograms inside a loop in PROC IML. It was difficult for me to determine which histogram was associated with which value of the looping variable. "No problem," I said. "I'll just use a TITLE statement inside the loop so that each histogram has

