Permute elements within each row of a matrix

Bootstrap methods and permutation tests are popular and powerful nonparametric methods for testing hypotheses and approximating the sampling distribution of a statistic. I have described a SAS/IML implementation of a bootstrap permutation test for matched pairs of data (an alternative to a matched-pair t test) in my paper "Modern Data […]
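The operation named in the title, shuffling the elements of each row independently, can be sketched as follows. This is a plain-Python illustration, not the SAS/IML code from the post; the function name `permute_rows` and the sample data are hypothetical.

```python
import random

def permute_rows(matrix, rng=None):
    """Return a new matrix in which each row is shuffled independently."""
    rng = rng or random.Random()
    # rng.sample(row, len(row)) returns a shuffled copy and leaves the input intact
    return [rng.sample(row, len(row)) for row in matrix]

# Hypothetical matched-pair data: one pair per row
pairs = [[1.2, 3.4], [5.6, 7.8], [9.0, 2.1]]
shuffled = permute_rows(pairs, random.Random(0))
print(shuffled)
```

Each row keeps its own values; only the order within a row changes, which is what a matched-pair permutation scheme requires.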

Tips for concatenating strings in SAS/IML

Last week, as part of an article on how spammers generate comments for blogs, I showed how to generate random messages by using the CATX function in the DATA step. In that example, the strings were scalar quantities, but you can also concatenate vectors of strings in the SAS/IML language. […]

How much RAM do I need to store that matrix?

Dear Rick, I am trying to create a numerical matrix with 100,000 rows and columns in PROC IML. I get the following error: (execution) Unable to allocate sufficient memory. Can IML allocate a matrix of this size? What is wrong? Several times a month I see a variation of this […]
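A back-of-the-envelope calculation shows why the allocation fails. The sketch below is in Python and assumes each element is stored as an 8-byte double-precision number; the helper name `matrix_bytes` is hypothetical.

```python
def matrix_bytes(rows, cols, bytes_per_element=8):
    """Memory needed for a dense numeric matrix (assumes 8-byte doubles)."""
    return rows * cols * bytes_per_element

n = 100_000
total = matrix_bytes(n, n)
print(f"{total / 1e9:.1f} GB")     # 80.0 GB (decimal gigabytes)
print(f"{total / 2**30:.1f} GiB")  # 74.5 GiB (binary gigabytes)
```

Under that assumption, a 100,000 x 100,000 matrix needs about 80 GB for the data alone, before counting any temporary copies made during a computation.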

Techniques for scoring a regression model in SAS

My previous post described how to use the "missing response trick" to score a regression model. As I said in that article, there are other ways to score a regression model. This article describes using the SCORE procedure, a SCORE statement, the relatively new PLM procedure, and the CODE statement. […]

The missing value trick for scoring a regression model

A fundamental operation in statistical data analysis is to fit a statistical regression model on one set of data and then evaluate the model on another set of data. The act of evaluating the model on the second set of data is called scoring. One of the first "tricks" that I […]

How to vectorize time series computations

Vector languages such as SAS/IML, MATLAB, and R are powerful because they enable you to use high-level matrix operations (matrix multiplication, dot products, etc.) rather than loops that perform scalar operations. In general, vectorized programs are more efficient (and therefore run faster) than programs that contain loops. For an example […]

Write a matrix in the "long form"

If you write an n x p matrix from PROC IML to a SAS data set, you'll get a data set with n rows and p columns. For some applications, it is more convenient to write the matrix in a "long format" with np observations and three columns. The first […]
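The wide-to-long reshaping can be sketched in a few lines of Python. The excerpt is cut off before it names the three columns, so this sketch assumes they are the row index, the column index, and the value; the function name `to_long` and the 1-based indexing are also assumptions.

```python
def to_long(matrix):
    """Flatten an n x p matrix into n*p records of (row, col, value).

    Assumption: the three columns are row index, column index, and value,
    with 1-based indices.
    """
    return [
        (i, j, value)
        for i, row in enumerate(matrix, start=1)
        for j, value in enumerate(row, start=1)
    ]

m = [[1, 2, 3],
     [4, 5, 6]]          # n = 2 rows, p = 3 columns
long_rows = to_long(m)   # 6 records, one per matrix element
for record in long_rows:
    print(record)
```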

Square root transformations: How to handle negative data values?

I was looking at someone else's SAS/IML program when I saw this line of code: y = sqrt(x<>0); The statement uses the element maximum operator (<>) in the SAS/IML language to make sure that negative values are never passed to the square root function. This little trick is a real […]
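The effect of that statement can be illustrated with a small Python sketch, in which `max(v, 0.0)` stands in for the SAS/IML element maximum operator. The function name `clamped_sqrt` and the sample values are hypothetical.

```python
import math

def clamped_sqrt(values):
    """Square root of each element after replacing negatives with zero."""
    # max(v, 0.0) plays the role of the elementwise maximum x<>0
    return [math.sqrt(max(v, 0.0)) for v in values]

x = [-4.0, 0.0, 9.0, 16.0]
y = clamped_sqrt(x)
print(y)  # [0.0, 0.0, 3.0, 4.0]
```

Note that the negative input is silently mapped to zero rather than flagged, which may or may not be the behavior an analysis wants.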

Output percentiles of multiple variables in a tabular format

A challenge for statistical programmers is getting data into the right form for analysis. For graphing or analyzing data, sometimes the "wide format" (each subject is represented by one row and many variables) is required, but other times the "long format" (observations for each subject span multiple rows) is more […]

How to tell whether a sequence of heads and tails is random

While walking in the woods, a statistician named Goldilocks wanders into a cottage and discovers three bears. The bears, being hungry, threaten to eat the young lady, but Goldilocks begs them to give her a chance to win her freedom. The bears agree. While Mama Bear and Papa Bear block […]