In a previous post, I discussed how to use the LOC function to eliminate loops over observations. Dale McLerran chimed in to remind me that another way to improve efficiency is to use subscript reduction operators. I ended my previous post by issuing a challenge: can you write an efficient
Tag: Statistical Programming
Have you ever been stuck while trying to solve a scrambled-word puzzle? You stare and stare at the letters, but no word reveals itself? You are stumped. Stymied. I hope you didn't get stumped on the word puzzle I posted as an anniversary present for my wife. She breezed through
In a previous post, I discussed how to generate random permutations of N elements. But what if you want to systematically iterate through a list of ALL permutations of N elements? In the SAS DATA step you can use the ALLPERM subroutine in the SAS DATA step. For example, the
My previous post on creating a random permutation started me thinking about word games. My wife loves to solve the daily Jumble® puzzle that runs in our local paper. The puzzle displays a string of letters like MLYBOS, and you attempt to unscramble the letters to make an ordinary word.
I recently read a paper that described a SAS macro to carry out a permutation test. The permutations were generated by PROC IML. (In fact, an internet search for the terms "SAS/IML" and "permutation test" gives dozens of papers in recent years.) The PROC IML code was not as efficient
The SAS/IML language is a vector language, so statements that operate on a few long vectors run much faster than equivalent statements that involve many scalar quantities. For example, in a previous post, I asserted that the LOC function is much faster than writing a loop, for finding observations that
The SAS/IML run-time library contains hundreds of functions and subroutines that you can call to perform statistical analysis. There are also many functions in Base SAS software that you can call from SAS/IML programs. However, one day you might need to compute some quantity for which there is no prewritten
Today is the birthday of Bernhard Riemann, a German mathematician who made fundamental contributions to the fields of geometry, analysis, and number theory. Riemann is definitely on my list of the greatest mathematicians of all time, and his conjecture about the distribution of prime numbers is one of the great
Missing values are a fact of life. Many statistical analyses, such as regression, exclude observations that contain missing values prior to forming matrix equations that are used in the analysis. This post shows how to find rows of a data matrix that contain missing values and how to remove those
A frequently performed task in data analysis is identifying all the observations in a data set that satisfy certain conditions. For example, you might want to identify all of the female patients in your study or to identify all patients whose systolic blood pressure is greater than 140 mm Hg.
The R You Ready blog posed an interesting problem. Essentially, you have a vector that contains n(n+1)/2 elements, and you want to pack those elements into the upper left triangular portion of a matrix. For example, if your data are proc iml; /** vector v is given: ncol(v) = n(n+1)/2 for
When programmers begin learning a new computer language, the first program they write is often one that prints the text “Hello, World!” Successfully writing a Hello World program assures the programmer that the software is successfully installed and that all necessary features are working: parsers, compilers, linkers, and so on.