There are many techniques for generating random variates from a specified probability distribution such as the normal, exponential, or gamma distribution. However, one technique stands out because of its generality and simplicity: the inverse CDF sampling technique. If you know the cumulative distribution function (CDF) of a probability distribution, then
Author
In a previous article I discussed how to bin univariate observations by using the BIN function, which was added to the SAS/IML language in SAS/IML 9.3. You can generalize that example and bin bivariate or multivariate data. Over two years ago I wrote a blog post on 2D binning in
It is often useful to partition observations for a continuous variable into a small number of intervals, called bins. This familiar process occurs every time that you create a histogram, such as the one on the left. In SAS you can create this histogram by calling the UNIVARIATE procedure. Optionally,
Are you still using the old RANUNI, RANNOR, RANBIN, and other "RANXXX" functions to generate random numbers in SAS? If so, here are six reasons why you should switch from these older (1970s) algorithms to the newer (late 1990s) Mersenne-Twister algorithm, which is implemented in the RAND function. The newer
Every programming language has an IF-THEN statement that branches according to whether a Boolean expression is true or false. In SAS, the IF-THEN (or IF-THEN/ELSE) statement evaluates an expression and braches according to whether the expression is nonzero (true) or zero (false). The basic syntax is if numeric-expression then do-computation;
On the Web site for the book Statistical Programming with SAS/IML Software, I provide instructions on how to download the sample data sets and install them so that they can be used from within SAS/IML Studio. When I wrote the book I did not anticipate that SAS users might want
As I wrote in my previous post, a SAS customer noticed that he was getting some duplicate values when he used the RAND function to generate a large number of random uniform values on the interval [0,1]. He wanted to know if this result indicates a bug in the RAND
Tossing dice is a simple and familiar process, yet it can illustrate deep and counterintuitive aspects of random numbers. For example, if you toss four identical six-sided dice, what is the probability that the faces are all distinct, as shown to the left? Many people would guess that the probability
The CLUSTER procedure in SAS/STAT software creates a dendrogram automatically. The black-and-white dendrogram is nice, but plain. A SAS customer wanted to know whether it is possible to add color to the dendrogram to emphasize certain clusters. For example, the plot at the left emphasizes a four-cluster scenario for clustering
How do you count the number of unique rows in a matrix? The simplest algorithm is to sort the data and then iterate down the rows, comparing each row with the previous row. However, this algorithm has two shortcomings: it physically sorts the data (which means that the original locations