In my previous post, I described how to implement an iterated function system (IFS) in the SAS/IML language to draw fractals. I used the famous Barnsley fern example to illustrate the technique. At the end of the article I issued a challenge: can you construct an IFS whose fractal attractor
Author
Fractals. If you grew up in the 1980s or '90s and were interested in math and computers, chances are you played with computer generation of fractals. Who knows how many hours of computer time was spent computing Mandelbrot sets and Julia sets to ever-increasing resolutions? When I was a kid,
In a recent article on efficient simulation from a truncated distribution, I wrote some SAS/IML code that used the LOC function to find and exclude observations that satisfy some criterion. Some readers came up with an alternative algorithm that uses the REMOVE function instead of subscripts. I remarked in a
It seemed like an easy task. A SAS user asked me how to use the SGPLOT procedure to create a bar chart where the vertical axis shows percentages instead of counts. I assumed that there was some simple option that would change the scale of the vertical axis from counts
Frequently someone will post a question to the SAS Support Community that says something like this: I am trying to do [statistical task]and SAS issues an error and reports that my correlation matrix is not positive definite. What is going on and how can I complete [the task]? The statistical
Last week I wrote about using acceptance-rejection algorithms in vector languages to simulate data. The main point I made is that in a vector language it is efficient to generate many more variates than are needed, with the knowledge that a certain proportion will be rejected. In last week's article,
The LOC function is one of the most important functions in the SAS/IML language. The LOC function finds elements of a vector or matrix that satisfy some condition. For example, if you are going to apply a logarithmic transform to data, you can use the LOC function to find all
A few days ago on the SAS/IML Support Community, there was an interesting discussion about how to simulate data from a truncated Poisson distribution. The SAS/IML user wanted to generate values from a Poisson distribution, but discard any zeros that are generated. This kind of simulation is known as an
I was recently asked, "Does SAS support computing inverse hyperbolic trigonometric functions?" I was pretty sure that I had used the inverse hyperbolic trig functions in SAS, so I was surprised when I read the next sentence: "I ask because I saw a Usage Note that says these functions are
The other day I was constructing covariance matrices for simulating data for a mixed model with repeated measurements. I was using the SAS/IML BLOCK function to build up the "R-side" covariance matrix from smaller blocks. The matrix I was constructing was block-diagonal and looked like this: The matrix represents a
I recently encountered a SUGI30 paper by Chuck Kincaid entitled "Guidelines for Selecting the Covariance Structure in Mixed Model Analysis." I think Kincaid does a good job of describing some common covariance structures that are used in mixed models. One of the many uses for SAS/IML is as a language
The determinant of a matrix arises in many statistical computations, such as in estimating parameters that fit a distribution to multivariate data. For example, if you are using a log-likelihood function to fit a multivariate normal distribution, the formula for the log-likelihood involves the expression log(det(Σ)), where Σ is the
I was looking at some SAS documentation when I saw a Base SAS function that I never knew existed. The NWKDOM function returns the date for the nth occurrence of a weekday for the specified month and year. I surely could have used that function last spring when I blogged
There are a lot of useful probability distributions that are not featured in standard statistical textbooks. Some of them have distinctive names. In the past year I have had contact with SAS customers who use the Tweedie distribution, the slash distribution, and the PERT distribution. Often these distributions are used
What's in a name? As Shakespeare's Juliet said, "That which we call a rose / By any other name would smell as sweet." A similar statement holds true for the names of colors in SAS: "Rose" by any other name would look as red! SAS enables you to specify a
Sometimes a graph is more interpretable if you assign specific colors to categories. For example, if you are graphing the number of Olympic medals won by various countries at the 2012 London Olympics, you might want to assign the colors gold, silver, and bronze to represent first-, second-, and third-place
The New York Times has an excellent staff that produces visually interesting graphics for the general public. However, because their graphs need to be understood by all Times readers, the staff sometimes creates a complicated infographic when a simpler statistical graph would show the data in a clearer manner. A
Last week I wrote a SAS/IML program that computes the odds of winning the game of craps. I noted that the program remains valid even if the dice are not fair. For convenience, here is a SAS/IML function that computes the probability of winning at craps, given the probability vector
It is easy to simulate data that is uniformly distributed in the unit cube for any dimension. However, it is less obvious how to generate data in the unit simplex. The simplex is the set of points (x1,x2,...,xd) such that Σi xi = 1 and 0 ≤ xi ≤ 1
Gambling games that use dice, such as the game of "craps," are often used to demonstrate the laws of probability. For two dice, the possible rolls and probability of each roll are usually represented by a matrix. Consequently, the SAS/IML language makes it easy to compute the probabilities of various
John D. Cook posted a story about Hardy, Ramanujan, and Euler and discusses a conjecture in number theory from 1937. Cook says, Euler discovered 635,318,657 = 158^4 + 59^4 = 134^4 + 133^4 and that this was the smallest [integer]known to be the sum of two fourth powers in two
Did you know that you can index into SAS/IML matrices by using unique strings that you assign via the MATTRIB statement? The MATTRIB statement associates various attributes to a matrix. Usually, these attributes are only used for printing, but you can also use the ROWNAME= and COLNAME= attributes to subset
I was recently flipping through Ross' Simulation (2006, 4th Edition) and saw the following exercise: Let N be the minimum number of draws from a uniform distribution [until the sum of the variates]exceeds 1. What is the expected value of N? Write a simulation to estimate the expected value. For
Sometimes it is useful to group observations based on the values of some variable. Common schemes for grouping include binning and using quantiles. In the binning approach, a variable is divided into k equal intervals, called bins, and each observation is assigned to a bin. In this scheme, the size
With the US presidential election looming, all eyes are on the Electoral College. In the presidential election, each state gets as many votes in the Electoral College as it has representatives in both congressional houses. (The District of Columbia also gets three electors.) Because every state has two senators, it
If you use a word three times, it's yours. -Unknown When I was a child, my mother used to encourage me to increase my vocabulary by saying, "If you use a word three times, it's yours for life." I believe that the same saying holds for programming techniques: Use a
This article is an excerpt from my forthcoming book Simulating Data with SAS. Not every matrix with 1 on the diagonal and off-diagonal elements in the range [–1, 1] is a valid correlation matrix. A correlation matrix has a special property known as positive semidefiniteness. All correlation matrices are positive
Robert Allison posted a map that shows the average commute times for major US cities, along with the proportion of the commute that is attributed to traffic jams and other congestion. The data are from a CEOs for Cities report (Driven Apart, 2010, p. 45). Robert use SAS/GRAPH software to
Ah! The joys of sets! It is easy to test whether two vectors are equal in SAS/IML software. It is only slightly more challenging to test whether two sets are equal. Recall that A and B are equal as sets if they contain the same elements. Order does not matter.
I needed to construct a string to use in the title of a scatter plot. The scatter plot showed a line, and I wanted to include the equation of the line in the plot's title. This article shows how to construct a string that contains the equation in a readable