Efficient Sampling

Recently, SAS Global Forum announced the call for papers for the 2011 conference to be held at Caesars Palace in Las Vegas.

Since the conference is in Las Vegas, I’ve been thinking a lot about games of chance: blackjack, craps, roulette, and the like. You can analyze these games by computing the odds of certain events in a probability model. Or you can be lazy, like me, and let the computer do the work by running a simulation to compute the odds of certain outcomes.

Random samples are the basis of every simulation. If you do not sample efficiently, then your simulation is doomed to take a long time. In SAS/IML software, you can generate random samples by using the RANDGEN subroutine. Novice programmers often use RANDGEN inefficiently and generate one random observation at a time within a loop. Don’t do it!
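For reference, the pattern to avoid looks roughly like the following sketch. The seed value, the sample size N, and the variable names slow and x are illustrative assumptions, not taken from any particular program:

```sas
proc iml;
call randseed(12345);           /** arbitrary seed, assumed here for reproducibility **/
N = 1000;                       /** illustrative sample size **/
slow = j(N, 1);                 /** allocate N x 1 vector for the results **/
x = 0;                          /** scalar that receives one variate per call **/
do i = 1 to N;
   call randgen(x, "Uniform");  /** one RANDGEN call per observation: inefficient **/
   slow[i] = x;
end;
quit;
```

Each pass through the loop makes a separate subroutine call to produce a single number, so the per-call overhead is paid N times instead of once.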

To generate, say, a random sample of 1000 observations from the uniform distribution on [0, 1], simply allocate a vector of length 1000 and call RANDGEN once:

proc iml;
u = j(1000, 1);             /** allocate 1000 x 1 vector **/
call randgen(u, "Uniform"); /** random sample from uniform distribution **/

If you want to simulate 1000 dice rolls, transform the uniform variates into the range {1, 2, 3, 4, 5, 6}:

rolls = ceil(6*u);          /** 1 through 6 **/

If you are interested in simulating outcomes from a single-zero roulette wheel, transform the uniform variates into the range {0, 1, 2, …, 36}:

roulette = floor(37*u);     /** 0 through 36 **/

Incidentally, Caesars Palace has one single-zero roulette wheel on the main casino floor. A single-zero wheel gives the house a smaller advantage than the more common double-zero wheel, but you won’t see me playing either version of roulette. I agree with Einstein, who supposedly said, “You cannot beat a roulette table unless you steal money from it.” However, by using the RANDGEN subroutine, I can quickly simulate my losses for any betting scheme that I might be tempted to employ.
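As a minimal sketch of such a simulation, the following standalone program bets one unit on "low" (1 through 18, an even-money bet) on each of 1000 spins of a single-zero wheel. The betting scheme, the seed, and the names win, payoff, total, and avg are assumptions chosen for illustration:

```sas
proc iml;
call randseed(2011);            /** arbitrary seed, assumed here for reproducibility **/
u = j(1000, 1);                 /** allocate 1000 x 1 vector **/
call randgen(u, "Uniform");     /** random sample from uniform distribution **/
roulette = floor(37*u);         /** 0 through 36 **/

/** hypothetical scheme: bet 1 unit on "low" (1 through 18) on every spin **/
win = (roulette >= 1) & (roulette <= 18);  /** 1 if the bet wins, 0 otherwise **/
payoff = 2*win - 1;             /** +1 for a win, -1 for a loss **/
total = sum(payoff);            /** net result after 1000 spins **/
avg = mean(payoff);             /** average result per spin **/
print total avg;
quit;
```

The bet wins on 18 of the 37 pockets, so the expected result per spin is 18/37 - 19/37 = -1/37, or a loss of about 2.7 cents per dollar wagered. The simulated average should fluctuate around that value from run to run.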

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Comments

  1. Ha! In "The DO Loop" blog, we are told "Don't do a do loop." The irony here is sweet. I believe that Rick would concur that whenever possible, avoid do loops and vectorize processing.

    But, perhaps in this instance, Rick should suggest employing a do loop so that the person who wants to simulate their gaming experience will be slowed down. You can't lose as much if you don't play as much! If we have to wait longer for our simulation to complete, we won't suffer as much at the gaming tables.

  2. So you're saying we should write inefficient code to slow down the rate at which people lose money from gambling? That's funny. We could also use the "super-slow-motion" approach to simulate stock market crashes, housing bubbles, and other deleterious economic events.

