Blogs

Tag: Simulation

Programming Tips

Rick Wicklin, July 10, 2013

Six reasons you should stop using the RANUNI function to generate random numbers

Are you still using the old RANUNI, RANNOR, RANBIN, and other "RANXXX" functions to generate random numbers in SAS? If so, here are six reasons why you should switch from these older (1970s) algorithms to the newer (late 1990s) Mersenne-Twister algorithm, which is implemented in the RAND function. The newer

Rick Wicklin, July 3, 2013

Duplicate values in a stream of random numbers

As I wrote in my previous post, a SAS customer noticed that he was getting some duplicate values when he used the RAND function to generate a large number of random uniform values on the interval [0,1]. He wanted to know if this result indicates a bug in the RAND

Rick Wicklin, July 1, 2013

Duplicate values in random numbers: Tossing dice and sharing birthdays

Tossing dice is a simple and familiar process, yet it can illustrate deep and counterintuitive aspects of random numbers. For example, if you toss four identical six-sided dice, what is the probability that the faces are all distinct, as shown to the left? Many people would guess that the probability

Advanced Analytics

Rick Wicklin, June 5, 2013

Using simulation to compute a power curve

Last week I showed how to use simulation to estimate the power of a statistical test. I used the two-sample t test to illustrate the technique. In my example, the difference between the means of two groups was 1.2, and the simulation estimated a probability of 0.72 that the t

Advanced Analytics

Rick Wicklin, May 30, 2013

Using simulation to estimate the power of a statistical test

The power of a statistical test measures the test's ability to detect a specific alternative hypothesis. For example, educational researchers might want to compare the mean scores of boys and girls on a standardized test. They plan to use the well-known two-sample t test. The null hypothesis is that the

Rick Wicklin, May 24, 2013

Turn off ODS when running simulations in SAS

In my article "Simulation in SAS: The slow way or the BY way," I showed how to use BY-group processing rather than a macro loop in order to efficiently analyze simulated data with SAS. In the example, I analyzed the simulated data by using PROC MEANS, and I use the

Rick Wicklin, April 25, 2013

The new book on my desk

In my constant effort to keep pace with Chris Hemedinger, I am pleased to announce the availability of my new book, Simulating Data with SAS. Chris started a tradition for SAS Press authors to post a photo of themselves with their new book. Thanks to everyone who helped with the

Rick Wicklin, April 10, 2013

How to generate multiple samples from the multivariate normal distribution in SAS

A SAS customer asks: How do I use SAS to generate multiple samples of size N from a multivariate normal distribution? Suppose that you want to simulate k samples (each with N observations) from a multivariate normal distribution with a given mean vector and covariance matrix. Because all of the

Rick Wicklin, February 20, 2013

What happens if you misspecify the parameters for the "Table" distribution?

I have previously written about how to use the "table" distribution to generate random values from a discrete probability distribution. For example, if there are 50 black marbles, 20 red marbles, and 30 white marbles in a box, the following SAS/IML program simulates random draws (with replacement) of 1,000 marbles:

Learn SAS

Rick Wicklin, February 4, 2013

Simulate discrete variables by using the "Table" distribution

I wanted to write a blog post about the "Table distribution" in SAS. The Table distribution, which is supported by the RAND function and the RANDGEN subroutine, enables you to specify the probability of selecting each of k items. Therefore you can use the Table distribution to sample, with replacement, from

Advanced Analytics

Rick Wicklin, January 16, 2013

Generate binary outcomes with varying probability

A while ago I saw a blog post on how to simulate Bernoulli outcomes when the probability of generating a 1 (success) varies from observation to observation. I've done this often in SAS, both in the DATA step and in the SAS/IML language. For example, when simulating data that satisfied

Rick Wicklin, November 21, 2012

Efficient acceptance-rejection simulation: Part II

Last week I wrote about using acceptance-rejection algorithms in vector languages to simulate data. The main point I made is that in a vector language it is efficient to generate many more variates than are needed, with the knowledge that a certain proportion will be rejected. In last week's article,

Rick Wicklin, November 14, 2012

Efficient acceptance-rejection simulation

A few days ago on the SAS/IML Support Community, there was an interesting discussion about how to simulate data from a truncated Poisson distribution. The SAS/IML user wanted to generate values from a Poisson distribution, but discard any zeros that are generated. This kind of simulation is known as an

Rick Wicklin, November 5, 2012

Constructing common covariance structures

I recently encountered a SUGI30 paper by Chuck Kincaid entitled "Guidelines for Selecting the Covariance Structure in Mixed Model Analysis." I think Kincaid does a good job of describing some common covariance structures that are used in mixed models. One of the many uses for SAS/IML is as a language

Rick Wicklin, October 24, 2012

That distribution is quite PERT!

There are a lot of useful probability distributions that are not featured in standard statistical textbooks. Some of them have distinctive names. In the past year I have had contact with SAS customers who use the Tweedie distribution, the slash distribution, and the PERT distribution. Often these distributions are used

Rick Wicklin, October 8, 2012

Generate uniform data in a simplex

It is easy to simulate data that is uniformly distributed in the unit cube for any dimension. However, it is less obvious how to generate data in the unit simplex. The simplex is the set of points (x1,x2,...,xd) such that Σi xi = 1 and 0 ≤ xi ≤ 1

Advanced Analytics

Rick Wicklin, September 26, 2012

A surprising result: The expected number of uniform variates whose sum exceeds one

I was recently flipping through Ross' Simulation (2006, 4th Edition) and saw the following exercise: Let N be the minimum number of draws from a uniform distribution [until the sum of the variates] exceeds 1. What is the expected value of N? Write a simulation to estimate the expected value. For

Rick Wicklin, September 12, 2012

When is a correlation matrix not a correlation matrix?

This article is an excerpt from my forthcoming book Simulating Data with SAS. Not every matrix with 1 on the diagonal and off-diagonal elements in the range [–1, 1] is a valid correlation matrix. A correlation matrix has a special property known as positive semidefiniteness. All correlation matrices are positive

Rick Wicklin, July 18, 2012

Simulation in SAS: The slow way or the BY way

Over the past few years, and especially since I posted my article on eight tips to make your simulation run faster, I have received many emails (often with attached SAS programs) from SAS users who ask for advice about how to speed up their simulation code. For this reason, I

Rick Wicklin, June 29, 2012

Is using zero as a random number seed the same as not specifying a seed?

I received the following query regarding the RAND function in Base SAS: In SAS, is specifying 0 as a random number seed the same as not specifying a seed at all? The question concerns initializing the SAS random number stream by using the internal system clock. You can do this

Programming Tips

Rick Wicklin, June 6, 2012

Eight tips to make your simulation run faster

"Help! My simulation is taking too long to run! How can I make it go faster?" I frequently talk with statistical programmers who claim that their "simulations are too slow" (by which they mean, "they take too long"). They suspect that their program is inefficient, but they aren't sure why.

Rick Wicklin, May 16, 2012

The curious case of random eigenvalues

I've been a fan of statistical simulation and other kinds of computer experimentation for many years. For me, simulation is a good way to understand how the world of statistics works, and to formulate and test conjectures. Last week, while investigating the efficiency of the power method for finding dominant

Mike Gilliland, April 23, 2012

Forecasting and analytics at Disney World

The April 2012 issue of ORMS Today contains a piece on "How analytics enhance the guest experience at Walt Disney World," by Pete Buczkowski and Hai Chu. While many of us are used to forecasting just one or two things (such as unit sales or revenue), Pete and Hai illustrate

Rick Wicklin, March 30, 2012

Generate a random matrix with specified eigenvalues

In a previous post I showed how to implement Stewart's (1980) algorithm for generating random orthogonal matrices in SAS/IML software. By using the algorithm, it is easy to generate a random matrix that contains a specified set of eigenvalues. If D = diag(λ1, ..., λp) is a diagonal matrix and

Rick Wicklin, March 28, 2012

Generating a random orthogonal matrix

Because I am writing a new book about simulating data in SAS, I have been doing a lot of reading and research about how to simulate various quantities. Random integers? Check! Random univariate samples? Check! Random multivariate samples? Check! Recently I've been researching how to generate random matrices. I've blogged

Rick Wicklin, March 23, 2012

The curse of dimensionality: How to define outliers in high-dimensional data?

After my post on detecting outliers in multivariate data in SAS by using the MCD method, Peter Flom commented "when there are a bunch of dimensions, every data point is an outlier" and remarked on the curse of dimensionality. What he meant is that most points in a high-dimensional cloud

Rick Wicklin, March 21, 2012

Creating symmetric matrices: Two useful functions with strange names

Covariance, correlation, and distance matrices are a few examples of symmetric matrices that are frequently encountered in statistics. When you create a symmetric matrix, you only need to specify the lower triangular portion of the matrix. The VECH and SQRVECH functions, which were introduced in SAS/IML 9.3, are two functions

Advanced Analytics

Rick Wicklin, February 8, 2012

Use the Cholesky transformation to correlate and uncorrelate variables

A variance-covariance matrix expresses linear relationships between variables. Given the covariances between variables, did you know that you can write down an invertible linear transformation that "uncorrelates" the variables? Conversely, you can transform a set of uncorrelated variables into variables with given covariances. The transformation that works this magic is

Rick Wicklin, January 30, 2012

Random number seeds: Only the first seed matters!

The other day I encountered the following SAS DATA step for generating three normally distributed variables. Study it, and see if you can discover what is unnecessary (and misleading!) about this program: data points; drop i; do i=1 to 10; x=rannor(34343); y=rannor(12345); z=rannor(54321); output; end; run; The program creates the

Rick Wicklin, January 11, 2012

How to lie with a simulation

In my article on Buffon's needle experiment, I showed a graph that converges fairly nicely and regularly to the value π, which is the value that the simulation is trying to estimate. This graph is, indeed, a typical graph, as you can verify by running the simulation yourself. However, notice
