Tag: Simulation

Learn SAS | Programming Tips
Rick Wicklin
Simulate lognormal data in SAS

A SAS customer asked how to simulate data from a three-parameter lognormal distribution as specified in the PROC UNIVARIATE documentation. In particular, he wanted to incorporate a threshold parameter into the simulation. Simulating lognormal data is easy if you remember an important fact: if X is lognormally distributed, then Y=log(X)

Rick Wicklin
The contaminated normal distribution

How can you generate data that contains outliers in a simulation study? The contaminated normal distribution is a simple but useful distribution you can use to simulate outliers. The distribution is easy to explain and understand, and it is also easy to implement in SAS. What is a contaminated normal

Rick Wicklin
Sampling variation in small random samples

Somewhere in my past I encountered a panel of histograms for small random samples of normal data. I can't remember the source, but it might have been from John Tukey or William Cleveland. The point of the panel was to emphasize that (because of sampling variation) a small random sample

Rick Wicklin
Create patterns of missing data

When simulating data or testing algorithms, it is useful to be able to generate patterns of missing data. This article shows how to generate random and systematic patterns of missing values. In other words, this article shows how to replace nonmissing data with missing data. Generate a random pattern of

Rick Wicklin
Simulate data from a generalized Gaussian distribution

Although statisticians often assume normally distributed errors, there are important processes for which the error distribution has a heavy tail. A well-known heavy-tailed distribution is the t distribution, but the t distribution is unsuitable for some applications because it does not have finite moments (mean, variance, ...) for small parameter values.

Rick Wicklin
Generate points uniformly inside a circular region in 2-D

It is easy to generate random points that are uniformly distributed inside a rectangle. You simply generate independent random uniform values for each coordinate. However, nonrectangular regions are more complicated. An instructive example is to simulate points uniformly inside the ball with a given radius. The two-dimensional case is to

Rick Wicklin
Four essential sampling methods in SAS

Many simulation and resampling tasks use one of four sampling methods. When you draw a random sample from a population, you can sample with or without replacement. At the same time, all individuals in the population might have equal probability of being selected, or some individuals might be more likely

Rick Wicklin
Monte Carlo simulation for contingency tables in SAS

The FREQ procedure in SAS supports computing exact p-values for many statistical tests. For small and mid-sized problems, the procedure runs very quickly. However, even though PROC FREQ uses efficient methods to avoid unnecessary computations, the computation time required by exact tests might be prohibitive for certain tables. If

Rick Wicklin
Models and simulation for 2x2 contingency tables

When modeling and simulating data, it is important to be able to articulate the real-life statistical process that generates the data. Suppose a friend says to you, "I want to simulate two random correlated variables, X and Y." Usually this means that he wants data generated from a multivariate distribution,

Rick Wicklin
How to generate random integers in SAS

I was recently talking with some SAS customers when someone brought up the topic of generating random numbers. I was asked, "Why can't SAS create an easy way to generate random numbers? Excel has a simple way to generate random numbers between 1 and 100, and I use it all the time."

Rick Wicklin
Balls and urns Part 2: Multi-colored balls

In a previous post I described how to simulate random samples from an urn that contains colored balls. The previous article described the case where the balls can be either of two colors. In that case, all the distributions are univariate. In this article I examine the case where the
