The inverse CDF method for simulating from a distribution

There are many techniques for generating random variates from a specified probability distribution such as the normal, exponential, or gamma distribution. However, one technique stands out because of its generality and simplicity: the inverse CDF sampling technique. If you know the cumulative distribution function (CDF) of a probability distribution, then you can always generate a random sample from that distribution. The inverse CDF technique for generating a random sample uses the fact that a continuous CDF, F, is a one-to-one mapping of the domain of the CDF into the interval (0,1). Therefore, if U is a uniform random variable on (0,1), then X = F^–1(U) has the distribution F.

This article is taken from Chapter 7 of my book Simulating Data with SAS.

To illustrate the inverse CDF sampling technique (also called the inverse transformation algorithm), consider sampling from a standard exponential distribution. The exponential distribution has probability density f(x) = e^–x, x ≥ 0, and therefore the cumulative distribution is the integral of the density: F(x) = 1 – e^–x. This function can be explicitly inverted by solving for x in the equation F(x) = u. The inverse CDF is x = –log(1–u).

The following DATA step generates random values from the exponential distribution by generating random uniform values from U(0,1) and applying the inverse CDF of the exponential distribution. (Of course, the simpler way is to use x = RAND("Expo")!) The UNIVARIATE procedure is used to check that the data follow an exponential distribution. This example comes from Ross (2006, Fourth Edition).

/* Example of using the inverse CDF algorithm to generate variates
   from the exponential distribution */
data Exp(keep=x);
call streaminit(1234);
do i = 1 to 1000;                      /* sample size = 1000 */
   u = rand("Uniform");                /* statistically equivalent to */
   x = -log(1-u);                      /*     x = rand("Expo");       */
   output;
end;
run;
 
ods select histogram;
proc univariate data=Exp;
   histogram x / exponential(sigma=1) endpoints=0 to 10 by 0.5;
run;

In SAS, the QUANTILE function implements the inverse CDF function. That means that you can use the QUANTILE function to generate random variates. For example, the following statement is an equivalent way to use the inverse CDF method to generate exponential random variates:

   u = rand("Uniform"); 
   x = quantile("Expo", u);

Although powerful, this inverse CDF method can be computationally expensive unless you have a formula for the inverse CDF. In SAS the QUANTILE function implements the inverse CDF function, but for many distributions it has to numerically solve for the root of the equation F(x) = u.

The inverse CDF technique is particularly useful when you want to generate data from a truncated distribution. For a distribution F, if you generate uniform random variates on the interval [F(a), F(b)] and then apply the inverse CDF, the resulting values follow the F distribution truncated to [a, b]. For example, to simulate a variate from the truncated normal distribution on [–1.5, 2], use the following statements:

/* Inverse CDF algorithm for truncated normal distribution on [a,b] */
data TruncNormal(keep=x);
Fa = cdf("Normal", -1.5);              /* for a = -1.5 */
Fb = cdf("Normal",  2.0);              /* for b =  2.0 */
call streaminit(1234);
do i = 1 to 1000;                      /* sample size = 1000 */   
   v = Fa + (Fb-Fa)*rand("Uniform");   /* V ~ U(F(a), F(b))  */
   x = quantile("Normal", v);          /* truncated normal on [a,b] */
   output;
end;
run;
 
ods select histogram;
proc univariate data=TruncNormal;
   histogram x / endpoints=-1.5 to 2.0 by 0.25;
run;

WANT MORE GREAT INSIGHTS MONTHLY? | SUBSCRIBE TO THE SAS TECH REPORT

24 Comments

Dede on December 30, 2013 8:56 am

Numerical Solution for the inverse transform method
This question is Not Answered.(Mark as assumed answered)
Dede Atem
Novice
Dede Atem Dec 30, 2013 8:53 AM
Consider the following .
C= know value
U is Uniform (0,1)
F(x)=1-S(x) is well known
a=known value
k=known value
ht-= is known.
The only unknown is X.

I wish to write a SAS code that find X such that the right hand sight is equal the left hand side numerically.
I can not make X the subject but can find a numerical solution.

C*U=F(x) *exp(-(k-a*X)**2 - (k - a**2 * X) exp(-(k -a*X)**2/2t)*F(x)- ht
16 Views Tags: none (add)

- Rick Wicklin on December 31, 2013 6:54 am
  
  This is a root-finding problem. For each given value of U, numerically find the value of X such that
  C*U - RHS(X) = 0
  You can use the FROOT function in SAS/IML, or use a bisection method (search my blog for 'bisection').
  
santhosh kumar on June 11, 2014 5:46 am

how to calaculate icdf for nakagami distribution

- Rick Wicklin on June 15, 2014 7:24 am
  
  As stated on Wikipedia, a Nakagami random variable is just the square root of a gamma random variable. The CDF is given explicitly in terms of the incomplete gamma function, so use the CDF('GAMMA',...) function in SAS for the CDF and the QUANTILE('GAMMA',...) for the icdf.
  
Indranil Ghosh on November 1, 2015 4:07 pm

Hi Rick,

How can one use the inverse cdf method to generate random samples from an unknown probability distribution, whose cdf is not invertible? For example, I have valid one dimensional density which has the following cdf:
F(x)=(exp(theta*(1-exp(-(alpha*x)^(beta))))-1)*[1+ lambda-lambda*[(exp(theta*(1-exp(-(alpha*x)^(beta))))-1)]/((exp(theta)-1))].

Let me know how to tackle this one.
Thanks,
Indranik

- Rick Wicklin on November 2, 2015 5:30 am
  
  A continuous CDF is always invertible, but sometimes there is no formula for the inverse function. As I say in the second-to-last paragraph, in that case you need to use a root-finding method. For each u ~ U(0,1), solve the equation u = F(x) for x. In SAS/IML, you can use the FROOT function to find roots.
  
Pingback: The Lambert W function in SAS - The DO Loop
adam on February 15, 2017 6:27 pm

Hi Rick, does SAS have something like the matlab function EMPRAND (https://www.mathworks.com/matlabcentral/fileexchange/7976-random-number-from-empirical-distribution?requestedDomain=www.mathworks.com)

this function can generate a random number, given an empirical CDF.

Thank!
Adam

- Rick Wicklin on February 16, 2017 5:39 am
  
  Great question. It might not seem obvious, but as I point out in my book, a drawing random sample from the empirical CDF is accomplished through basic bootstrap (re)sampling. If you want the ability to generate random values that are not in the original sample, the technique becomes the smooth bootstrap. If you choose to use a piecewise linear estimate to the ECDF, you get the technique in the article "Approximating a distribution from published quantiles."
  
Ming Sun on March 26, 2017 12:56 pm

hi Rick, thanks for sharing.
Suppose I have propensity score for a bunch of patients, and i have the ECDF of the PScore. if I want to draw a group of patients with similar ECDF from control patients, how can i sample based on a continuous CDF?

- Rick Wicklin on March 26, 2017 4:04 pm
  
  Given the complexity of this question, I suggest you ask it at the SAS Support Community for Statistical Procedures.
  
Ibenegbu Amuche on February 15, 2019 12:32 pm

please sir what is the quantile form of hypertabastic model

- Rick Wicklin on February 15, 2019 5:41 pm
  
  I don't know. Most distributions do not have an explicit inverse in terms of elementary functions.
  
Ragavesh on December 28, 2019 2:30 pm

Hi Dr Rick,
How to obtain the inverse cdf of generalised gaussian distribution?
https://en.wikipedia.org/wiki/Generalized_normal_distribution

- Rick Wicklin on December 28, 2019 5:18 pm
  
  The generalized normal is defined in terms of the incomplete gamma function, which is a scaled version of the gamma distribution. Therefore you can invert the generalized normal CDF by using the quantile function of the gamma distribution. I suggest you do the inversion twice: once for y greater than mu and again for y less than mu. That eliminates the absolute value and the SIGN function.
  
Andrew on May 1, 2020 5:50 pm

hi rick, how would one use the integral transform method, without numerical inversion of for example the Gamma distribution?

- Rick Wicklin on May 2, 2020 5:51 am
  
  Sorry, but I do not understand your question. I suggest you post your question at the SAS Support Communities. If the information in this article is relevant, link to it in your question. In your question, you should explain what you mean by "the integral transform method, without numerical inversion."
  
Philip on May 3, 2020 10:50 pm

Suppose you are tasked with simulating a process where the inter-arrival times are not exponentially distributed, but Gamma(2, λ) under the fixed-count scheme, say 25 events, subject to the constraint that you must use the integral transform method of the Gamma distribution.

Im working using R.

The code below is what i used for an exponential distribution:
1 R> n = 25
2 R> lambda = 10
3 R> u = runif(n,0,1)
4 R> t = -1/lambda*log(1-u)

How would you modify it for a gamma distribution simulation.

Thanks

- Rick Wicklin on May 4, 2020 5:44 am
  
  If you want help with R code, post your question to an R discussion list.
  
Pingback: Tips to simulate binary and categorical variables - The DO Loop
Pingback: The probability integral transform - The DO Loop
Pingback: Construct an envelope function for the acceptance-rejection method - The DO Loop
Pingback: The linear distribution on an interval - The DO Loop
Pingback: Latin hypercube sampling in SAS - The DO Loop

Blogs

Blogs

The inverse CDF method for simulating from a distribution

About Author

24 Comments

Leave A Reply Cancel Reply