I've been working on a new book about Simulating Data with SAS. In researching the chapter on simulation of multivariate data, I've noticed that the probability density function (PDF) of multivariate distributions is often specified in a matrix form. Consequently, the multivariate density can usually be computed by using the SAS/IML matrix language. As an example, this article describes how to compute the multivariate normal probability density for an arbitrary number of variables.

### Compute the multivariate normal PDF

The density for the multivariate distribution centered at μ with covariance Σ is given by the following formula, which I copied from a Wikipedia article:

The argument to the EXP function involves the expression d2=(x-μ)TΣ-1(x-μ), where d is the Mahalanobis distance between the multidimensional point x and the mean vector μ. I have previous written a SAS/IML function that computes the Mahalanobis distance in SAS, so it is simple to construct a SAS/IML function that computes the multivariate normal PDF for any dimension:

 proc iml; /* Cut and paste the definition of the Mahalanobis module. See https://blogs.sas.com/content/iml/2012/02/22/how-to-compute-mahalanobis-distance-in-sas/ */ start MVNormalPDF(x, mu, cov); p = nrow(cov); const = (2*constant("PI"))##(p/2) * sqrt(det(cov)); d = Mahalanobis(x, mu, cov); phi = exp( -0.5*d#d ) / const; return( phi ); finish;

The MVNormalPDF function is essentially an implementation of the PDF formula, except that the (efficient) Mahalanobis function is used instead of explicitly forming the expression (x-μ)TΣ-1(x-μ). Incidentally, the formula assumes that x and μ are a column vectors. However, the Mahalanobis and MVNormalPDF functions assume that mu is a row vector, and evaluates the PDF on each row of the x matrix.

### Evaluating the multivariate normal PDF

Although it is not apparent at first glance, the MVNormalPDF function is vectorized, which means that the x argument can be an N x p matrix that represents N different p-dimensional points. The vectorization results from the fact that the Mahalanobis function is vectorized.

This means that you can evaluate the PDF at many points by using a single call, as shown below for a three-dimensional example:

 Mu = {1 2 3}; Cov = {3 2 1, 2 4 0, 1 0 5}; x = {0 0 0, 0 1 2, 2 1 2, 1 2 3}; pdf = MVNormalPDF(x, Mu, Cov); print x[c=("x1":"x3")] pdf;

### Visualize the bivariate normal PDF

In the case of two variables, you can visualize the bivariate normal density by creating a surface plot or contour plot. In either case, you need to evaluate the MVNormalPDF function at a grid of (x,y) values. You can use the Define2DGrid function to generate evenly spaced (x,y) values on a uniform grid. The following SAS/IML statements load the Define2DGrid function, evaluate the bivariate normal density at a grid of points, and write the data to a SAS data set:

 /* define or load Define2DGrid module: https://blogs.sas.com/content/iml/2011/01/21/how-to-create-a-grid-of-values/ */ load module=Define2DGrid; Mu = {0 0}; Cov = {4 2, /* correlation coefficient is sqrt(2) */ 2 2}; x = Define2DGrid( -3, 3, 31, -3, 3, 31 ); /* grid is [-3,3]x[-3,3] */ density = MVNormalPDF(x, Mu, Cov);   x1 = x[,1]; x2 = x[,2]; create BivarDensity var {"x1" "x2" "Density"}; append; close BivarDensity; quit;

Earlier this week, I showed how to create a contour plot in SAS by using the Graph Template Language (GTL). Assuming that the ContourPlotParm template has been defined, the following call to the SGRENDER procedure creates a contour plot of the bivariate normal data:

 proc sgrender data=BivarDensity template=ContourPlotParm; dynamic _TITLE="Bivariate Normal Density, corr=sqrt(2) " _X="x1" _Y="x2" _Z="Density"; run;
Share

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

1. amit bothra on

Can we calculate Multivariate Normal Density in base SAS without using IML as I dont have access to it.

• Wikipedia has a formula for the bivariate density that you can program in the DATA step. You can do an internet search for "trivariate normal density" to find the formula for three variables.

2. Abdulbaki Bilgic on

Rick,
Thank you indeed for your answer. I simply tried to replicate your results with the program you posted.
proc iml;
/* define or load Mahalanobis module:
http://blogs.sas.com/content/iml/2012/02/22/how-to-compute-mahalanobis-distance-in-sas/ */
start MVNormlPDF(x, mu, cov);
p = nrow(cov);
const = (2*constant("PI"))##(p/2) * sqrt(det(cov));
d = Mahalanobis(x, mu, cov);
phi = exp( -0.5*d#d ) / const;
return( phi );
finish;
Mu = {1 2 3};
Cov = {3 2 1,
2 4 0,
1 0 5};
x = {0 0 0,
0 1 2,
2 1 2,
1 2 3};
pdf = MVNormalPDF(x, Mu, Cov);
print x[c=("x1":"x3")] pdf;
run;

but in the log window, unfortunately i am getting the same error as i had before.
though Module MVNORMLPDF defined.
i do not know that wrong is with the program or not able to load the module. Thanks.

• Sorry for your confusion. Notice that the comment links to an article in which I define the Mahalanobis module. Go to that URL and paste in the definition of the Mahalanobis function. I have rewritten the comment to make it more clear.

3. Abdulbaki Bilgic on

Rick,
proc iml;
start MVNormalPDF(x, mu, cov);
y = (x-mu)

;
d     = y

* inv(cov) * y;
p = nrow(cov);
const = (2*constant("PI"))##(p/2) * sqrt(det(cov));
phi = exp( -0.5*d#d ) / const;
return( phi );
finish;
Mu = {1 2 3};
Cov = {3 2 1,
2 4 0,
1 0 5};
x = {0 0 0,
0 1 2,
2 1 2,
1 2 3};
pdf = MVNormalPDF(x, Mu, Cov);
print x[c=("x1":"x3")] pdf;
this program works and produces pdf at each x.
Also, in your post, there is a word error between MVNormlPDF in "start" function and pdf = MVNormalPDF(x, Mu, Cov). Thanks a lot.

4. Varun Khandelwal on

Dear Rick,

I tried the corrected code given by Abdulbaki Bilgic. But still it shows error "Matrices do not conform to the operation". I followed your other post as well on the error part but still it is not resolved. Please help me with the code. I have gone through your other posts as well and they are very intuitive and very helpful.

Regards,
Varun

• Whenever you have questions about SAS/IML programming, you can post your question (along with your program and data) to the SAS/IML Support Community.

5. I need to estimate the parameters of a set of bivariate gaussians from data. The data take the form of an n x n identification/confusion matrix where n is the the number of response alternatives. In its most basic form, the stimuli are formed from two dimensions (e.g., height, width), with two levels of each dimension, giving four response alternatives. It is also assumed that there is a decision bound for each dimension, giving a total of a mean vector and covariance matrix for each response alternative, and two decision bounds. Could this code be adapted to this problem, and if so, what would your recommendation be for proceeding. My thanks in advance!