Quick trick: Compute the proportion of success in a binary variable

In simulation studies, the response variable is often a binary (or Bernoulli) variable. By convention, 1 indicates "success" (or the occurrence of an event), whereas 0 indicates "failure" (or the absence of an event).

For example, the following SAS/IML statements define a vector x of zeros and ones:

proc iml;
x = {0,1,1,1,0,0,1,1,1,1};

If you want to find the proportion of ones in the vector, you could count the ones and divide by the length of the vector:

prop = sum(x = 1) / nrow(x);

However, for a binary vector with no missing values, the logical expression x=1 is equivalent to x itself, because the expression evaluates to 1 when x equals 1 and to 0 otherwise. Therefore the proportion of ones in a binary vector is simply sum(x)/nrow(x), which is the mean of x. For the example, the proportion of ones (0.7) is the sum of the values (7) divided by the sample size (10), which is the mean.
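As a quick check of that reasoning, the following sketch evaluates the three forms on the example vector. The names p1, p2, and p3 are chosen here for illustration only; the MEAN function returns the mean of each column, so for a column vector it returns a single number:

```sas
proc iml;
x = {0,1,1,1,0,0,1,1,1,1};
p1 = sum(x = 1) / nrow(x);   /* count the ones, divide by the length */
p2 = sum(x) / nrow(x);       /* the sum of a 0/1 vector is the count of ones */
p3 = mean(x);                /* mean of the column vector */
print p1 p2 p3;              /* each value is 0.7 */
quit;
```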

Consequently, a simpler expression that computes the proportion of ones in a binary vector is as follows:

prop = x[:]; /* mean of x */

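This expression also correctly handles missing values in the x vector, whereas the original expression does not. The subscript reduction operator (:) excludes missing values when it computes the mean, but nrow(x) counts every element, including the missing ones, in the denominator.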
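To see the difference, here is a minimal sketch that uses a hypothetical vector y (not part of the original example) with one missing value. The third expression is one possible repair of the original formula; it assumes the COUNTN function, which counts the nonmissing elements:

```sas
proc iml;
y = {0,1,1,.,0,1};               /* five nonmissing values, three of them ones */
prop1 = sum(y = 1) / nrow(y);    /* 3/6 = 0.5: the missing value inflates the denominator */
prop2 = y[:];                    /* 3/5 = 0.6: the missing value is excluded */
prop3 = sum(y = 1) / countn(y);  /* 3/5 = 0.6: divide by the nonmissing count instead */
print prop1 prop2 prop3;
quit;
```

If the data can contain missing values, the mean-based expression is the shorter of the two correct forms.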

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
