The Wishart distribution: Covariance matrices for multivariate normal data

I've written about how to generate a sample from a multivariate normal (MVN) distribution in SAS by using the RANDNORMAL function in SAS/IML software. Last week a SAS/IML programmer showed me a program that simulated MVN data and computed the resulting covariance matrix for each simulated sample. The purpose of […]

What is the coefficient of variation?

I sometimes wonder whether some functions and options in SAS software ever get used. Last week I was reviewing new features that were added to SAS/IML 13.1. One of the new functions is the CV function, which computes the sample coefficient of variation for data. Maybe it is just me, […]

Binning data by quantiles? Beware of rounded data

In my article about how to create a quantile plot, I chose not to discuss a theoretical issue that occasionally occurs. The issue is that for discrete data (which includes rounded values), it might be impossible to use quantile values to split the data into k groups where each group […]

Does this kurtosis make my tail look fat?

What is kurtosis? What does negative or positive kurtosis mean, and why should you care? How do you compute kurtosis in SAS software? It is not clear from the definition of kurtosis what (if anything) kurtosis tells us about the shape of a distribution, or why kurtosis is relevant to […]

Fat-tailed and long-tailed distributions

The tail of a probability distribution is an important notion in probability and statistics, but did you know that there is not a rigorous definition for the "tail"? The term is primarily used intuitively to mean the part of a distribution that is far from the distribution's peak or center. […]

Stigler's seven pillars of statistical wisdom

Wisdom has built her house; She has hewn out her seven pillars. – Proverbs 9:1

At the 2014 Joint Statistical Meetings in Boston, Stephen Stigler gave the ASA President's Invited Address. In forty short minutes, Stigler laid out his response to the age-old question "What is statistics?" His answer was […]

Santa Claus, statistics, and understanding uncertainty

As the International Year of Statistics comes to a close, I've been reflecting on the role statistics plays in our modern society. Of course, statistics provides estimates, forecasts, and the like, but to me the great contribution of statistics is that it enables us to deal with uncertainty in a […]

Why it's okay to guess on the SAT test

Should you ever guess on the SAT® or PSAT standardized tests? My son is getting ready to take the preliminary SAT (PSAT), which is a practice test for the SAT. A teacher gave his class this advice regarding guessing: For a multiple-choice question, if you can eliminate one or two […]

The difference between frequencies and weights in regression analysis

This week I read an interesting blog post that led to a discussion about specifying the frequencies of observations in a regression model. In SAS software, many of the analysis procedures contain a FREQ statement for specifying frequencies and a WEIGHT statement for specifying weights in a weighted regression. This […]

Duplicate values in a stream of random numbers

As I wrote in my previous post, a SAS customer noticed that he was getting some duplicate values when he used the RAND function to generate a large number of random uniform values on the interval [0,1]. He wanted to know if this result indicates a bug in the RAND […]