The DO Loop
Statistical programming in SAS with an emphasis on SAS/IML programs
Pearson's correlation measures the linear association between two variables. Because the correlation is bounded between [-1, 1], the sampling distribution for highly correlated variables is highly skewed. Even for bivariate normal data, the skewness makes it challenging to estimate confidence intervals for the correlation, to run one-sample hypothesis tests ("Is
Toe bone connected to the foot bone, Foot bone connected to the leg bone, Leg bone connected to the knee bone,... — American Spiritual, "Dem Bones" Last week I read an interesting article on Robert Kosara's data visualization blog. Kosara connected the geographic centers of the US zip codes in
This article shows how to simulate data from a mixture of multivariate normal distributions, which is also called a Gaussian mixture. You can use this simulation to generate clustered data. The adjacent graph shows three clusters, each simulated from a four-dimensional normal distribution. Each cluster has its own within-cluster covariance,