About this blog
Rick Wicklin, PhD, is a senior researcher in computational statistics at SAS and is a principal developer of PROC IML and SAS/IML Studio. His areas of expertise include computational statistics, statistical graphics, and modern methods in statistical data analysis. Rick is author of the book Statistical Programming with SAS/IML Software.

The curse of dimensionality: How to define outliers in high-dimensional data?
After my post on detecting outliers in multivariate data in SAS by using the MCD method, Peter Flom commented “when there are a bunch of dimensions, every data point is an outlier” and remarked on the curse of dimensionality. What he meant is that most points in a high-dimensional cloud [...]
What is Mahalanobis distance?
I previously described how to use Mahalanobis distance to find outliers in multivariate data. This article takes a closer look at Mahalanobis distance. A subsequent article will describe how you can compute Mahalanobis distance. Distance in standard units: In statistics, we sometimes measure “nearness” or “farness” in terms of the [...]
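As a rough illustration of measuring distance in standard units, here is a minimal Python sketch of the standard Mahalanobis formula, d(x) = sqrt((x - μ)ᵀ Σ⁻¹ (x - μ)). The function name `mahalanobis`, the mean, the covariance matrix, and the two test points are made up for this example and do not come from the post, which works in SAS.

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance from each row of x to a distribution
    with the given mean vector and covariance matrix."""
    diff = np.atleast_2d(x) - mean
    # Solve cov @ z = diff^T instead of forming the inverse explicitly.
    z = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.sum(diff * z, axis=1))

# Hypothetical example: standard deviation 2 along x, 1 along y.
mean = np.array([0.0, 0.0])
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])

# Both points are at Euclidean distance 2 from the mean.
points = np.array([[2.0, 0.0],
                   [0.0, 2.0]])

distances = mahalanobis(points, mean, cov)
print(distances)
```

In this made-up setup the first point lies one standard deviation from the mean and the second lies two, so the two Mahalanobis distances differ even though the Euclidean distances are equal.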
Explaining coincidence
I was on vacation when a family member sidled up to me. “Rick, you’re a statistician…” he began. I knew I was in trouble. He proceeded to tell me the story of Joseph “Newsboy” Moriarty, a New Jersey mobster who rose to prominence and became known as the bookie who [...]
American pre-WW2 attitudes about Germany and Allies
“Yesterday, December 7, 1941, a date which will live in infamy…” – Franklin D. Roosevelt
Today is the 70th anniversary of the Japanese attack on Pearl Harbor. The very next day, America declared war. During a visit to the Smithsonian National Museum of American History, I discovered the results of [...]
What is the chance that a random matrix is singular?
A few sharp-eyed readers questioned the validity of a technique that I used to demonstrate two ways to solve linear systems of equations. I generated a random n x n matrix and then proceeded to invert it, seemingly without worrying about whether the matrix even has an inverse! I responded to the [...]
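One way to explore the question empirically is a small simulation. The following Python sketch is an illustration only, not the technique from the post: it assumes entries drawn independently from the uniform distribution on [0, 1), and the matrix size, trial count, and seed are arbitrary choices. For entries drawn from a continuous distribution, the singular matrices form a set of probability zero, so a rank-deficient draw should essentially never appear.

```python
import numpy as np

# Assumed setup: 5 x 5 matrices, 1000 trials, fixed seed for repeatability.
rng = np.random.default_rng(12345)
n, trials = 5, 1000

rank_deficient = 0
for _ in range(trials):
    A = rng.uniform(size=(n, n))
    # matrix_rank counts singular values above a small numerical tolerance.
    if np.linalg.matrix_rank(A) < n:
        rank_deficient += 1

print(f"{rank_deficient} of {trials} random matrices were numerically singular")
```

A count of zero does not mean every such matrix is well conditioned; a matrix can be invertible in exact arithmetic and still be close enough to singular to cause numerical trouble.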
The birthday controversy: Are more people born in April or September?
My friend Chris posted an analysis of the distribution of birthdays for 236 of his Facebook friends. He noted that more of his friends have birthdays in April than in September. The numbers were 28 for April, but only 25 for September. As I reported in my post on “the [...]
Estimating popularity based on Google searches: Why it's a bad idea
Some people search the Internet for a set of topics and then use the number of search results (“hits”) for each topic to rank the relative popularity of the topics. At the 2011 Joint Statistical Meetings (JSM), I had the opportunity to attend several talks by statisticians from Google and [...]
Statistics and the Casey Anthony Case: False positives and false negatives
Arnold Loewy, professor of criminal law at Texas Tech University, wrote an editorial about the Casey Anthony case that has statistical undertones. Prof. Loewy discusses the fact that there are two kinds of errors that can occur in a court trial: an innocent person can be sent to jail or [...]
Bias and covariance explained to an 11-year-old
I was inspired by Chris Hemedinger’s blog posts about his daughter’s science fair project. Explaining statistics to a pre-teenager can be a humbling experience. My 11-year-old son likes science. He recently set about trying to measure which of three projectile launchers is the most accurate. I think he wanted to [...]