Statistical Thinking Archives

Analytics

Rick Wicklin, April 2, 2025

Did George Box say, "All models are wrong, but some are useful"?

Nearly every statistician has heard the aphorism, "All models are wrong, but some are useful." The quote is attributed to George Box, an early and influential thinker about statistics. Did George Box actually say this quote? Yes, he did. The first part of the quote ("All models are wrong") appeared

Analytics | Programming Tips

Rick Wicklin, December 2, 2024

A historical method of generating random normal variates

Decades ago, it was a challenge to generate (pseudo-) random numbers that had good statistical properties. The proliferation of desktop computers in the 1980s and '90s led to many advances in computational mathematics, including better ways to generate pseudorandom variates from a wide range of probability distributions. (For brevity, I

Analytics | Learn SAS | Programming Tips

Rick Wicklin, October 7, 2024

The three-sigma rule

A remarkable result in probability theory is the "three-sigma rule," which is a generic name for theorems that bound the probability that a univariate random variable will appear near the center of its distribution. This article discusses the familiar three-sigma rule for the normal distribution, a less-familiar rule for unimodal
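
For the normal case only, the familiar 68-95-99.7 figures can be reproduced with a few lines of code. The following is a minimal Python sketch, not code from the article; the function name `normal_within_k_sigma` is hypothetical, and the formula assumes an exactly normal random variable.

```python
import math


def normal_within_k_sigma(k: float) -> float:
    """Return P(|X - mu| < k*sigma) for a normal random variable X.

    For a normal distribution this probability equals erf(k / sqrt(2)),
    regardless of the values of mu and sigma.
    """
    return math.erf(k / math.sqrt(2.0))


# Probability of falling within 1, 2, and 3 standard deviations of the mean
for k in (1, 2, 3):
    print(f"within {k} sigma: {normal_within_k_sigma(k):.4f}")
```

For k = 3 the probability is about 0.9973, which is where the "three-sigma" name comes from. The bound is weaker for non-normal distributions.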

Learn SAS | Machine Learning

Rick Wicklin, July 8, 2024

On the reproducibility of responses by AI assistants

As announced and demonstrated at SAS Innovate 2024, SAS plans to include a generative AI assistant called SAS Viya Copilot in the forthcoming SAS Viya Workbench. You can submit a text prompt (by putting it in a comment string) and the Copilot will generate SAS code for you. My colleagues

Analytics | Data Visualization

Rick Wicklin, June 19, 2024

Scale a density curve to match a histogram

This article discusses how to scale a probability density curve so that it fits appropriately on a histogram, as shown in the graph to the right. By definition, a probability density curve is scaled so that the area under the curve equals 1. However, a histogram might show counts or
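
The scaling idea can be sketched in Python under stated assumptions: equal-width bins, a normal density, and a histogram whose vertical axis shows counts, percents, or proportions. The names `normal_pdf` and `scaled_density` are hypothetical and are not taken from the article. The key fact is that a bin of width h holds roughly n * h * f(x) observations when f is the density and n is the sample size.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal probability density; the area under this curve is 1."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def scaled_density(x, mu, sigma, n, bin_width, scale="count"):
    """Density rescaled to the vertical units of a histogram.

    scale="count"      -> multiply by n * bin_width
    scale="percent"    -> multiply by 100 * bin_width
    scale="proportion" -> multiply by bin_width
    scale="density"    -> no rescaling
    """
    factors = {
        "count": n * bin_width,
        "percent": 100.0 * bin_width,
        "proportion": bin_width,
        "density": 1.0,
    }
    return factors[scale] * normal_pdf(x, mu, sigma)


# Illustrative values: n = 1000 observations, bins of width 0.5 on [-5, 5]
n, h = 1000, 0.5
midpoints = [-5.0 + h * (i + 0.5) for i in range(20)]
expected_counts = [scaled_density(m, 0.0, 1.0, n, h) for m in midpoints]

# The scaled curve, evaluated at the bin midpoints, sums to roughly n
print(round(sum(expected_counts), 3))
```

Evaluating the scaled curve at the bin midpoints gives the expected bar heights, so the curve and the bars share the same vertical scale.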

Analytics | Learn SAS

Rick Wicklin, May 20, 2024

On the correctness of a discrete simulation

After writing a program that simulates data, it is important to check that the statistical properties of the simulated (synthetic) data match the properties of the model. As a first step, you can generate a large random sample from the model distribution and compare the sample statistics to the expected

Analytics | Programming Tips

Rick Wicklin, May 13, 2024

The distribution of p-values under the null hypothesis

A SAS statistical programmer recently asked a theoretical question about statistics. "I've read that 'p-values are uniformly distributed under the null hypothesis,'" he began, "but what does that mean in practice? Is it important?" I think data simulation is a great way to discuss the conditions for which p-values are
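
One way to see the claim in practice is a small simulation. The sketch below is an illustration in Python, not the article's program. It assumes a two-sided z-test with known standard deviation, applied to normal samples for which the null hypothesis is true; the function names and the default sample size are hypothetical choices.

```python
import math
import random
from statistics import NormalDist


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: mean = mu0, with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate_p_values(num_sims, n=20, seed=1):
    """Draw num_sims samples for which H0 is TRUE; return the p-values."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(0.0, 1.0) for _ in range(n)], 0.0, 1.0)
        for _ in range(num_sims)
    ]


p_values = simulate_p_values(5000)

# If the p-values are approximately uniform on (0, 1), then about 5% of
# them fall below 0.05 and their mean is close to 0.5.
frac_below = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"fraction below 0.05: {frac_below:.3f}")
print(f"mean p-value:        {sum(p_values) / len(p_values):.3f}")
```

In this setting, the fraction of p-values below any cutoff alpha should be close to alpha, which is the practical meaning of "uniformly distributed": a test at level 0.05 rejects a true null hypothesis about 5% of the time.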

Learn SAS | Programming Tips

Rick Wicklin, May 8, 2024

Dice and the correctness of a simulation

At a recent conference in Las Vegas, a presenter simulated the sum of two dice and used it to simulate the game of craps. I write a lot of simulations, so I'd like to discuss two related topics: How to simulate the sum of two dice in SAS. This is
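
The article works in SAS; as a language-neutral illustration, here is a minimal Python sketch of the same idea. It simulates the sum of two fair dice by adding two independent rolls, then compares the simulated proportions with the exact probabilities. The function names, seed, and sample size are hypothetical choices, not the article's.

```python
import random
from collections import Counter


def exact_two_dice_pmf():
    """Exact P(sum = s) for two fair dice: (6 - |s - 7|) / 36."""
    return {s: (6 - abs(s - 7)) / 36 for s in range(2, 13)}


def simulate_two_dice(n, seed=12345):
    """Estimate P(sum = s) from n simulated rolls of two fair dice."""
    rng = random.Random(seed)
    # Roll each die separately, then add. The sum is NOT uniform on 2..12.
    counts = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n))
    return {s: counts[s] / n for s in range(2, 13)}


exact = exact_two_dice_pmf()
estimate = simulate_two_dice(100_000)

# Compare the simulated proportions with the exact probabilities
for s in range(2, 13):
    print(f"{s:2d}  exact={exact[s]:.4f}  simulated={estimate[s]:.4f}")
```

A common mistake is to draw the sum directly as a uniform integer from 2 to 12. Comparing against the exact probabilities catches that error immediately, because a sum of 7 should occur six times as often as a sum of 2.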

Analytics | Learn SAS | Programming Tips

Rick Wicklin, March 20, 2024

Maximum likelihood estimates for linear regression

A statistical analyst used the GENMOD procedure in SAS to fit a linear regression model. He noticed that the table of parameter estimates has an extra row (labeled "Scale") that is not a regression coefficient. The "scale parameter" is not part of the parameter estimates table produced by PROC REG

Analytics | Learn SAS

Rick Wicklin, June 7, 2023

Visualize the Spearman rank correlation

A previous article explains the Spearman rank correlation, which is a robust cousin to the more familiar Pearson correlation. I've also discussed why you might want to use rank correlation, and how to interpret the strength of a rank correlation. This article gives a short example that helps you to

Tag: Statistical Thinking