Statistical Thinking Archives

Analytics | Learn SAS | Programming Tips

Rick Wicklin, March 20, 2024

Maximum likelihood estimates for linear regression

A statistical analyst used the GENMOD procedure in SAS to fit a linear regression model. He noticed that the table of parameter estimates has an extra row (labeled "Scale") that is not a regression coefficient. The "scale parameter" is not part of the parameter estimates table produced by PROC REG

Analytics | Learn SAS

Rick Wicklin, June 7, 2023

Visualize the Spearman rank correlation

A previous article explains the Spearman rank correlation, which is a robust cousin to the more familiar Pearson correlation. I've also discussed why you might want to use rank correlation, and how to interpret the strength of a rank correlation. This article gives a short example that helps you to

Learn SAS | Machine Learning

Rick Wicklin, May 10, 2023

How good is an AI chatbot at SAS programming?

A lot of programmers have been impressed by the ability of ChatGPT, GPT-4, and Bing Chat to write computer programs. Recently, I wrote an article that discusses an elementary programming assignment, called FizzBuzz, which is sometimes used as part of a hiring process to assess a candidate's basic knowledge of

Analytics | Learn SAS | Programming Tips

Rick Wicklin, April 17, 2023

Should you use the Wald confidence interval for a binomial proportion?

The "Teacher’s Corner" of The American Statistician enables statisticians to discuss topics that are relevant to teaching and learning statistics. Sometimes, the articles have practical relevance, too. Andersson (2023), "The Wald Confidence Interval for a Binomial p as an Illuminating 'Bad' Example," is intended for professors and master's-level students in

Analytics

Rick Wicklin, April 10, 2023

Means and medians of subgroups

A journal article listed the mean, median, and size for subgroups of the data, but did not report the overall mean or median. A SAS programmer wondered what, if any, inferences could be made about the overall mean and median for the data. The answer is that you can calculate

Analytics

Rick Wicklin, February 22, 2023

What is the metalog distribution?

The metalog family of distributions (Keelin, Decision Analysis, 2016) is a flexible family that can model a wide range of continuous univariate data distributions when the data-generating mechanism is unknown. This article provides an overview of the metalog distributions. A subsequent article shows how to download and use a library

Analytics

Rick Wicklin, September 28, 2022

Definitions of moments in probability and statistics

The moments of a continuous probability distribution are often used to describe the shape of the probability density function (PDF). The first four moments (if they exist) are well known because they correspond to familiar descriptive statistics: The first raw moment is the mean of a distribution. For a random

Analytics | Programming Tips

Rick Wicklin, May 25, 2022

How much does a bootstrap estimate depend on the random number stream?

Many modern statistical techniques incorporate randomness: simulation, bootstrapping, random forests, and so forth. To use the technique, you need to specify a seed value, which determines pseudorandom numbers that are used in the algorithm. Consequently, the seed value also determines the results of the algorithm. In theory, if you know

Programming Tips

Rick Wicklin, January 20, 2022

How often do different statistical tests agree? A simulation study

Here's a fun problem to think about: Suppose that you have two different valid ways to test a statistical hypothesis. For a given sample, will both tests reject or fail to reject the hypothesis? Or might one test reject it whereas the other does not? The answer is that two

Analytics | Data Visualization

Rick Wicklin, November 8, 2021

The normal approximation and random samples of the binomial distribution

Recall that the binomial distribution is the distribution of the number of successes in a set of independent Bernoulli trials, each having the same probability of success. Most introductory statistics textbooks discuss the approximation of the binomial distribution by the normal distribution. The graph to the right shows that the
