Blog posts from 2023 that deserve a second look

In a previous article, I presented some of the most popular blog posts from 2023. The popular articles tend to discuss elementary topics that have broad appeal. However, I also wrote many technical articles about advanced topics. The following articles didn't make the Top 10 list, but they deserve a second look.

In the classic Christmas movie, Elf, Buddy the Elf explains that "elves try to stick to the four main food groups: candy, candy canes, candy corns and syrup." It's a funny line because, to a non-elf, these all seem like the same thing: sugar, and lots of it! Similarly, I have organized the blog posts into four main groups: statistics, statistical graphics, statistical programming, and SAS. If these groups seem like "computational statistics, and lots of it," I will not attempt to dissuade you.

Statistics (and probability)

Figure: A metalog model

Sometimes I read a journal article or book that is so interesting that I feel compelled to share the idea and show how to implement it in SAS.

Statistical graphics

Figure: A silhouette plot

I have written many articles about statistical graphics and recently wrote an article about 10 tips for creating effective statistical graphics. In addition, the following articles on statistical graphics show how to create interesting graphs:

  • Silhouette plots: The silhouette statistic (Rousseeuw, 1987) identifies observations in a cluster analysis that are potentially misclassified. The silhouette plot is a panel of bar charts (or histograms) that displays the distribution of the silhouette statistic for each cluster and enables you to assess the overall fit for the clustering method.
  • Log-scale histogram: Histograms estimate the probability density for a variable. If the variable spans several orders of magnitude, you can use a log scale for the horizontal axis of a histogram. If the bin counts span several orders of magnitude, you can use a log scale for the vertical axis. Both situations are potentially confusing and should be handled with care. This article discusses the advantages and potential pitfalls of a log-scale histogram.
  • Prediction intervals in regression models: Although many SAS regression procedures create a confidence band for the predicted value of a regression model, if you use bootstrapping or another estimation method, you might need to manually create a graph that visualizes the prediction limits. This article shows how to visualize confidence limits for the predicted mean in a regression model.
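To make the silhouette statistic from the first item concrete, here is a minimal sketch in Python (not the SAS code from the article). It uses the standard definition from Rousseeuw (1987): for each observation, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and s = (b - a) / max(a, b). The function name, the toy data, and the convention that a singleton cluster gets s = 0 are my assumptions for illustration.

```python
import math

def silhouette_values(points, labels):
    """Silhouette value s = (b - a) / max(a, b) for each point.

    Assumes Euclidean distance and at least two distinct cluster labels.
    """
    clusters = sorted(set(labels))
    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        # Mean distance from point i to the members of each cluster,
        # excluding point i itself.
        mean_dist = {}
        for c in clusters:
            d = [math.dist(p, q)
                 for j, (q, lab) in enumerate(zip(points, labels))
                 if lab == c and j != i]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[own]
        if a is None:
            values.append(0.0)   # assumed convention: singleton cluster
            continue
        # Smallest mean distance to any other (nonempty) cluster
        b = min(v for c, v in mean_dist.items() if c != own and v is not None)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values

# Toy data (hypothetical): the last point sits near cluster 1
# but is labeled as cluster 0.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (8, 8)]
labels = [0, 0, 0, 1, 1, 0]
s = silhouette_values(points, labels)
for pt, lab, val in zip(points, labels, s):
    print(pt, lab, round(val, 3))
```

In this toy example, the last point has a negative silhouette value, which flags it as potentially misclassified. A silhouette plot displays the distribution of these values for each cluster.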

Statistical programming

One of the goals of my blog is to show readers how to compute quantities or estimate statistics that are not directly obtainable by calling a SAS procedure. This requires writing programs. Although the SAS DATA step and PROC FCMP are powerful tools, for advanced programming, I use the SAS IML language, which enables high-level matrix-vector programming.

Figure: An acceptance-rejection envelope

SAS

Figure: An image in a graph

Almost every article I write includes some sort of SAS programming. Sometimes the task requires a bit of ingenuity to combine several techniques.

Your turn

Did I omit one of your favorite blog posts from The DO Loop in 2023? If so, leave a comment and tell me what topic you found interesting or useful.

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
