The DO Loop

Analytics | Learn SAS

Rick Wicklin | June 3, 2020 | 3

How to estimate the difference between percentiles

I recently read an article that describes ways to compute confidence intervals for the difference in a percentile between two groups. In Eaton, Moore, and MacKenzie (2019), the authors describe a problem in hydrology. The data are the sizes of pebbles (grains) in rivers at two different sites. The authors

Advanced Analytics | Machine Learning

Rick Wicklin | May 28, 2020 | 1

Minimizing the Kullback–Leibler divergence

The Kullback–Leibler divergence is a measure of dissimilarity between two probability distributions. An application in machine learning is to measure how distributions in a parametric family differ from a data distribution. This article shows that if you minimize the Kullback–Leibler divergence over a set of parameters, you can find a

Analytics | Data Visualization

Rick Wicklin | May 6, 2020 | 3

What does 'flatten the curve' mean? To which curve does it apply?

During this coronavirus pandemic, there are many COVID-related graphs and curves in the news and on social media. The public, politicians, and pundits scrutinize each day's graphs to determine which communities are winning the fight against coronavirus. Interspersed among these many graphs is the oft-repeated mantra, "Flatten the curve!" As

Analytics | Programming Tips

Rick Wicklin | May 4, 2020 | 0

Linear interpolation in SAS

SAS programmers sometimes ask about ways to perform one-dimensional linear interpolation in SAS. This article shows three ways to perform linear interpolation in SAS: PROC IML (in SAS/IML software), PROC EXPAND (in SAS/ETS software), and PROC TRANSREG (in SAS/STAT software). Of these, PROC IML is the simplest to use and

Analytics | Data Visualization

Rick Wicklin | April 22, 2020 | 1

Visualize the case fatality rate for COVID-19 in US counties

A previous article describes the funnel plot (Spiegelhalter, 2005), which can identify samples that have rates or proportions that are much different than expected. The funnel plot is a scatter plot that plots the sample proportion of some quantity against the size of the sample. The variance of the sample

Analytics | Data Visualization

Rick Wicklin | April 20, 2020 | 5

Use a funnel plot to visualize rates: The case fatality rate for COVID-19 in North Carolina counties

Death is always a difficult topic to discuss, and death has been in the news a lot during this tragic coronavirus pandemic. Many news stories focus on states, counties, or cities that have the most cases or the most deaths. A related statistic is the case fatality rate, which is

Data Visualization | Learn SAS

Rick Wicklin | March 30, 2020 | 0

Smokestack plots: A visualization technique for comparing cumulative curves

A cumulative curve shows the total amount of some quantity at multiple points in time. Examples include: Total sales of songs, movies, or books, beginning when the item is released. Total views of blog posts, beginning when the post is published. Total cases of a disease for different countries, beginning

Advanced Analytics | Data Visualization | Programming Tips

Rick Wicklin | March 9, 2020 | 2

ROC curves for a binormal sample

In a previous article, I discussed the binormal model for a binary classification problem. This model assumes a set of scores that are normally distributed for each population, and the mean of the scores for the Negative population is less than the mean of scores for the Positive population. I

Data Visualization | Learn SAS

Rick Wicklin | March 2, 2020 | 0

Create a deviation plot to visualize values relative to a baseline

A colleague recently posted an article about how to use SAS Visual Analytics to create a circular graph that displays a year's worth of temperature data. Specifically, the graph shows the air temperature for each day in a year relative to some baseline temperature, such as 65F (18C). Days warmer

Analytics | Data Visualization

Rick Wicklin | February 17, 2020 | 2

Visualize collinearity diagnostics

A previous article shows how to interpret the collinearity diagnostics that are produced by PROC REG in SAS. The process involves scanning down numbers in a table in order to find extreme values. This can be a tedious and error-prone process. Friendly and Kwan (2009) compare this task to a

Tag: Data Analysis