Rick Wicklin, Author at The DO Loop

Rick Wicklin, November 2, 2020

Tips to simulate binary and categorical variables

When there are two equivalent ways to do something, I advocate choosing the one that is simpler and more efficient. Sometimes, I encounter a SAS program that simulates random numbers in a way that is neither simple nor efficient. This article demonstrates two improvements that you can make to your

Analytics

Monte Carlo distribution of skewness statistic (B=10000, N=100)

Rick Wicklin, October 28, 2020

The sample skewness is a biased statistic

The skewness of a distribution indicates whether a distribution is symmetric or not. The Wikipedia article about skewness discusses two common definitions for the sample skewness, including the definition used by SAS. In the middle of the article, you will discover the following sentence: In general, the [estimators] are both biased

Analytics | Programming Tips

Graphical comparison of two methods for estimating confidence intervals of eigenvalues of a correlation matrix

Rick Wicklin, October 26, 2020

Confidence intervals for eigenvalues of a correlation matrix

A fundamental principle of data analysis is that a statistic is an estimate of a parameter for the population. A statistic is calculated from a random sample. This leads to uncertainty in the estimate: a different random sample would have produced a different statistic. To quantify the uncertainty, SAS procedures

Data Visualization | Programming Tips

Decomposition of a convex polygon into triangles

Rick Wicklin, October 21, 2020

Generate random points in a polygon

The triangulation theorem for polygons says that every simple polygon can be triangulated. In fact, if the polygon has V vertices, you can decompose it into V-2 non-overlapping triangles. In this article, a "polygon" always means a simple polygon. Also, a "random point" means one that is drawn at random
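For the special case of a convex polygon, one simple way to obtain the V-2 triangles is a "fan" that joins the first vertex to every other edge. The following is a minimal sketch in Python rather than SAS. The function name `fan_triangulate` is hypothetical, and the fan construction is only assumed to be valid when the polygon is convex and the vertices are listed in order around the boundary; a general simple polygon needs a different algorithm.

```python
def fan_triangulate(vertices):
    """Split a convex polygon into V-2 triangles that share vertices[0].

    Assumption: `vertices` is a list of (x, y) tuples given in order
    around the boundary of a CONVEX polygon.
    """
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least 3 vertices")
    v0 = vertices[0]
    # Triangle i uses the anchor vertex and the i-th boundary edge.
    return [(v0, vertices[i], vertices[i + 1])
            for i in range(1, len(vertices) - 1)]


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    for tri in fan_triangulate(square):
        print(tri)
```

A polygon with four vertices yields two triangles, a pentagon yields three, and so on, which matches the V-2 count in the theorem.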

Analytics | Programming Tips

Rick Wicklin, October 19, 2020

Generate random points in a triangle

How can you efficiently generate N random uniform points in a triangular region of the plane? There is a very cool algorithm (which I call the reflection method) that makes the process easy. I no longer remember where I saw this algorithm, but it is different from the "weighted average"
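The excerpt does not spell out the steps of the reflection method, so the following Python sketch is an assumption about what is meant. It uses a widely known approach: draw a uniform point in the parallelogram spanned by two edges of the triangle, and if the point lands in the far half, reflect it back into the triangle. The function name `random_points_in_triangle` and its `seed` parameter are hypothetical.

```python
import random


def random_points_in_triangle(p1, p2, p3, n, seed=None):
    """Return n uniform random points in the triangle with vertices p1, p2, p3.

    Sketch of a reflection-style method (assumed, not taken from the article).
    """
    rng = random.Random(seed)
    # Edge vectors that span the parallelogram anchored at p1.
    a = (p2[0] - p1[0], p2[1] - p1[1])
    b = (p3[0] - p1[0], p3[1] - p1[1])
    points = []
    for _ in range(n):
        u, v = rng.random(), rng.random()
        if u + v > 1:
            # The point is in the "wrong" half of the parallelogram:
            # reflect it through the center so it lands in the triangle.
            u, v = 1 - u, 1 - v
        points.append((p1[0] + u * a[0] + v * b[0],
                       p1[1] + u * a[1] + v * b[1]))
    return points


if __name__ == "__main__":
    for pt in random_points_in_triangle((0, 0), (1, 0), (0, 1), 5, seed=1):
        print(pt)
```

Every draw is used, so no points are rejected, which is what makes this kind of method efficient.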

Analytics | Data Visualization

Rick Wicklin, October 14, 2020

A continuous band plot for visualizing uncertainty in regression predictions

A previous article discusses the confidence band for the mean predicted value in a regression model. The article shows a "graded confidence band plot," which I saw in Claus O. Wilke's online book, Fundamentals of Data Visualization (Section 16.3). It communicates uncertainty in the predictions. A graded band plot is

Analytics | Data Visualization

Rick Wicklin, October 12, 2020

Visualize uncertainty in regression predictions

You've probably seen many graphs that are similar to the one at the right. This plot shows a regression line overlaid on a scatter plot of some data. Given a value for the independent variable (x), the regression line gives the best prediction for the mean of the response variable

Analytics | Programming Tips

Rick Wicklin, October 7, 2020

The Poisson-binomial distribution for hundreds of parameters

A previous article shows how to use a recursive formula to compute exact probabilities for the Poisson-binomial distribution. The recursive formula is an O(N^2) computation, where N is the number of parameters for the Poisson-binomial (PB) distribution. If you have a distribution that has hundreds (or even thousands) of parameters,
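To make the quadratic cost concrete, here is a minimal Python sketch of one standard recursion for the PB density. It may not be the exact formula from the previous article; the function name `poisson_binomial_pdf` is hypothetical. The idea is to add one Bernoulli trial at a time: after processing a trial with success probability p, the probability of k successes is (1-p) times the old probability of k successes plus p times the old probability of k-1 successes.

```python
def poisson_binomial_pdf(probs):
    """Return [P(X=0), ..., P(X=N)] for a sum of independent Bernoulli(p_j) trials.

    Sketch of a standard O(N^2) recursion (assumed, not copied from the article).
    """
    pdf = [1.0]  # with zero trials, P(X=0) = 1
    for p in probs:
        nxt = [0.0] * (len(pdf) + 1)
        for k, prob in enumerate(pdf):
            nxt[k] += prob * (1 - p)   # the new trial fails: count stays at k
            nxt[k + 1] += prob * p     # the new trial succeeds: count becomes k+1
        pdf = nxt
    return pdf


if __name__ == "__main__":
    print(poisson_binomial_pdf([0.2, 0.7, 0.9]))
```

The inner loop runs 1, 2, ..., N times as the trials are added, which is where the O(N^2) count comes from.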

Programming Tips

Rick Wicklin, October 5, 2020

Trap and map: Trapping invalid values

Finite-precision computations can be tricky. You might know, mathematically, that a certain result must be non-negative or must be within a certain interval. However, when you actually compute that result on a computer that uses finite-precision arithmetic, you might observe that the value is slightly negative or slightly outside of the
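One way to read "trap and map" is: detect (trap) a value that is outside the valid interval by no more than a tiny tolerance, and replace (map) it with the nearest endpoint. The Python sketch below illustrates that reading; it is an assumption, not the article's code. The function name `trap_and_map` and the default tolerance are hypothetical choices.

```python
import math


def trap_and_map(x, lo, hi, tol=1e-12):
    """Map x onto [lo, hi] only when it misses the interval by at most tol.

    Values that are far outside the interval are returned unchanged so that
    genuine errors are not silently hidden.
    """
    if lo - tol <= x < lo:
        return lo
    if hi < x <= hi + tol:
        return hi
    return x


if __name__ == "__main__":
    # Mathematically zero, but rounding makes it slightly negative.
    value = 0.3 - 0.1 * 3
    # math.sqrt(value) would raise ValueError; the trapped value is safe.
    print(math.sqrt(trap_and_map(value, 0.0, 1.0)))
```

Leaving far-out values untouched is a deliberate design choice in this sketch: a result of -0.5 where a non-negative number is expected usually signals a real bug, not rounding.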

Programming Tips

PDF of the Poisson-binomial distribution

Rick Wicklin, September 30, 2020

Density, CDF, and quantiles for the Poisson-binomial distribution

When working with a probability distribution, it is useful to know how to compute four essential quantities: a random sample, the density function, the cumulative distribution function (CDF), and quantiles. I recently discussed the Poisson-binomial distribution and showed how to generate a random sample. This article shows how to compute
