Most homeowners know that large home improvement projects can take longer than you expect. Whether it's remodeling a kitchen, adding a deck, or landscaping a yard, big projects are expensive and subject to a lot of uncertainty. Factors such as weather, the availability of labor, and the supply of materials,
Author
A previous article describes the metalog distribution (Keelin, 2016). The metalog distribution is a flexible family of distributions that can model a wide range of shapes for data distributions. The metalog system can model bounded, semibounded, and unbounded continuous distributions. This article shows how to use the metalog distribution in
Happy Pi Day! Every year on March 14th (written 3/14 in the US), people in the mathematical sciences celebrate "all things pi-related" because 3.14 is the three-decimal approximation to π ≈ 3.14159265358979.... Modern computer methods and algorithms enable us to calculate 100 trillion digits of π. However, I think it
Undergraduate textbooks on probability and statistics typically prove theorems that show how the variance of a sum of random variables is related to the variance of the original variables and the covariance between them. For example, the Wikipedia article on Variance contains an equation for the sum of two random
A SAS programmer wanted to compute the distribution of X-Y, where X and Y are two beta-distributed random variables. Pham-Gia and Turkkan (1993) derive a formula for the PDF of this distribution. Unfortunately, the PDF involves evaluating a two-dimensional generalized hypergeometric function, which is not available in all programming languages.
In applied mathematics, there is a large collection of "special functions." These function appera in certain applications, often as the solution to a differential equation, but also in the definition of probability distributions. For example, I have written about Bessel functions, the complete and incomplete gamma function, and the complete
The metalog family of distributions (Keelin, Decision Analysis, 2016) is a flexible family that can model a wide range of continuous univariate data distributions when the data-generating mechanism is unknown. This article provides an overview of the metalog distributions. A subsequent article shows how to download and use a library
A SAS programmer asked for help to simulate data from a distribution that has certain properties. The distribution must be supported on the interval [a, b] and have a specified mean, μ, where a < μ < b. It turns out that there are infinitely many distributions that satisfy these
You can use a Markov transition matrix to model the transition of an entity between a set of discrete states. A transition matrix is also called a stochastic matrix. A previous article describes how to use transition matrices for stochastic modeling. You can estimate a Markov transition matrix by using
SAS programmers love to make special graphs for Valentine's Day. In fact, there is a long history of heart-shaped graphs and love-inspired programs written in SAS! Last year, I added to the collection by showing how a ball bounces on a heart-shaped billiards table. This year, I create a similar
This article is about how to use Git to share SAS programs, specifically how to share libraries of SAS IML functions. Some IML programmers might remember an earlier way to share libraries of functions: SAS/IML released "packages" in SAS 9.4m3 (2015), which enable you to create, document, share, and use
SAS supports the ColorBrewer system of color palettes from the ColorBrewer website (Brewer and Harrower, 2002). The ColorBrewer color ramps are available in SAS by using the PALETTE function in SAS IML software. The PALETTE function supports all ColorBrewer palettes, but some palettes are not interpretable by people with color
Did you know that about 8% of the world's men are colorblind? (More correctly, 8% of men are "color vision deficient," since they see colors, but not all colors.) Because of the "birthday paradox," in a room that contains eight men, the probability is 50% that at least one is
I previously discussed how to use the PUTLOG statement to write a message from the DATA step to the log in SAS. The PUTLOG statement is commonly used to write notes, warnings, and errors to the log. This article shows how to use the PRINTTOLOG subroutine in SAS IML software
Many experienced SAS programmers use the PUT statement to write messages to the log from a DATA step. But did you know that SAS supports the PUTLOG function, which is another way to write a message to the log? I use the PUTLOG statement in the DATA step for the
A previous article shows that you can use the Intercept parameter to control the ratio of events to nonevents in a simulation of data from a logistic regression model. If you decrease the intercept parameter, the probability of the event decreases; if you increase the intercept parameter, the probability of
This article shows that you can use the intercept parameter to control the probability of the event in a simulation study that involves a binary logistic regression model. For simplicity, I will simulate data from a logistic regression model that involves only one explanatory variable, but the main idea applies
In a previous article, I presented some of the most popular blog posts from 2022. In general, popular articles deal with elementary topics that have broad appeal. However, I also write articles about advanced topics. The following articles didn't make the Top 10 list, but they deserve a second look.
Since 2008, SAS has supported an interface for calling R from the SAS/IML matrix language. Many years ago, I wrote blog posts that describe how to call R from PROC IML. For SAS 9.4, the process of installing R and calling R from PROC IML is documented in the SAS/IML
Last year, I wrote almost 90 articles for The DO Loop blog. My most popular articles were about SAS programming, data visualization, statistics and data analysis, and matrix computations. If you missed these articles when I published them—or if you want to read them again!— here is the "Reader's Choice
A colleague posted a Christmas-themed code snippet that shows how to use the DATA step in SAS to output all the possible ways that Santa can hitch up a team of reindeer to pull his sled. The assumption is that Rudolph must lead the team, and the remaining reindeer are
A previous article describes how to use SAS IML software to construct common covariance structures that are encountered in mixed models. Each covariance matrix has several parameters, and you want to construct a matrix for any choice of the parameters. After you have constructed the covariance matrix, you can use
I always emphasize efficiency in statistical programming. I have previously written about why you should never multiply with a large diagonal matrix in the SAS IML language. The reason is that it is more efficient to use elementwise multiplication than matrix multiplication. Specifically, if d is a column vector, then
For Christmas 2021, I wrote an article about palettes of Christmas colors, chiefly shades of red, green, silver, and gold. One of my readers joked that she would like to use my custom palette to design her own Christmas wrapping paper! I remembered her jest when I saw some artwork
A probabilistic card trick is a trick that succeeds with high probability and does not require any skill from the person performing the trick. I have seen a certain trick mentioned several times on social media. I call it "ladders" or the "ladders game" because it reminds me of the
A SAS programmer was trying to simulate poker hands. He was having difficulty because the sampling scheme for simulating card games requires that you sample without replacement for each hand. In statistics, this is called "simple random sampling." If done properly, it is straightforward to simulate poker hands in SAS.
Recently, I needed to know "how much" of a piecewise linear curve is below the X axis. The coordinates of the curve were given as a set of ordered pairs (x1,y1), (x2,y2), ..., (xn, yn). The question is vague, so the first step is to define the question better. Should
A profile plot is a way to display multivariate values for many subjects. The optimal linear profile plot was introduced by John Hartigan in his book Clustering Algorithms (1975). In Michael Friendly's book (SAS System for Statistical Graphics, 1991), Friendly shows how to construct an optimal linear profile by using
A profile plot is a compact way to visualize many variables for a set of subjects. It enables you to investigate which subjects are similar to or different from other subjects. Visually, a profile plot can take many forms. This article shows several profile plots: a line plot of the
I recently blogged about how to compute the area of the convex hull of a set of planar points. This article discusses the expected value of the area of the convex hull for n random uniform points in the unit square. The article introduces an exact formula (due to Buchta,