Blogs

Author

Rick Wicklin
Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

Advanced Analytics | Machine Learning

Rick Wicklin, May 5, 2025

Implement a SMOTE simulation algorithm in SAS

A recent article describes the main features of simulation by using the Synthetic Minority Over-sampling Technique (SMOTE). SMOTE was created to oversample from a set of rare events prior to running a machine learning classification algorithm. However, at its heart, the SMOTE algorithm (Chawla et al., 2002) provides a way

Learn SAS | Machine Learning | Programming Tips

Rick Wicklin, April 28, 2025

The SMOTE method for generating synthetic data

The Synthetic Minority Over-sampling Technique (SMOTE) was created to address class-imbalance problems in machine learning algorithms. The idea is to oversample from the rare events prior to running a machine learning classification algorithm. However, at its heart, the SMOTE algorithm (Chawla et al., 2002) is essentially a way to simulate

Analytics | Data Visualization | Programming Tips

Rick Wicklin, April 21, 2025

Updating old SAS programs: A case study with PROC LOESS

SAS programmers love to brag that SAS will still run a program they wrote twenty or forty years ago. This is both a blessing and a curse. It's a blessing because it frees the statistical programmer from needing to revisit and rewrite code that was written long ago. It's a

Analytics | Programming Tips

Rick Wicklin, April 14, 2025

Newton's minimization method

Isaac Newton had many amazing scientific and mathematical accomplishments. His law of universal gravitation and his creation of calculus are at the top of the list! But in the field of numerical analysis, "Newton's Method" was a groundbreaking advancement for solving for a root of a nonlinear smooth function. The

Learn SAS | Programming Tips

Rick Wicklin, April 8, 2025

The Golden Section minimization method

Newton's method was in the news this week. Not the well-known linear method for finding roots, but a more complicated method for finding minima, sometimes called the method of successive parabolic approximations. Newton's parabolic method was recently improved by modern researchers who extended the method to use higher-dimensional polynomials. The

Analytics

Rick Wicklin, April 2, 2025

Did George Box say, "All models are wrong, but some are useful"?

Nearly every statistician has heard the aphorism, "All models are wrong, but some are useful." The quote is attributed to George Box, an early and influential thinker about statistics. Did George Box actually say this quote? Yes, he did. The first part of the quote ("All models are wrong") appeared

Data Visualization | Programming Tips

Rick Wicklin, March 31, 2025

Nested bar charts in SAS

After giving a talk about how to create effective statistical graphics in SAS, I was asked a question: "When do you suggest using the graph template language (GTL) to build graphs?" I replied that I turn to the GTL when I cannot create the graph I want by using PROC

Analytics | Data Visualization | Programming Tips

Rick Wicklin, March 24, 2025

The quantile fit plot: Comparing empirical and predicted quantiles for a univariate model

A common task in statistics is to model data by using a parametric probability distribution, such as the normal, lognormal, beta, or gamma distributions. There are many ways to assess how well the model fits the data, including graphical methods such as a Q-Q plot and formal statistical tests such

Learn SAS | Programming Tips

Rick Wicklin, March 17, 2025

Find the location of a run-time error in a user-defined function in SAS

Every programmer makes errors. Therefore, learning to debug a program is an important part of learning to program. Another skill is learning to decipher cryptic error messages, which can be as hard to interpret as hieroglyphs. One helpful skill is learning to navigate a "traceback" error. A traceback error message

Analytics | Programming Tips

Rick Wicklin, March 10, 2025

Pi to the power of pi

Happy Pi Day! Every year on March 14th (written 3/14 in the US), people in the mathematical sciences celebrate "all things pi-related" because 3.14 is the three-decimal approximation to π ≈ 3.14159265358979.... The purpose of this day is to have fun, celebrate the importance of mathematics, and maybe learn a

Analytics | Learn SAS | Programming Tips

Rick Wicklin, March 3, 2025

An explicit formula for eigenvalues of an AR(1) correlation matrix

The first-order autoregressive (AR(1)) correlation structure is important for applications in time series modeling and for repeated measures analysis. The AR(1) model provides a simple situation in which measurements (on the same subject) that are closer in time are correlated more strongly than measurements recorded far apart. The AR(1) model uses

Analytics | Learn SAS

Rick Wicklin, February 24, 2025

Use the EFFECTPLOT statement to visualize binomial regression models in SAS

In a binomial regression model, the response variable is the proportion of successes for a given number of trials. In SAS regression procedures, you specify a binomial model by using the EVENTS/TRIALS syntax on the MODEL statement. Many analysts use the LOGISTIC or GENMOD procedures to fit binomial models. Visualizing

Analytics | Learn SAS | Programming Tips

Rick Wicklin, February 17, 2025

Deviance residuals and the DEVIANCE function in SAS

Many people have an intuitive feel for residuals in least squares models and know that the sum of squared residuals is a goodness-of-fit measure. Generalized linear regression models use a different but related idea, called deviance residuals. What are deviance residuals, and how can you compute them? Deviance residuals (and

Learn SAS | Programming Tips

Rick Wicklin, February 10, 2025

Find inflection points for a function that is known only at discrete points

A previous article describes how to use SAS to find the inflection points of a 1-D function that you can evaluate at any point. The function must be given by a formula (or by an algorithm) because the root-finding algorithm needs to evaluate the function at arbitrary locations. However, sometimes

Learn SAS | Programming Tips

Rick Wicklin, February 5, 2025

Find an inflection point for a function numerically

A SAS programmer asked if it is possible to numerically find an inflection point for a univariate function, f(x). Yes! This can be solved as a variation of a classic numerical root-finding problem. Recall that an inflection point is a value (call it x0) in the domain where the graph

Analytics | Programming Tips

Rick Wicklin, February 3, 2025

Use the Lambert W function to solve equations that involve exponential functions

I previously wrote an article about the Lambert W function. The Lambert W function is the inverse of the function g(x) = x exp(x). This means that you can use it to find the value of x such that g(x)=c for any value of c in the range of g, which

Learn SAS | Programming Tips

Rick Wicklin, January 27, 2025

Find the real roots of polynomials in SAS

A SAS programmer had many polynomials for which he wanted to compute the real roots. By the Fundamental Theorem of Algebra, every polynomial of degree d has d complex roots. You can find these complex roots by using the POLYROOT function in SAS IML. The programmer only wanted to output

Learn SAS | Programming Tips

Rick Wicklin, January 22, 2025

SAS tip: Use the hyphen and colon operators to specify multiple data sets on the SET statement

Here's a SAS tip for you. Most SAS programmers know that SAS provides syntax that makes it easy to specify a list of variables. For example, you can use the hyphen and colon operators to specify lists of variables on many SAS statements: You can use the hyphen operator (-)

Advanced Analytics | Data Visualization

Rick Wicklin, January 15, 2025

Visualize correlation matrices that have the same eigenvalues

A colleague asked me an interesting question: Suppose you have a structured correlation matrix, such as a matrix that has a compound symmetric, banded, or an AR1(ρ) structure. If you generate a random correlation matrix that has the same eigenvalues as the structured matrix, does the random matrix have the

Analytics | Data Visualization | Learn SAS | Programming Tips

Rick Wicklin, January 13, 2025

12 blog posts from 2024 that deserve a second look

In a previous article, I presented some of the most popular blog posts from The DO Loop in 2024. In general, popular articles deal with elementary topics that have broad appeal. However, I also write technical articles about advanced topics, which typically do not make it onto a Top 10

Advanced Analytics

Geometric interpretation of the singular value decomposition (SVD) as the product of a rotation/reflection, followed by a scaling, followed by another rotation/reflection.

Rick Wicklin, January 8, 2025

Matrix norms and spectra

A previous article discusses covariance matrices that have the same set of eigenvalues. The set of eigenvalues is called the spectrum of the matrix. For symmetric matrices, the spectrum contains real numbers. For covariance matrices, which are positive semidefinite, the eigenvalues are nonnegative. It turns out that two symmetric matrices

Analytics | Data Visualization | Learn SAS | Programming Tips

Rick Wicklin, January 6, 2025

Top 10 posts from The DO Loop in 2024

In 2024, I wrote about 80 articles for The DO Loop blog. My most popular articles were about SAS programming, data visualization, and statistics. If you missed any of these articles, here is the "Reader's Choice Awards" for some of the most popular articles from 2024! SAS Programming The following

Analytics | Programming Tips

Rick Wicklin, December 18, 2024

Generate correlation matrices with specified eigenvalues

A previous article discusses how to generate a random covariance matrix with a specified set of (positive) eigenvalues. A SAS programmer asked whether it is possible to produce a correlation matrix that has a specified set of eigenvalues. After discussing the problem with a friend, I am happy to report

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin, December 16, 2024

A normal Christmas tree

O Christmas tree, O Christmas tree, How lovely are your branches! SAS programmers have a long history of creating yuletide-themed graphics. Christmas trees are a popular image because of their simplicity. I admit that I have indulged more than once in this holiday tradition: An old-school ASCII art image A

Advanced Analytics | Programming Tips

Rick Wicklin, December 9, 2024

Latin hypercube sampling in SAS

While researching the topic of Latin hypercube sampling (LHS), I read an article by Emily Gao (2019) that shows how to use PROC IML in SAS to perform the algorithm. It is possible to simplify Gao's implementation of Latin hypercube sampling in SAS while also making the computation more efficient.

Analytics | Programming Tips

Rick Wicklin, December 2, 2024

A historical method of generating random normal variates

Decades ago, it was a challenge to generate (pseudo-) random numbers that had good statistical properties. The proliferation of desktop computers in the 1980s and '90s led to many advances in computational mathematics, including better ways to generate pseudorandom variates from a wide range of probability distributions. (For brevity, I

Analytics | Data Visualization

Rick Wicklin, November 25, 2024

Order variables by using a loading plot

The article "Order two-dimensional vectors by using angles" shows how to re-order a set of 2-D vectors by their angles. Because angles are on a circle, which has no beginning and no end, you must specify which vector will appear first in the list. The previous article finds the largest

Analytics | Learn SAS | Programming Tips

Rick Wicklin, November 20, 2024

Order two-dimensional vectors by using angles

Order matters. The order of variables in tables and rows of a correlation matrix can make a big difference in how easy it is to observe correlations between variables or groups of variables. There are many ways to order the variables, but this article shows how to display the variables

Analytics | Learn SAS

Rick Wicklin, November 18, 2024

The correlation between two sets of variables

In a correlation analysis, it is common to consider the correlations between all pairs of numerical variables. That is, if there are k numerical variables, most people examine the complete k x k matrix of correlations. This matrix is symmetric and has 1s on the diagonal, so more than half of the

Learn SAS | Programming Tips

Rick Wicklin, November 13, 2024

A vector-to-string function for SAS IML

A previous article discusses the MakeString function, which you can use to convert an IML character vector into a string. This can be very useful. When I originally wrote the MakeString function, I was disappointed that I could not vectorize the computation. Recently, I learned about the COMBL function in
