Blogs

Tag: Statistical Programming

Learn SAS | Programming Tips

Rick Wicklin | July 24, 2024

QPSOLVE: A new SAS IML function for quadratic optimization

Since the pandemic began in 2020, the SAS IML developers have added about 50 new functions and enhancements to the SAS IML language in SAS Viya. Among these functions are modern optimization methods that have a simpler syntax than the older 'NLP' functions that are available

Learn SAS | Programming Tips

Rick Wicklin | June 24, 2024

Teaching an AI assistant to read and write SAS IML vectors

One of the most exciting features of SAS Viya Workbench is that the code editor includes a generative AI component called SAS Viya Copilot. This feature was announced and demonstrated at SAS Innovate 2024. With the Copilot, you can specify a text prompt that generates SAS code. For example, you

Analytics | Programming Tips

Rick Wicklin | April 29, 2024

Bimodal and unimodal beta distributions

In a recent article, I graphed the PDF of a few Beta distributions that had a variety of skewness and kurtosis values. I thought that I had chosen the parameter values to represent a wide variety of Beta shapes. However, I was surprised to see that the distributions were all

Analytics | Learn SAS | Programming Tips

Rick Wicklin | April 22, 2024

Use the moment-ratio diagram to visualize the sampling distribution of skewness and kurtosis

The moment-ratio diagram is a tool that is useful when choosing a distribution that models a sample of univariate data. As I show in my book (Simulating Data with SAS, Wicklin, 2013), you first plot the skewness and kurtosis of the sample on the moment-ratio diagram to see what common

Analytics | Learn SAS | Programming Tips

Rick Wicklin | April 15, 2024

Distributions with specified skewness and kurtosis

A SAS programmer wanted to simulate samples from a family of Beta(a,b) distributions for a simulation study. (Recall that a Beta random variable is bounded with values in the range [0,1].) She wanted to choose the parameters such that the skewness and kurtosis of the distributions varied over a range of

Analytics | Learn SAS

Rick Wicklin | March 27, 2024

The likelihood ratio test for linear regression in SAS

A recent article describes how to estimate coefficients in a simple linear regression model by using maximum likelihood estimation (MLE). One of the nice properties of an MLE formulation is that you can compare a large model with a nested submodel in a natural way. For example, if you can

Analytics | Learn SAS | Programming Tips

Rick Wicklin | March 20, 2024

Maximum likelihood estimates for linear regression

A statistical analyst used the GENMOD procedure in SAS to fit a linear regression model. He noticed that the table of parameter estimates has an extra row (labeled "Scale") that is not a regression coefficient. The "scale parameter" is not part of the parameter estimates table produced by PROC REG

Learn SAS | Programming Tips

Rick Wicklin | February 19, 2024

The linear distribution on an interval

In a recent Monte Carlo project, I needed to simulate numbers on an interval by using a continuous linear probability density function (PDF). An example is shown to the right. In this example, the linear density function is decreasing on the interval, but the function could also be constant or

Analytics | Learn SAS | Programming Tips

Rick Wicklin | February 12, 2024

An exact formula for the sampling distribution of the correlation coefficient

I read a journal article in which a researcher used a formula for the probability density function (PDF) of the sample correlation coefficient. The formula was rather complicated and was presented with no citation, so I was curious to learn more. I found the distribution for the correlation coefficient in the

Analytics | Learn SAS

Rick Wicklin | February 5, 2024

Peeling a convex hull

This article looks at a geometric method for estimating the center of a multivariate point cloud. The method is known as convex-hull peeling. In two dimensions, you can perform convex-hull peeling in SAS 9 by using the CVEXHULL function in SAS IML software. For higher dimensions, you can use the CONVEXHULL

Analytics | Learn SAS

Rick Wicklin | January 22, 2024

Angles vs slopes: The statistics of steepness

There are two popular ways to express the steepness of a line or ray. The most commonly used mathematical definition is from high-school math, where the slope is defined as "rise over run." A second way is to report the angle of inclination to the horizontal, as introduced in basic trigonometry.

Programming Tips

Rick Wicklin | January 10, 2024

Blog posts from 2023 that deserve a second look

In a previous article, I presented some of the most popular blog posts from 2023. The popular articles tend to discuss elementary topics that have broad appeal. However, I also wrote many technical articles about advanced topics. The following articles didn't make the Top 10 list, but they deserve a

Analytics | Data Visualization | Learn SAS | Programming Tips

Rick Wicklin | January 3, 2024

Top 10 posts from The DO Loop in 2023

In 2023, I wrote 90 articles for The DO Loop blog. My most popular articles were about SAS programming, data visualization, and statistics. In addition, several "general interest" articles were popular, including my article for Pi Day and an article about AI chatbots. If you missed any of these articles,

Analytics | Learn SAS

Rick Wicklin | December 19, 2023

The difference between frequencies and weights in a correlation analysis

Statistical software often includes support for a weight variable. Many SAS procedures make a distinction between integer frequencies and more general "importance weights." Frequencies are supported by using the FREQ statement in SAS procedures; general weights are supported by using the WEIGHT statement. An exception is PROC FREQ, which contains

Analytics | Learn SAS

Rick Wicklin | December 13, 2023

Estimate polychoric correlation by maximum likelihood estimation

SAS provides many built-in routines for data analysis. A previous article discusses polychoric correlation, which is a measure of association between two ordinal variables. In SAS, you can use PROC FREQ or PROC CORR to estimate the polychoric correlation, its standard error, and confidence intervals. Although SAS provides a built-in

Analytics | Learn SAS | Programming Tips

Rick Wicklin | December 4, 2023

Bivariate normal probability in SAS

A previous article discussed how to compute probabilities for the bivariate standard normal distribution. The standard bivariate normal distribution with correlation ρ is denoted BVN(0,ρ). For any point (x,y), you can use the PROBBNRM function in SAS to compute the probability that the random pair (X,Y) ~ BVN(0,ρ) is observed

Analytics | Programming Tips

Rick Wicklin | November 29, 2023

Bivariate normal probability in SAS: Rectangular regions

This article shows how to use SAS to compute the probabilities for two correlated normal variables. Specifically, this article shows how to compute the probabilities for rectangular regions in the plane. A second article discusses the computation over infinite regions such as quadrants. If (X,Y) are random variables that are

Analytics | Data Visualization | Programming Tips

Rick Wicklin | November 27, 2023

An example of finite-precision issues in a simple collinearity algorithm

The collinearity problem is to determine whether three points in the plane lie along a straight line. You can solve this problem by using middle-school algebra. An algebraic solution requires three steps. First, name the points: p, q, and r. Second, find the parametric equation for the line that passes

Learn SAS | Programming Tips

Rick Wicklin | November 15, 2023

On resizing an array when an index is out of bounds

Converting a program from one language to another can be a challenge. Even if the languages share many features, there is often syntax that is valid in one language that is not valid in another. Recently, a SAS programmer was converting a program from R to SAS IML. He reached

Analytics | Learn SAS | Programming Tips

Rick Wicklin | November 6, 2023

Standard errors for maximum likelihood estimation

In several previous articles, I've shown how to use SAS to fit models to data by using maximum likelihood estimation (MLE). However, I have not previously shown how to obtain standard errors for the estimates. This article combines two previous articles to show how to obtain MLE estimates and the

Learn SAS | Programming Tips

Rick Wicklin | November 1, 2023

The distribution of the sample median for normal data

A previous article shows how to use Monte Carlo simulation to approximate the sampling distribution of the sample mean and sample median. When x ~ N(0,1) are normal data, the sample mean is also normal, and there are simple formulas for the expected value and the standard error of the

Learn SAS | Programming Tips

Rick Wicklin | October 30, 2023

The distribution of the sample median

An elementary course in statistics often includes a discussion of the sampling distribution of a statistic. The canonical example is the sampling distribution of the sample mean. For samples of size n that are drawn from a normal distribution (X ~ N(μ, σ)), the sample mean is normally distributed as

Learn SAS | Programming Tips

Rick Wicklin | October 25, 2023

Quantiles of the generalized birthday problem

A previous article discusses the birthday problem and its generalizations. The classic birthday problem asks, "In a room that contains N people, what is the probability that two or more people share a birthday?" The probability is much higher than you might think. For example, in a room that contains

Learn SAS | Programming Tips

Rick Wicklin | October 23, 2023

The generalized birthday problem

The birthday-matching problem (also called the birthday paradox or simply the birthday problem) is a classic problem in probability. Simply stated, the birthday-matching problem asks, "If there are N people in a room, what is the chance that two of them have the same birthday?" The problem is sometimes called

Learn SAS | Programming Tips

Rick Wicklin | October 9, 2023

Functions for continuous probability distributions in SAS

The documentation for Python's SciPy package provides a table that concisely summarizes functions that are associated with continuous probability distributions. This article provides a similar table for SAS functions. For more information on the CDF, PDF, quantile, and random-variate functions, see "Four essential functions for statistical programmers." SAS functions for

Learn SAS | Programming Tips

Rick Wicklin | September 11, 2023

On the performance of BY-group processing in SAS IML

Many SAS procedures support a BY statement that enables you to perform an analysis for each unique value of a BY-group variable. The SAS IML language does not support a BY statement, but you can program a loop that iterates over all BY groups. You can emulate BY-group processing by

Analytics | Learn SAS | Programming Tips

Rick Wicklin | September 6, 2023

Model data from published summary statistics

There are many ways to model a set of raw data by using a continuous probability distribution. It can be challenging, however, to choose the distribution that best models the data. Are the data normal? Lognormal? Is there a theoretical reason to prefer one distribution over another? The SAS has

Learn SAS | Programming Tips

Rick Wicklin | July 31, 2023

Create a probability distribution from almost any positive function

There are dozens of common probability distributions for a continuous univariate random variable. Familiar examples include the normal, exponential, uniform, gamma, and beta distributions. Where did these distributions come from? Well, some mathematician needed a model for a stochastic process and wrote down the equation for the distribution, typically by

Analytics | Programming Tips

Rick Wicklin | July 24, 2023

Modifications of the Wilcoxon signed rank test and exact p-values

In a previous article, I discussed the Wilcoxon signed rank test, which is a nonparametric test for the location of the median. The Wikipedia article about the signed rank test mentions a variation of the test due to Pratt (1959). Whereas the standard Wilcoxon test excludes values that equal μ0

Analytics | Learn SAS | Programming Tips

Rick Wicklin | July 19, 2023

On the computation of the Wilcoxon signed rank statistic

Wilcoxon's signed rank test is a popular nonparametric alternative to a paired t test. In a paired t test, you analyze measurements for subjects before and after some treatment or intervention. You analyze the difference in the measurements for each subject, and test whether the mean difference is significantly different
