Blogs

Tag: Statistical Programming

Analytics | Learn SAS | Programming Tips

Rick Wicklin | October 24, 2022

Implement binary logistic regression from first principles

I recently gave a presentation about the SAS/IML matrix language in which I emphasized that a matrix language enables you to write complex analyses by using only a few lines of code. In the presentation, I used least squares regression as an example. One participant asked how many additional lines

Read More

Analytics | Learn SAS | Programming Tips

Rick Wicklin | October 3, 2022

Compute moments of probability distributions in SAS

A previous article discusses the definitions of three kinds of moments for a continuous probability distribution: raw moments, central moments, and standardized moments. These are defined in terms of integrals over the support of the distribution. Moments are connected to the familiar shape features of a distribution: the mean, variance,

Read More

Analytics | Learn SAS

Rick Wicklin | September 21, 2022

The noncentral t distribution in SAS

The noncentral t distribution is a probability distribution that is used in power analysis and hypothesis testing. The distribution generalizes the Student t distribution by adding a noncentrality parameter, δ. When δ=0, the noncentral t distribution is the usual (central) t distribution, which is a symmetric distribution. When δ >

Read More

Learn SAS | Programming Tips

Rick Wicklin | September 12, 2022

Convert integers from base 10 to another base

An integer can be represented in many ways. This article shows how to represent a positive integer in any base b. The most common base is b=10, but other popular bases are b=2 (binary numbers), b=8 (octal), and b=16 (hexadecimal). Each base represents integers in different ways. Think of a

Read More
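
As a rough illustration of the idea in the teaser above, the base-b digits of a positive integer can be produced by repeated division by b, collecting remainders. This is a minimal Python sketch, not the SAS code from the article; the function name `to_base` and the limit of b ≤ 36 (digits 0-9 followed by A-Z) are assumptions made for this example.

```python
def to_base(n: int, b: int) -> str:
    """Return the base-b representation of a positive integer n (2 <= b <= 36).

    Hypothetical helper for illustration only.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 2 <= b <= 36:
        raise ValueError("b must be between 2 and 36")
    digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    out = []
    while n > 0:
        n, r = divmod(n, b)      # r is the next digit, least significant first
        out.append(digits[r])
    return "".join(reversed(out))  # most significant digit first


print(to_base(255, 2), to_base(255, 8), to_base(255, 16))
```

Python's built-in `int(text, base)` performs the inverse conversion, which gives a convenient round-trip check.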

Learn SAS | Programming Tips

Rick Wicklin | August 31, 2022

Two types of syntax for the SELECT-WHEN statement in SAS

The SELECT-WHEN statement in the SAS DATA step is an alternative to using a long sequence of IF-THEN/ELSE statements. Although logically equivalent to IF-THEN/ELSE statements, the SELECT-WHEN statement can be easier to read. This article discusses the two distinct ways to specify the SELECT-WHEN statement. You can use the first

Read More

Learn SAS | Programming Tips

Rick Wicklin | July 20, 2022

A tip for visualizing the density function of a distribution

It isn't easy to draw the graph of a function when you don't know what the graph looks like. To draw the graph by using a computer, you need to know the domain of the function for the graph: the minimum value (xMin) and the maximum value (xMax) for plotting

Read More

Learn SAS | Programming Tips

Rick Wicklin | July 18, 2022

Tips for computing right-tail probabilities and quantiles

A colleague was struggling to compute a right-tail probability for a distribution. Recall that the cumulative distribution function (CDF) is defined as a left-tail probability. For a continuous random variable, X, with density function f, the CDF at the value x is F(x) = Pr(X ≤ x) = ∫

Read More
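
Because the CDF is a left-tail probability, a right-tail probability is Pr(X > x) = 1 - F(x). The sketch below shows why it can be better to compute the right tail directly instead of subtracting from 1. It is a Python illustration under stated assumptions, not the SAS code from the article: the distribution is taken to be the standard normal, and the names `normal_cdf` and `normal_sdf` are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Left-tail probability Pr(X <= x) for a standard normal X."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_sdf(x: float) -> float:
    """Right-tail probability Pr(X > x), computed directly with erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


x = 10.0
naive = 1.0 - normal_cdf(x)   # the CDF rounds to 1, so the difference underflows to 0
direct = normal_sdf(x)        # tiny but positive
print(naive, direct)
```

For moderate x the two computations agree closely; the difference matters only far out in the tail, where 1 - F(x) loses all of its significant digits.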

Analytics | Learn SAS

Rick Wicklin | July 11, 2022

Confidence bands for partial leverage regression plots

I previously wrote about partial leverage plots for regression diagnostics and why they are useful. You can generate a partial leverage plot in SAS by using the PLOTS=PARTIALPLOT option in PROC REG. One useful property of partial leverage plots is the ability to graphically represent the null hypothesis that a

Read More

Analytics | Data Visualization | Programming Tips

Rick Wicklin | June 29, 2022

Compute the multivariate t density function

A previous article shows how to compute the probability density function (PDF) for the multivariate normal distribution. In a similar way, you can compute the density function for the multivariate t distribution. This article discusses the density function for the multivariate t distribution, shows how to compute it, and visualizes

Read More

Analytics | Programming Tips

Rick Wicklin | June 27, 2022

The derivative of a quantile function

Recently, I needed to solve an optimization problem in which the objective function included a term that involved the quantile function (inverse CDF) of the t distribution, which is shown to the right for DF=5 degrees of freedom. I casually remarked to my colleague that the optimizer would have to

Read More

Programming Tips

Rick Wicklin | June 1, 2022

Random assignment of subjects to groups in SAS

A common question on SAS discussion forums is how to randomly assign observations to groups. An application of this problem is assigning patients to cohorts in a clinical trial. For example, you might have 137 patients that you want to randomly assign to three groups: a control group, a group

Read More
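
One common way to assign subjects to groups at random is to shuffle the subject IDs and then deal them out in turn, which keeps the group sizes within one of each other. The Python sketch below illustrates that approach for the 137 patients and three groups mentioned in the teaser; it is not the SAS code from the article. The function name `assign_groups`, the group labels "A", "B", "C", and the seed are hypothetical choices for this example.

```python
import random


def assign_groups(n_subjects, labels, seed=None):
    """Randomly assign subject IDs 1..n_subjects to the given group labels.

    Returns a dict {subject_id: label}. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)          # local generator so results are reproducible
    ids = list(range(1, n_subjects + 1))
    rng.shuffle(ids)                   # random order of subjects
    # Deal the shuffled IDs round-robin; group sizes differ by at most one.
    return {sid: labels[i % len(labels)] for i, sid in enumerate(ids)}


assignment = assign_groups(137, ["A", "B", "C"], seed=12345)
sizes = {g: list(assignment.values()).count(g) for g in ["A", "B", "C"]}
print(sizes)
```

With 137 subjects and three labels, this scheme yields two groups of 46 and one group of 45.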

Learn SAS | Programming Tips

Rick Wicklin | May 16, 2022

How to unroll frequency data

In categorical data analysis, it is common to analyze tables of counts. For example, a researcher might gather data for 18 boys and 12 girls who apply for a summer enrichment program. The researcher might be interested in whether the proportion of boys that are admitted is different from the

Read More
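
To "unroll" frequency data means to expand each row of a table of counts into that many individual records. The Python sketch below shows the idea; it is not the SAS code from the article. The teaser gives only the totals (18 boys, 12 girls), so the admitted and not-admitted counts in the table are hypothetical values chosen to match those totals, and the function name `unroll` is also an assumption.

```python
def unroll(freq_rows):
    """Expand rows of the form (category, ..., count) into one record per observation."""
    records = []
    for *values, count in freq_rows:
        records.extend([tuple(values)] * count)   # repeat the categories 'count' times
    return records


# Hypothetical cell counts: 18 boys and 12 girls in total, as in the teaser.
table = [
    ("boy", "admitted", 10),
    ("boy", "not admitted", 8),
    ("girl", "admitted", 9),
    ("girl", "not admitted", 3),
]
rows = unroll(table)
print(len(rows))
```

The unrolled data have one record per applicant, which is the form that many analysis routines expect when no frequency variable is supported.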

Analytics | Learn SAS | Programming Tips

Rick Wicklin | May 2, 2022

Simulate the null distribution for a hypothesis test

Recently, I wrote about Bartlett's test for sphericity. The purpose of this hypothesis test is to determine whether the variables in the data are uncorrelated. It works by testing whether the sample correlation matrix is close to the identity matrix. Often statistics textbooks or articles include a statement such as

Read More

Data Visualization | Learn SAS | Programming Tips

Rick Wicklin | April 20, 2022

Use a heat map to visualize an ordinal response in longitudinal data

Recently, I showed how to use a heat map to visualize measurements over time for a set of patients in a longitudinal study. The visualization is sometimes called a lasagna plot because it presents an alternative to the usual spaghetti plot. A reader asked whether a similar visualization can be

Read More

Analytics | Learn SAS

Rick Wicklin | April 18, 2022

The McNemar test in SAS

What is McNemar's test? How do you run the McNemar test in SAS? Why might other statistical software report a value for McNemar's test that is different from the SAS value? SAS supports an exact version of the McNemar test, but when should you use it? This article answers these

Read More

Learn SAS | Programming Tips

Rick Wicklin | March 23, 2022

Compute properties of discrete probability distributions

This article shows how to compute properties of a discrete probability distribution from basic definitions. You can use the definitions to compute the mean, variance, and median of a discrete probability distribution when there is no simple formula for those quantities. This article is motivated by two computational questions about

Read More

Learn SAS | Programming Tips

Rick Wicklin | March 21, 2022

Five constants every statistical programmer should know

Statistical programmers need to access numerical constants that help us to write robust and accurate programs. Specifically, it is necessary to know when it is safe to perform numerical operations such as raising a number to a power without exceeding the largest number that is representable in finite-precision arithmetic. This

Read More

Analytics | Learn SAS | Programming Tips

Rick Wicklin | March 16, 2022

Least-squares optimization and the Gauss-Newton method

A previous article showed how to use SAS to compute finite-difference derivatives of smooth vector-valued multivariate functions. The article uses the NLPFDD subroutine in SAS/IML to compute the finite-difference derivatives. The article states that the third output argument of the NLPFDD subroutine "contains the matrix product J`*J, where J is

Read More

Analytics | Programming Tips

Rick Wicklin | March 7, 2022

Finite-difference derivatives of vector-valued functions

I previously showed how to use SAS to compute finite-difference derivatives for smooth scalar-valued functions of several variables. You can use the NLPFDD subroutine in SAS/IML software to approximate the gradient vector (first derivatives) and the Hessian matrix (second derivatives). The computation uses finite differences to approximate the derivatives. The

Read More

Analytics | Programming Tips

Rick Wicklin | March 2, 2022

Finite-difference derivatives in SAS

Many applications in mathematics and statistics require the numerical computation of the derivatives of smooth multivariate functions. For simple algebraic and trigonometric functions, you often can write down expressions for the first and second partial derivatives. However, for complicated functions, the formulas can get unwieldy (and some applications do not

Read More

Analytics | Programming Tips

Rick Wicklin | February 14, 2022

Passing-Bablok regression in SAS

This article implements Passing-Bablok regression in SAS. Passing-Bablok regression is a one-variable regression technique that is used to compare measurements from different instruments or medical devices. Both variables (X and Y) are measured with error. Consequently, you cannot use ordinary linear regression, which assumes that

Read More

Learn SAS | Programming Tips

Rick Wicklin | February 9, 2022

Billiards on a heart-shaped table

For some reason, SAS programmers like to express their love by writing SAS programs. Since Valentine's Day is next week, I thought I would add another SAS graphic to the collection of ways to use SAS to express your love. Last week, I showed how to use vector operation and

Read More

Learn SAS | Programming Tips

Rick Wicklin | February 7, 2022

Billiards on an elliptical table

I recently showed how to find the intersection between a line and a circle. While working on the problem, I was reminded of a fun mathematical game. Suppose you make a billiard table in the shape of a circle or an ellipse. What is the path for a ball at

Read More

Analytics | Programming Tips

Rick Wicklin | February 2, 2022

Implement a line search algorithm in SAS

Recently, I needed to implement a line search algorithm in SAS. The line search is illustrated by the figure at the right. You start with a point, p, in d-dimensional space and a direction vector, v. (In the figure, d=2, but in general d > 1.) The goal is to

Read More

Analytics | Programming Tips

Rick Wicklin | January 31, 2022

The ERF and ERFC functions for statisticians

Recently, a SAS programmer commented about one of my blog posts. He said that he had found an alternative answer on another website. Whereas my answer was formulated in terms of the normal cumulative distribution function (CDF), the other answer used the ERF function. This article shows the relationship between

Read More

Analytics | Learn SAS | Programming Tips

Rick Wicklin | January 10, 2022

12 blog posts from 2021 that deserve a second look

On this blog, I write about a diverse set of topics that are relevant to statistical programming and data visualization. In a previous article, I presented some of the most popular blog posts from 2021. The most popular articles often deal with elementary or familiar topics that are useful to

Read More

Analytics | Data Visualization | Programming Tips

Rick Wicklin | January 3, 2022

Top 10 posts from The DO Loop in 2021

Last year, I wrote almost 100 posts for The DO Loop blog. My most popular articles were about data visualization, statistics and data analysis, and simulation and bootstrapping. If you missed any of these gems when they were first published, here are some of the most popular articles from 2021:

Read More

Analytics | Data Visualization

Rick Wicklin | December 13, 2021

A principal component analysis of color palettes

In a previous article, I visualized seven Christmas-themed palettes of colors, as shown to the right. You can see that the palettes include many red, green, and golden colors. Clearly, the colors in the Christmas palettes are not a random sample from the space of RGB colors. Rather, they represent

Read More

Analytics | Programming Tips

Rick Wicklin | December 6, 2021

The expected number of points on a convex hull

While a colleague and I were discussing how to compute convex hulls in SAS, we wondered how the size of the convex hull compares to the size of the sample. For most distributions of points, I claimed that the size of the convex hull is much less than the size of the

Read More

Learn SAS

Rick Wicklin | November 22, 2021

What is a CAS-enabled procedure?

I attended a seminar last week whose purpose was to inform SAS 9 programmers about SAS Viya. I could tell from the programmers' questions that some programmers were confused about three basic topics: What are the computing environments in Viya, and how should a programmer think about them? What procedures

Read More
