Blogs

Tag: vectorization

Learn SAS | Programming Tips

Rick Wicklin | November 4, 2024

Levy flight and vectorizing a simulation in SAS

A previous article shows a simulation of two different models of a foraging animal. The first model is a random walk, which assumes that the animal chooses a random direction, then takes a step that is distributed according to a Gaussian random variable. In the second model, the animal again

Read More

Analytics | Learn SAS | Programming Tips

Rick Wicklin | November 2, 2022

The area and perimeter of a convex hull

The area of a convex hull enables you to estimate the area of a compact region from a set of discrete observations. For example, a biologist might have multiple sightings of a wolf pack and want to use the convex hull to estimate the area of the wolves' territory. A

Read More

Learn SAS | Programming Tips

Rick Wicklin | October 6, 2021

Logical negation of vectors

In a matrix-vector language such as SAS/IML, it is useful to always remember that the fundamental objects are matrices and that all operations are designed to work on matrices. (And vectors, which are matrices that have only one row or one column.) By using matrix operations, you can often eliminate

Read More

Analytics | Programming Tips

Rick Wicklin | October 19, 2020

Generate random points in a triangle

How can you efficiently generate N random uniform points in a triangular region of the plane? There is a very cool algorithm (which I call the reflection method) that makes the process easy. I no longer remember where I saw this algorithm, but it is different from the "weighted average"

Read More

Learn SAS | Programming Tips

Rick Wicklin | July 29, 2019

Vectorize the computation of the Mandelbrot set in a matrix language

When my colleague, Robert Allison, blogged about visualizing the Mandelbrot set, I was reminded of a story from the 1980s, which was the height of the fractal craze. A research group in computational mathematics had been awarded a multimillion-dollar grant to purchase a supercomputer. When the supercomputer arrived and got

Read More

Analytics | Programming Tips

Rick Wicklin | May 28, 2019

The Theil-Sen robust estimator for simple linear regression

Modern statistical software provides many options for computing robust statistics. For example, SAS can compute robust univariate statistics by using PROC UNIVARIATE, robust linear regression by using PROC ROBUSTREG, and robust multivariate statistics such as robust principal component analysis. Much of the research on robust regression was conducted in the

Read More

Programming Tips

Rick Wicklin | April 22, 2019

The CUSUM test for randomness of a binary sequence

Many statistical tests use a CUSUM statistic as part of the test. It can be confusing when a researcher refers to "the CUSUM test" without providing details about exactly which CUSUM test is being used. This article describes a CUSUM test for the randomness of a binary sequence. You start

Read More

Learn SAS | Programming Tips

Rick Wicklin | April 8, 2019

Use the FLOOR-MOD trick to allocate items to groups

Suppose you need to assign 100 patients equally among 3 treatment groups in a clinical study. Obviously, an equal allocation is impossible because the second number does not evenly divide the first, but you can get close by assigning 34 patients to one group and 33 to the others. Mathematically,

Read More

Learn SAS | Programming Tips

Rick Wicklin | April 2, 2018

The chi-square test: An example of working with rows and columns in SAS

As a general rule, when SAS programmers want to manipulate data row by row, they reach for the SAS DATA step. When the computation requires column statistics, the SQL procedure is also useful. When both row and column operations are required, the SAS/IML language is a powerful addition to a

Read More

Advanced Analytics | Learn SAS | Programming Tips

Rick Wicklin | June 21, 2017

Jackknife estimates in SAS

One way to assess the precision of a statistic (a point estimate) is to compute the standard error, which is the standard deviation of the statistic's sampling distribution. A relatively large standard error indicates that the point estimate should be viewed with skepticism, either because the sample size is small

Read More

Programming Tips

Rick Wicklin | April 10, 2017

A simple trick to construct symmetric intervals

Many intervals in statistics have the form p ± δ, where p is a point estimate and δ is the radius (or half-width) of the interval. (For example, many two-sided confidence intervals have this form, where δ is proportional to the standard error.) Many years ago I wrote an article

Read More

Programming Tips

Rick Wicklin | March 10, 2017

Find a pattern in a sequence of digits

I recently needed to solve a fun programming problem. I challenge other SAS programmers to solve it, too! The problem is easy to state: Given a long sequence of digits, can you write a program to count how many times a particular subsequence occurs? For example, if I give you

Read More

Learn SAS

Rick Wicklin | July 11, 2016

Break a sentence into words in SAS

Two of my favorite string-manipulation functions in the SAS DATA step are the COUNTW function and the SCAN function. The COUNTW function counts the number of words in a long string of text. Here "word" means a substring that is delimited by special characters, such as a space character, a

Read More

Rick Wicklin | June 29, 2016

Visualize the Cantor function in SAS

I was a freshman in college the first time I saw the Cantor middle-thirds set and the related Cantor "Devil's staircase" function. (Shown at left.) These constructions expanded my mind and led me to study fractals, real analysis, topology, and other mathematical areas. The Cantor function and the Cantor middle-thirds

Read More

Advanced Analytics

Rick Wicklin | February 3, 2016

Rolling statistics in SAS/IML

Last week I showed how to use PROC EXPAND to compute moving averages and other rolling statistics in SAS. Unfortunately, PROC EXPAND is part of SAS/ETS software and not every SAS site has a license for SAS/ETS. For simple moving averages, you can write a DATA step program, as discussed

Read More

Rick Wicklin | January 13, 2016

Compute the centroid of a polygon in SAS

Recently I blogged about how to compute a weighted mean and showed that you can use a weighted mean to compute the center of mass for a system of N point masses in the plane. That led me to think about a related problem: computing the center of mass (called

Read More

Rick Wicklin | October 30, 2015

The CUSUM-LAG trick in SAS/IML

Every year near Halloween I write a trick-and-treat article in which I demonstrate a simple programming trick that is a real treat to use. This year's trick features two of my favorite functions, the CUSUM function and the LAG function. By using these functions, you can compute the rows of

Read More

Rick Wicklin | October 2, 2015

Balls and urns Part 2: Multi-colored balls

In a previous post I described how to simulate random samples from an urn that contains colored balls. The previous article described the case where the balls can be either of two colors. In that case, all the distributions are univariate. In this article I examine the case where the

Read More

Advanced Analytics | Learn SAS

Rick Wicklin | April 22, 2015

Sum a series in SAS

A customer asked: How do we go about summing a finite series in SAS? For example, I want to compute for various integers n ≥ 3. I want to output two columns, one for the natural numbers and one for the summation of the series. Summations arise often in statistical

Read More

Learn SAS

Rick Wicklin | March 9, 2015

Writing data in chunks: Does the chunk size matter?

I often blog about the usefulness of vectorization in the SAS/IML language. A one-sentence summary of vectorization is "execute a small number of statements that each analyze a lot of data." In general, for matrix languages (SAS/IML, MATLAB, R, ...) vectorization is more efficient than the alternative, which is to

Read More

Learn SAS

Rick Wicklin | March 2, 2015

Avoid loops, avoid the APPLY function, vectorize!

Last week I received a message from SAS Technical Support saying that a customer's IML program was running slowly. Could I look at it to see whether it could be improved? What I discovered is a good reminder about the importance of vectorizing user-defined modules. The program in this blog

Read More

Advanced Analytics | Learn SAS

Rick Wicklin | December 15, 2014

Elementwise minimum and maximum operators

Like most programming languages, the SAS/IML language has many functions. However, the SAS/IML language also has quite a few operators. Operators can act on a matrix or on rows or columns of a matrix. They are less intuitive, but can be quite powerful because they enable you to perform computations without

Read More

Rick Wicklin | November 21, 2014

Resampling and permutation tests in SAS

My colleagues at the SAS & R blog recently posted an example of how to program a permutation test in SAS and R. Their SAS implementation used Base SAS and was "relatively cumbersome" (their words) when compared with the R code. In today's post I implement the permutation test in

Read More

Rick Wicklin | October 17, 2014

Wolfram's Rule 30 in SAS

My previous blog post describes how to implement Conway's Game of Life by using the dynamically linked graphics in SAS/IML Studio. But the Game of Life is not the only kind of cellular automata. This article describes a system of cellular automata that is known as Wolfram's Rule 30. In

Read More

Rick Wicklin | July 2, 2014

Pairwise comparisons of a data vector

A SAS customer showed me a SAS/IML program that he had obtained from a book. The program was taking a long time to run on his data, which was somewhat large. He was wondering if I could identify any inefficiencies in the program. The first thing I did was to

Read More

Rick Wicklin | June 11, 2014

How to find an initial guess for an optimization

Nonlinear optimization routines enable you to find the values of variables that optimize an objective function of those variables. When you use a numerical optimization routine, you need to provide an initial guess, often called a "starting point" for the algorithm. Optimization routines iteratively improve the initial guess in an

Read More

Rick Wicklin | May 29, 2014

Permute elements within each row of a matrix

Bootstrap methods and permutation tests are popular and powerful nonparametric methods for testing hypotheses and approximating the sampling distribution of a statistic. I have described a SAS/IML implementation of a bootstrap permutation test for matched pairs of data (an alternative to a matched-pair t test) in my paper "Modern Data

Read More

Rick Wicklin | April 23, 2014

The inverse of the Hilbert matrix

Just one last short article about properties of the Hilbert matrix. I've already blogged about how to construct a Hilbert matrix in the SAS/IML language and how to compute a formula for the determinant. One reason that the Hilbert matrix is a famous (some would say infamous!) example in numerical

Read More

Learn SAS

Rick Wicklin | April 9, 2014

The Hilbert matrix: A vectorized construction

The Hilbert matrix is the most famous ill-conditioned matrix in numerical linear algebra. It is often used in matrix computations to illustrate problems that arise when you compute with ill-conditioned matrices. The Hilbert matrix is symmetric and positive definite, properties that are often associated with "nice" and "tame" matrices. The

Read More

Learn SAS

Rick Wicklin | January 13, 2014

How to vectorize time series computations

Vector languages such as SAS/IML, MATLAB, and R are powerful because they enable you to use high-level matrix operations (matrix multiplication, dot products, etc.) rather than loops that perform scalar operations. In general, vectorized programs are more efficient (and therefore run faster) than programs that contain loops. For an example

Read More