## Tag: vectorization

0
Logical negation of vectors

In a matrix-vector language such as SAS/IML, it is useful to always remember that the fundamental objects are matrices and that all operations are designed to work on matrices. (And vectors, which are matrices that have only one row or one column.) By using matrix operations, you can often eliminate

0
Generate random points in a triangle

How can you efficiently generate N random uniform points in a triangular region of the plane? There is a very cool algorithm (which I call the reflection method) that makes the process easy. I no longer remember where I saw this algorithm, but it is different from the "weighted average"

0
Vectorize the computation of the Mandelbrot set in a matrix language

When my colleague, Robert Allison, blogged about visualizing the Mandelbrot set, I was reminded of a story from the 1980s, which was the height of the fractal craze. A research group in computational mathematics had been awarded a multimillion-dollar grant to purchase a supercomputer. When the supercomputer arrived and got

0
The Theil-Sen robust estimator for simple linear regression

Modern statistical software provides many options for computing robust statistics. For example, SAS can compute robust univariate statistics by using PROC UNIVARIATE, robust linear regression by using PROC ROBUSTREG, and robust multivariate statistics such as robust principal component analysis. Much of the research on robust regression was conducted in the

Programming Tips
0
The CUSUM test for randomness of a binary sequence

Many statistical tests use a CUSUM statistic as part of the test. It can be confusing when a researcher refers to "the CUSUM test" without providing details about exactly which CUSUM test is being used. This article describes a CUSUM test for the randomness of a binary sequence. You start

0
Use the FLOOR-MOD trick to allocate items to groups

Suppose you need to assign 100 patients equally among 3 treatment groups in a clinical study. Obviously, an equal allocation is impossible because the second number does not evenly divide the first, but you can get close by assigning 34 patients to one group and 33 to the others. Mathematically,

0
The chi-square test: An example of working with rows and columns in SAS

As a general rule, when SAS programmers want to manipulate data row by row, they reach for the SAS DATA step. When the computation requires column statistics, the SQL procedure is also useful. When both row and column operations are required, the SAS/IML language is a powerful addition to a

0
Jackknife estimates in SAS

One way to assess the precision of a statistic (a point estimate) is to compute the standard error, which is the standard deviation of the statistic's sampling distribution. A relatively large standard error indicates that the point estimate should be viewed with skepticism, either because the sample size is small

Programming Tips
0
A simple trick to construct symmetric intervals

Many intervals in statistics have the form p ± δ, where p is a point estimate and δ is the radius (or half-width) of the interval. (For example, many two-sided confidence intervals have this form, where δ is proportional to the standard error.) Many years ago I wrote an article

Programming Tips
0
Find a pattern in a sequence of digits

I recently needed to solve a fun programming problem. I challenge other SAS programmers to solve it, too! The problem is easy to state: Given a long sequence of digits, can you write a program to count how many times a particular subsequence occurs? For example, if I give you

Learn SAS
0
Break a sentence into words in SAS

Two of my favorite string-manipulation functions in the SAS DATA step are the COUNTW function and the SCAN function. The COUNTW function counts the number of words in a long string of text. Here "word" means a substring that is delimited by special characters, such as a space character, a

0
Visualize the Cantor function in SAS

I was a freshman in college the first time I saw the Cantor middle-thirds set and the related Cantor "Devil's staircase" function. (Shown at left.) These constructions expanded my mind and led me to study fractals, real analysis, topology, and other mathematical areas. The Cantor function and the Cantor middle-thirds

0
Rolling statistics in SAS/IML

Last week I showed how to use PROC EXPAND to compute moving averages and other rolling statistics in SAS. Unfortunately, PROC EXPAND is part of SAS/ETS software and not every SAS site has a license for SAS/ETS. For simple moving averages, you can write a DATA step program, as discussed

0
Compute the centroid of a polygon in SAS

Recently I blogged about how to compute a weighted mean and showed that you can use a weighted mean to compute the center of mass for a system of N point masses in the plane. That led me to think about a related problem: computing the center of mass (called

0
The CUSUM-LAG trick in SAS/IML

Every year near Halloween I write a trick-and-treat article in which I demonstrate a simple programming trick that is a real treat to use. This year's trick features two of my favorite functions, the CUSUM function and the LAG function. By using these function, you can compute the rows of

0
Balls and urns Part 2: Multi-colored balls

In a previous post I described how to simulate random samples from an urn that contains colored balls. The previous article described the case where the balls can be either of two colors. In that csae, all the distributions are univariate. In this article I examine the case where the

0
Sum a series in SAS

A customer asked: How do we go about summing a finite series in SAS? For example, I want to compute for various integers n ≥ 3. I want to output two columns, one for the natural numbers and one for the summation of the series. Summations arise often in statistical

Learn SAS
0
Writing data in chunks: Does the chunk size matter?

I often blog about the usefulness of vectorization in the SAS/IML language. A one-sentence summary of vectorization is "execute a small number of statements that each analyze a lot of data." In general, for matrix languages (SAS/IML, MATLAB, R, ...) vectorization is more efficient than the alternative, which is to

Learn SAS
0
Avoid loops, avoid the APPLY function, vectorize!

Last week I received a message from SAS Technical Support saying that a customer's IML program was running slowly. Could I look at it to see whether it could be improved? What I discovered is a good reminder about the importance of vectorizing user-defined modules. The program in this blog

0
Elementwise minimum and maximum operators

Like most programming languages, the SAS/IML language has many functions. However, the SAS/IML language also has quite a few operators. Operators can act on a matrix or on rows or columns of a matrix. They are less intuitive, but can be quite powerful because they enable you perform computations without

0
Resampling and permutation tests in SAS

My colleagues at the SAS & R blog recently posted an example of how to program a permutation test in SAS and R. Their SAS implementation used Base SAS and was "relatively cumbersome" (their words) when compared with the R code. In today's post I implement the permutation test in

0
Wolfram's Rule 30 in SAS

My previous blog post describes how to implement Conway's Game of Life by using the dynamically linked graphics in SAS/IML Studio. But the Game of Life is not the only kind of cellular automata. This article describes a system of cellular automata that is known as Wolfram's Rule 30. In

0
Pairwise comparisons of a data vector

A SAS customer showed me a SAS/IML program that he had obtained from a book. The program was taking a long time to run on his data, which was somewhat large. He was wondering if I could identify any inefficiencies in the program. The first thing I did was to

0
How to find an initial guess for an optimization

Nonlinear optimization routines enable you to find the values of variables that optimize an objective function of those variables. When you use a numerical optimization routine, you need to provide an initial guess, often called a "starting point" for the algorithm. Optimization routines iteratively improve the initial guess in an

0
Permute elements within each row of a matrix

Bootstrap methods and permutation tests are popular and powerful nonparametric methods for testing hypotheses and approximating the sampling distribution of a statistic. I have described a SAS/IML implementation of a bootstrap permutation test for matched pairs of data (an alternative to a matched-pair t test) in my paper "Modern Data

0
The inverse of the Hilbert matrix

Just one last short article about properties of the Hilbert matrix. I've already blogged about how to construct a Hilbert matrix in the SAS/IML language and how to compute a formula for the determinant. One reason that the Hilbert matrix is a famous (some would say infamous!) example in numerical

Learn SAS
0
The Hilbert matrix: A vectorized construction

The Hilbert matrix is the most famous ill-conditioned matrix in numerical linear algebra. It is often used in matrix computations to illustrate problems that arise when you compute with ill-conditioned matrices. The Hilbert matrix is symmetric and positive definite, properties that are often associated with "nice" and "tame" matrices. The

Learn SAS
0
How to vectorize time series computations

Vector languages such as SAS/IML, MATLAB, and R are powerful because they enable you to use high-level matrix operations (matrix multiplication, dot products, etc) rather than loops that perform scalar operations. In general, vectorized programs are more efficient (and therefore run faster) than programs that contain loops. For an example

Learn SAS
0
Vectorizing the construction of a structured matrix

In using a vector-matrix language such as SAS/IML, MATLAB, or R, one of the challenges for programmers is learning how to vectorize computations. Often it is not intuitive how to program a computation so that you avoid looping over the rows and columns of a matrix. However, there are a

Learn SAS
0
Assign the diagonal elements of a matrix

SAS/IML programmers know that the VECDIAG matrix can be used to extract the diagonal elements of a matrix. For example, the following statements extract the diagonal of a 3 x 3 matrix: proc iml; m = {1 2 3, 4 5 6, 7 8 9}; v = vecdiag(m); /* v = {1,5,9}