Blogs

Tag: Efficiency

Customer Intelligence | Work & Life at SAS

Juan Montero | November 16, 2022

How transformation is shifting the COO’s role towards innovation

The role of Chief Operating Officer (COO) is interesting. It’s a relatively new role—and there is also relatively little consensus about the job description. In some organisations, the COO is the de facto deputy CEO. They have their fingers in most pies, from operational to strategic initiatives. In others, the

Read More

Analytics | Artificial Intelligence

Caslee Sims | November 9, 2022

3 principles that emphasize productivity for your analytics platform

More digital channels are bringing greater connectivity and more data is bringing added complexity to organizations. All this can feel chaotic or like a fog of information warfare. As a result, the pace of disruption and data expansion requires visual tools that accelerate data wrangling and modeling. To overcome complexity,

Read More

Analytics | Learn SAS | Programming Tips

Rick Wicklin | September 13, 2021

The partition problem

The partition problem has many variations, but recently I encountered it as an interactive puzzle on a computer. (Try a similar game yourself!) The player is presented with an old-fashioned pan-balance scale and a set of objects of different weights. The challenge is to divide (or partition) the objects into

Read More

Learn SAS | Programming Tips

Rick Wicklin | August 14, 2019

Short-circuit evaluation and logical ligatures in SAS

Many programmers are familiar with "short-circuit" evaluation in an IF-THEN statement. Short circuit means that a program does not evaluate the remainder of a logical expression if the value of the expression is already logically determined. The SAS DATA step supports short-circuiting for simple logical expressions in IF-THEN statements and

Read More

Learn SAS | Programming Tips

Rick Wicklin | May 13, 2019

Write to a SAS data set from inside a SAS/IML loop

In SAS/IML programs, a common task is to write values in a matrix to a SAS data set. For some programs, the values you want to write are in a matrix and you use the CREATE FROM/APPEND FROM syntax to create the data set, as follows: proc iml; X =

Read More

Programming Tips

Rick Wicklin | October 8, 2018

The intersection of multiple sets

This article compares several ways to find the elements that are common to multiple sets. I test which method is the fastest in the SAS/IML language. However, all algorithms are intrinsically fast, which raises an important question: when is it worth the time and effort to optimize an algorithm? The

Read More

Programming Tips

Rick Wicklin | September 26, 2018

Radial basis functions and Gaussian kernels in SAS

A radial basis function is a scalar function that depends on the distance to some point, called the center point, c. One popular radial basis function is the Gaussian kernel φ(x; c) = exp(-||x – c||² / (2σ²)), which uses the squared distance from a vector x to the

Read More
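
The Gaussian kernel in the teaser above can be sketched in a few lines. This is a Python illustration rather than the SAS/IML code from the article; the function name `gaussian_kernel` and the default `sigma=1.0` are assumptions made for the example.

```python
import math

def gaussian_kernel(x, c, sigma=1.0):
    """Gaussian radial basis function: exp(-||x - c||^2 / (2 sigma^2)).

    x and c are equal-length sequences of numbers; sigma is the
    bandwidth (hypothetical default of 1.0).
    """
    # Squared Euclidean distance from x to the center point c
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# The kernel is 1 at the center and decays with distance from it.
at_center = gaussian_kernel([0.0, 0.0], [0.0, 0.0])
one_unit_away = gaussian_kernel([1.0, 0.0], [0.0, 0.0])
```

Because the value depends only on the distance to c, any two points at the same distance from the center get the same kernel value.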

Programming Tips

Rick Wicklin | August 21, 2017

6 tips for timing the performance of algorithms

When you implement a statistical algorithm in a vector-matrix language such as SAS/IML, R, or MATLAB, you should measure the performance of your implementation, which means that you should time how long a program takes to analyze data of varying sizes and characteristics. There are some general tips that can

Read More

Learn SAS

Rick Wicklin | July 13, 2015

Compare the performance of algorithms in SAS

As my colleague Margaret Crevar recently wrote, it is useful to know how long SAS programs take to run. Margaret and others have written about how to use the SAS FULLSTIMER option to monitor the performance of the SAS system. In fact, SAS distributes a macro that enables you to

Read More

Rick Wicklin | March 18, 2015

Finding observations that match a target value

Imagine that you have one million rows of numerical data and you want to determine if a particular "target" value occurs. How might you find where the value occurs? For univariate data, this is an easy problem. In the SAS DATA step you can use a WHERE clause or a

Read More

Rick Wicklin | March 4, 2015

An easy way to approximate a cumulative distribution function

Evaluating a cumulative distribution function (CDF) can be an expensive operation. Each time you evaluate the CDF for a continuous probability distribution, the software has to perform a numerical integration. (Recall that the CDF at a point x is the integral under the probability density function (PDF) where x is

Read More

Rick Wicklin | February 16, 2015

Friends don't let friends concatenate results inside a loop

Friends have to look out for each other. Sometimes this can be slightly embarrassing. At lunch you might need to tell a friend that he has some tomato sauce on his chin. Or that she has a little spinach stuck between her teeth. Or you might need to tell your

Read More

Learn SAS

Rick Wicklin | January 20, 2015

Finding matrix elements that satisfy a logical expression

A common task in SAS/IML programming is finding elements of a SAS/IML matrix that satisfy a logical expression. For example, you might need to know which matrix elements are missing, are negative, or are divisible by 2. In the DATA step, you can use the WHERE clause to subset data.

Read More

Rick Wicklin | July 2, 2014

Pairwise comparisons of a data vector

A SAS customer showed me a SAS/IML program that he had obtained from a book. The program was taking a long time to run on his data, which was somewhat large. He was wondering if I could identify any inefficiencies in the program. The first thing I did was to

Read More

Rick Wicklin | June 27, 2014

Simulate many samples from a logistic regression model

My last blog post showed how to simulate data for a logistic regression model with two continuous variables. To keep the discussion simple, I simulated a single sample with N observations. However, to obtain the sampling distribution of statistics, you need to generate many samples from the same logistic model.

Read More

Rick Wicklin | June 25, 2014

Simulating data for a logistic regression model

In my book Simulating Data with SAS, I show how to use the SAS DATA step to simulate data from a logistic regression model. Recently there have been discussions on the SAS/IML Support Community about simulating logistic data by using the SAS/IML language. This article describes how to efficiently simulate

Read More

Rick Wicklin | May 21, 2014

Using associativity can lead to big performance improvements in matrix multiplication

In a previous post, I stated that you should avoid matrix multiplication that involves a huge diagonal matrix because that operation can be carried out more efficiently. Here's another tip that sometimes improves the efficiency of matrix multiplication: use parentheses to prevent the creation of large matrices. Matrix multiplication is

Read More
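
To illustrate why the placement of parentheses matters, here is a rough Python sketch (not the article's SAS/IML code) that counts scalar multiplications for the product A*B*v, where A and B are n x n and v is a column vector. It assumes the standard dense row-by-column algorithm; the helper names `matmul_cost`, `cost_left_first`, and `cost_right_first` are hypothetical.

```python
def matmul_cost(shape_a, shape_b):
    """Return (scalar multiplications, result shape) for a dense product A*B."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:
        raise ValueError("inner dimensions do not match")
    return rows_a * cols_a * cols_b, (rows_a, cols_b)

def cost_left_first(a, b, v):
    """Cost of (A*B)*v, which forms the full A*B matrix first."""
    cost_ab, shape_ab = matmul_cost(a, b)
    cost_abv, _ = matmul_cost(shape_ab, v)
    return cost_ab + cost_abv

def cost_right_first(a, b, v):
    """Cost of A*(B*v), which only ever forms column vectors."""
    cost_bv, shape_bv = matmul_cost(b, v)
    cost_abv, _ = matmul_cost(a, shape_bv)
    return cost_bv + cost_abv

n = 1000
shape_A = shape_B = (n, n)   # two square matrices
shape_v = (n, 1)             # a column vector

left = cost_left_first(shape_A, shape_B, shape_v)    # n^3 + n^2 multiplications
right = cost_right_first(shape_A, shape_B, shape_v)  # 2 * n^2 multiplications
print(left, right)
```

Both orderings give the same mathematical result because matrix multiplication is associative, but under these assumptions the second ordering avoids building the intermediate n x n matrix.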

Rick Wicklin | May 19, 2014

Never multiply with a large diagonal matrix

I love working with SAS Technical Support because I get to see real problems that SAS customers face as they use SAS/IML software. The other day I advised a customer how to improve the efficiency of a computation that involved multiplying large matrices. In this article I describe an important

Read More

Learn SAS

Rick Wicklin | October 21, 2013

Assign the diagonal elements of a matrix

SAS/IML programmers know that the VECDIAG function can be used to extract the diagonal elements of a matrix. For example, the following statements extract the diagonal of a 3 x 3 matrix: proc iml; m = {1 2 3, 4 5 6, 7 8 9}; v = vecdiag(m); /* v = {1,5,9}

Read More

Advanced Analytics

Rick Wicklin | June 24, 2013

Count the number of unique rows in a matrix

How do you count the number of unique rows in a matrix? The simplest algorithm is to sort the data and then iterate down the rows, comparing each row with the previous row. However, this algorithm has two shortcomings: it physically sorts the data (which means that the original locations

Read More
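
The "sort, then compare each row with the previous row" idea from the teaser above might look like the following in Python. This is only a sketch of that simple algorithm, not the method the article goes on to develop; the name `count_unique_rows` is hypothetical, and unlike an in-place sort this version works on a sorted copy so the input is left untouched.

```python
def count_unique_rows(rows):
    """Count distinct rows by sorting a copy and comparing neighbors."""
    # Convert rows to tuples so they compare element by element when sorted
    ordered = sorted(tuple(row) for row in rows)
    if not ordered:
        return 0
    count = 1  # the first row is always new
    for previous, current in zip(ordered, ordered[1:]):
        if current != previous:
            count += 1  # a change between neighbors marks a new unique row
    return count

matrix = [
    [1, 2],
    [3, 4],
    [1, 2],
    [5, 6],
    [3, 4],
]
n_unique = count_unique_rows(matrix)
```

After sorting, identical rows sit next to each other, so a single pass over adjacent pairs is enough to count the distinct ones.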

Advanced Analytics

Rick Wicklin | May 30, 2013

Using simulation to estimate the power of a statistical test

The power of a statistical test measures the test's ability to detect a specific alternative hypothesis. For example, educational researchers might want to compare the mean scores of boys and girls on a standardized test. They plan to use the well-known two-sample t test. The null hypothesis is that the

Read More

Rick Wicklin | May 15, 2013

How to vectorize computations in a matrix language

Last week someone posted an interesting question to the SAS/IML Support Community. The problem involved four nested DO loops and took hours to run. By transforming several nested DO loops into an equivalent matrix operation, I was able to reduce the run time to about one second. The process of

Read More

Learn SAS

Rick Wicklin | January 30, 2013

Oh, those pesky temporary variables!

The SAS/IML language secretly creates temporary variables. Most of the time programmers aren't even aware that the language does this. However, there is one situation where if you don't think carefully about temporary variables, your program will silently produce an error. And as every programmer knows, silent wrong numbers are

Read More

Advanced Analytics

Rick Wicklin | January 16, 2013

Generate binary outcomes with varying probability

A while ago I saw a blog post on how to simulate Bernoulli outcomes when the probability of generating a 1 (success) varies from observation to observation. I've done this often in SAS, both in the DATA step and in the SAS/IML language. For example, when simulating data that satisfied

Read More

Rick Wicklin | December 5, 2012

Remove or keep: Which is faster?

In a recent article on efficient simulation from a truncated distribution, I wrote some SAS/IML code that used the LOC function to find and exclude observations that satisfy some criterion. Some readers came up with an alternative algorithm that uses the REMOVE function instead of subscripts. I remarked in a

Read More

Rick Wicklin | November 21, 2012

Efficient acceptance-rejection simulation: Part II

Last week I wrote about using acceptance-rejection algorithms in vector languages to simulate data. The main point I made is that in a vector language it is efficient to generate many more variates than are needed, with the knowledge that a certain proportion will be rejected. In last week's article,

Read More

Rick Wicklin | November 14, 2012

Efficient acceptance-rejection simulation

A few days ago on the SAS/IML Support Community, there was an interesting discussion about how to simulate data from a truncated Poisson distribution. The SAS/IML user wanted to generate values from a Poisson distribution, but discard any zeros that are generated. This kind of simulation is known as an

Read More

Advanced Analytics

Rick Wicklin | November 7, 2012

Constructing block matrices with applications to mixed models

The other day I was constructing covariance matrices for simulating data for a mixed model with repeated measurements. I was using the SAS/IML BLOCK function to build up the "R-side" covariance matrix from smaller blocks. The matrix I was constructing was block-diagonal and looked like this: The matrix represents a

Read More

Rick Wicklin | July 25, 2012

Using macro loops for simulation

Last week I wrote an article in which I pointed out that many SAS programmers write a simulation in SAS by writing a macro loop. This approach is extremely inefficient, so I presented a more efficient technique. Not only is the macro loop approach slow, but there are other undesirable

Read More

Rick Wicklin | July 18, 2012

Simulation in SAS: The slow way or the BY way

Over the past few years, and especially since I posted my article on eight tips to make your simulation run faster, I have received many emails (often with attached SAS programs) from SAS users who ask for advice about how to speed up their simulation code. For this reason, I

Read More
