Move beyond spreadsheets to data mining, forecasting, optimization – and more

Sampling with replacement

Sampling with replacement is a useful technique for simulations and for resampling from data. Over at the SAS/IML Discussion Forum, there was a recent question about how to use SAS/IML software to sample with replacement from a set of events. I have previously blogged about efficient sampling, but this topic

SAS/IML software featured at WUSS

Today I'm in San Diego at the 2010 meeting of the Western Users of SAS Software (WUSS). I am giving several presentations on SAS/IML and SAS/IML Studio: A tutorial workshop on SAS/IML Studio for the SAS/STAT User. The material in this tutorial is a small sampling of Chapters 4–11 of

Tips and techniques - What’s the difference?

In this blog and in the book Statistical Programming with SAS/IML Software, I present tips and techniques for writing efficient SAS/IML programs for data analysis, simulation, matrix computations, and other topics of interest to statistical programmers. When I was writing my book, one of the reviewers commented that he wasn’t

Mistakes in the Forecasting Hierarchy

Many forecasting software packages support hierarchical forecasting. You define the hierarchical relationship of your products and locations, create forecasts at one or more levels, and then reconcile the forecasts across the full hierarchy. In a top-down approach, you generate forecasts at the highest level and apportion them down to lower

Tricks and Treats

How can you change a programming trick into a programming treat? Try this algorithm: If you develop a clever snippet of code, squirrel it away. This snippet is a "trick." If you use the trick a second time, copy and modify the code. The trick has become a "treat." If

Evaluate an iterated integral

The SAS/IML language provides the QUAD function for evaluating one-dimensional integrals. You can also use the QUAD function to compute a double integral as an iterated integral. A One-Dimensional Integration Suppose you want to evaluate the following integral: To evaluate this integral in the SAS/IML language: Define a function module

Creating a tridiagonal matrix

I was recently asked how to create a tridiagonal matrix in SAS/IML software. For example, how can you easily specify the following symmetric tridiagonal matrix without typing all of the zeros? proc iml; m = {1 6 0 0 0, 6 2 7 0 0, 0 7 3 8 0,

Looping versus LOC-ing revisited

In a previous post, I discussed how to use the LOC function to eliminate loops over observations. Dale McLerran chimed in to remind me that another way to improve efficiency is to use subscript reduction operators. I ended my previous post by issuing a challenge: can you write an efficient

Today is World Statistics Day, an event set up to "highlight the role of official statistics and the many achievements of the national statistical system." I want to commemorate World Statistics Day by celebrating the role of the US government in data collection and dissemination. Data analysis begins with data.

Celebrating World Statistics Day

Perhaps the toughest time in anyone's life is when you have to put away a loved one because they've been possessed by the devil. Other than that, though, I've had a good week*. And my week turns even better today, as we all join hands to celebrate World Statistics Day.

What is IMLPlus?

The IMLPlus language has been available to SAS customers since 2002, but there are still many people who have never heard of it. What is IMLPlus? The documentation SAS/IML Studio for SAS/STAT Users says this about IMLPlus: The programming language in SAS/IML Studio, which is called IMLPlus, is an enhanced

Solving scrambled-word puzzles

Have you ever been stuck while trying to solve a scrambled-word puzzle? You stare and stare at the letters, but no word reveals itself? You are stumped. Stymied. I hope you didn't get stumped on the word puzzle I posted as an anniversary present for my wife. She breezed through

A few people asked me to explain the significance of the cartoon in the scrambled-word puzzle that I posted as an anniversary present for my wife. The cartoon refers to a famous experiment devised by Sir Ronald A. Fisher.

Generate all permutations in SAS

In a previous post, I discussed how to generate random permutations of N elements. But what if you want to systematically iterate through a list of ALL permutations of N elements? For this, I like to use the ALLPERM subroutine in the SAS DATA step. [Editor's Note 13AUG2011: In SAS

A statistical word puzzle!

Today's post is a puzzle. Why? Well, my wife loves solving word puzzles, and today is our wedding anniversary. Last year, I bought her a Jumble® book. This year, I've created a one-of-a-kind scrambled word puzzle just for her. (But you can play, too!) I created this puzzle by using

How can you reshape a matrix?

Sometimes it is convenient to reshape the data in a matrix. Suppose you have a 1 x 12 matrix. This same data can fit into several matrices with different dimensions: a 2 x 6 matrix, a 3 x 4 matrix, a 4 x 3 matrix, and so on. The SHAPE function enables you to specify the number of

A little off the topic, but can anyone explain the theory of password security to me? Specifically, how does requiring me to periodically change my password improve security? Like most of you, on some of my online accounts I am reminded every few months that I must change the password.

Scrambling (and unscrambling) words

My previous post on creating a random permutation started me thinking about word games. My wife loves to solve the daily Jumble® puzzle that runs in our local paper. The puzzle displays a string of letters like MLYBOS, and you attempt to unscramble the letters to make an ordinary word.

Generating random permutations

I recently read a paper that described a SAS macro to carry out a permutation test. The permutations were generated by PROC IML. (In fact, an internet search for the terms "SAS/IML" and "permutation test" gives dozens of papers in recent years.) The PROC IML code was not as efficient

Matrices, eigenvalues, Fibonacci, and the golden ratio

A previous post described a simple algorithm for generating Fibonacci numbers. It was noted that the ratio between adjacent terms in the Fibonacci sequence approaches the "Golden Ratio," 1.61803399.... This post explains why. In a discussion with my fellow blogger, David Smith, I made the comment "any two numbers (at

Often, the first step of a SAS/IML program is to use the USE, READ, and CLOSE statements to read data from a SAS data set into a vector or matrix. There are several ways to read data: Read variables into vectors of the same name. Read one or more variables

Using data to define hurricane season

In a previous blog post about hurricanes, I created a histogram of the occurrence of tropical cyclones in the Atlantic basin during the years 1988–2003. That histogram shows that the peak of hurricane activity occurs in the second week of September, but also that a majority of tropical storms occur

Timing Performance: Looping versus LOC-ing

The SAS/IML language is a vector language, so statements that operate on a few long vectors run much faster than equivalent statements that involve many scalar quantities. For example, in a previous post, I asserted that using the LOC function is much faster than writing a loop for finding observations that

Comparing cell phone use by age

The Junk Chart blog discusses a potential problem that can arise in grouped bar charts when the two groups have vastly different ranges. One possible solution (which is discussed at the Junk Chart sister blog, Numbers Rule Your World) is to present the data back-to-back in what is sometimes called

Extending IML - Defining a Function Module

The SAS/IML run-time library contains hundreds of functions and subroutines that you can call to perform statistical analysis. There are also many functions in Base SAS software that you can call from SAS/IML programs. However, one day you might need to compute some quantity for which there is no prewritten

When is the peak of hurricane season?

Visualizing the distribution of data is a primary task of data analysis. With all the hurricane activity in the Atlantic this year, I’ve been thinking about ways to visualize the historical distribution of hurricane activity. USA Today on Friday, August 13, 2010, announced that "the heart of hurricane season is