Move beyond spreadsheets to data mining, forecasting, optimization – and more

March Madness and Predictive Modeling

In my region of North Carolina (Raleigh, Durham, and Chapel Hill) one of the most anticipated times of the year has arrived: the NCAA basketball tournament. This is a great time of year for me, because I get to combine several of my passions. For those who don’t live among crazed college

Define functions with optional parameters in SAS/IML

Last month I blogged about defining SAS/IML functions that have default parameter values. This language feature, which was introduced in SAS/IML 12.1, enables you to skip arguments when you call a user-defined function. The same technique enables you to define optional parameters. Inside the function, you can determine whether the

Finding elements in one vector that are not in another vector

The SAS/IML language has several functions for finding the unions, intersections, and differences between sets. In fact, two of my favorite utility functions are the UNIQUE function, which returns the unique elements in a matrix, and the SETDIF function, which returns the elements that are in one vector and not
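
As a rough illustration of the idea (in Python rather than SAS/IML, so this is an analogue and not the SAS/IML code itself), the sketch below mimics what the excerpt describes: one helper returns the unique elements of a list, and another returns the elements that are in one list and not in another. The helper names `unique` and `setdif` are borrowed from the SAS/IML function names for readability only; the sample data are made up.

```python
def unique(values):
    """Return the distinct elements of `values`, sorted for readability."""
    return sorted(set(values))


def setdif(a, b):
    """Return the distinct elements of `a` that do not appear in `b`, sorted."""
    return sorted(set(a) - set(b))


# Hypothetical sample vectors
a = [5, 3, 1, 3, 9, 7]
b = [3, 7, 8]

print(unique(a))     # [1, 3, 5, 7, 9]
print(setdif(a, b))  # [1, 5, 9]
```

Note that the difference is not symmetric: `setdif(b, a)` would return `[8]`.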

For pi day: A continued fraction expansion of pi

Many geeky mathematical people celebrate "pi day" on March 14, because the date is written 3/14 in the US, which is evocative of the decimal representation of π = 3.14.... Most people are familiar with the decimal representation of π. The media occasionally reports on a new computational tour-de-force that
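
For readers who want to see what a continued fraction expansion looks like in practice, here is a minimal Python sketch (not the SAS code from the article). It repeatedly takes the integer part and inverts the remainder. It assumes the double-precision value `math.pi` is accurate enough for the first few terms, which is why only five terms are requested; the function name `continued_fraction_terms` is a hypothetical helper.

```python
import math
from fractions import Fraction


def continued_fraction_terms(x, n_terms):
    """Return up to n_terms continued fraction terms of x.

    x is converted to an exact Fraction so that the repeated
    "take the integer part, invert the remainder" steps do not
    accumulate floating-point rounding error.
    """
    x = Fraction(x)
    terms = []
    for _ in range(n_terms):
        a = math.floor(x)
        terms.append(a)
        remainder = x - a
        if remainder == 0:
            break  # x is rational and the expansion has ended
        x = 1 / remainder
    return terms


terms = continued_fraction_terms(math.pi, 5)
print(terms)  # [3, 7, 15, 1, 292]
```

Truncating after the first two terms gives 3 + 1/7 = 22/7, the familiar rational approximation of π.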

Techniques for scoring a regression model in SAS

My previous post described how to use the "missing response trick" to score a regression model. As I said in that article, there are other ways to score a regression model. This article describes using the SCORE procedure, a SCORE statement, the relatively new PLM procedure, and the CODE statement.

The missing value trick for scoring a regression model

A fundamental operation in statistical data analysis is to fit a statistical regression model on one set of data and then evaluate the model on another set of data. The act of evaluating the model on the second set of data is called scoring. One of the first "tricks" that I
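
To make the fit-then-score idea concrete, here is a small Python sketch. It is not the SAS technique the article goes on to describe; it only shows the two steps in the definition above: fit a simple least-squares line on one data set, then evaluate it on new values. The function names `fit_line` and `score` and all data values are made up for illustration.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def score(model, x_new):
    """Evaluate a fitted (intercept, slope) model on new x values."""
    intercept, slope = model
    return [intercept + slope * xi for xi in x_new]


# Hypothetical training data that lie exactly on y = 1 + 2x
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)
predictions = score(model, [5.0, 6.0])  # the "scoring" data set
print(model)        # (1.0, 2.0)
print(predictions)  # [11.0, 13.0]
```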

Does the architecture matter as much as the analytics?

I was recently part of a team discussing enterprise architecture with a chief IT architect, and we were explaining how SAS can integrate into their existing infrastructure, add business value on top of it, and even fit into their future planned infrastructure. This conversation was one of the reasons I blogged about

13 popular articles from 2013

In 2013 I published 110 blog posts. Some of these articles were more popular than others, often because they were linked to from a SAS newsletter such as the SAS Statistics and Operations Research News. In no particular order, here are some of my most popular posts from 2013, organized

How WAVELETS can help separate the signal from the noise

Wavelet analysis is an exciting and relatively new field of study that enables one to extract underlying patterns from either spatially varying or temporally varying data. Pixel values representing the relative brightness and color that constitute an image are an example of spatially varying data, and daily variations of financial

Generate combinations in SAS

Last week I described how to generate permutations in SAS. A related concept is the "combination." In probability and statistics, a combination is a subset of k items chosen from a set that contains N items. Order does not matter, so although the ordered triplets (B, A, C) and (C,
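
The definition above can be sketched in a few lines of Python (an analogue only, not the SAS code from the article). The example uses the standard library's `itertools.combinations` to list every subset of k = 3 items from a made-up set of N = 4 items, and `math.comb` to confirm the count.

```python
from itertools import combinations
from math import comb

# Hypothetical set of N = 4 items; choose k = 3 at a time
items = ["A", "B", "C", "D"]
k = 3

combos = list(combinations(items, k))
for c in combos:
    print(c)
# ('A', 'B', 'C')
# ('A', 'B', 'D')
# ('A', 'C', 'D')
# ('B', 'C', 'D')

# The number of combinations is "N choose k"
print(len(combos), comb(len(items), k))  # 4 4
```

Because order does not matter, each subset appears exactly once, in the order the items were supplied.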

Compute contours of the bivariate normal CDF

This is the last post in my recent series of articles on computing contours in SAS. Last month a SAS customer asked how to compute the contours of the bivariate normal cumulative distribution function (CDF). Answering that question in a single blog post would have resulted in a long article,

Generate permutations in SAS

I've written several articles that show how to generate permutations in SAS. In the SAS DATA step, you can use the ALLPERM subroutine to generate all permutations of a DATA step array that contains a small number (18 or fewer) of elements. In addition, the PLAN procedure enables you to generate
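
As a point of comparison outside SAS, the following Python sketch enumerates all permutations of a small, made-up list with the standard library's `itertools.permutations`. It is an illustration of the concept only and does not reproduce the behavior or output order of the SAS routines mentioned above.

```python
from itertools import permutations
from math import factorial

# Hypothetical three-element array
items = ["A", "B", "C"]

perms = list(permutations(items))
for p in perms:
    print(p)
# ('A', 'B', 'C')
# ('A', 'C', 'B')
# ('B', 'A', 'C')
# ('B', 'C', 'A')
# ('C', 'A', 'B')
# ('C', 'B', 'A')

# n distinct elements have n! permutations
print(len(perms), factorial(len(items)))  # 6 6
```

The count grows as n!, which is why enumerating every permutation is practical only for small arrays.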