The other day, someone asked me how to compute a matrix of pairwise differences for a vector of values. The person asking the question was using SQL to do the computation for 2,000 data points, and it was taking many hours to compute the pairwise differences. He asked if SAS/IML
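As a rough illustration of the computation described in this excerpt, here is a minimal Python sketch (not the SAS/IML approach from the post itself). The function name `pairwise_differences` and the sample vector are hypothetical; the sketch simply builds the n x n matrix whose (i, j) entry is the i-th value minus the j-th value.

```python
def pairwise_differences(values):
    """Return the n x n matrix D with D[i][j] = values[i] - values[j]."""
    return [[a - b for b in values] for a in values]

# Hypothetical sample data, chosen only for illustration.
x = [1.0, 4.0, 9.0]
D = pairwise_differences(x)
for row in D:
    print(row)
```

For 2,000 points this produces 4 million entries, which is small enough to hold in memory, so a direct computation like this should take well under many hours.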
I'm not supposed to be working on this blog post right now. I've stayed late at the office under the pretense of working on "the book." It's the book about creating custom tasks for SAS Enterprise Guide, and I've been working on it for quite a while. I enjoy writing
On Friday, I posted an article about using spatial statistics to detect whether a pattern of points is truly random. That day, one of my colleagues asked me whether there are any practical applications of detecting spatial randomness or non-randomness. "Oh, sure," I replied, and rattled off a list of
When you pass a matrix as a parameter (argument) to a SAS/IML module, the SAS/IML language does not create a copy of the matrix. Creating a copy, an approach known as "calling by value," is inefficient. It is well-known that languages that implement call-by-value semantics suffer performance penalties. In the SAS/IML language, matrices
Millions of Americans will be gathering around the television this Sunday to watch Super Bowl XLV. They'll gather in bars and private homes, prepare billions of calories worth of snacks, and root for their favorite teams. But if you're looking for an alternate form of entertainment, why not watch "New
Last week I generated two kinds of random point patterns: one from the uniform distribution on a two-dimensional rectangle, the other by jittering a regular grid by a small amount. My show choir director liked the second method (jittering) better because of the way it looks on stage: there are
We live in a world of digital communications, where social media provides the global population with the opportunity to come together like never before. This has brought a whole new dimension to consumer interaction. It provides instant channels for information exchange, experience and opinion sharing. Social media and multichannel digital
One of my New Year's resolutions is to learn a new area of statistics. I'm off to a good start, because I recently investigated an issue which started me thinking about spatial statistics—a branch of statistics that I have never formally studied. During the investigation, I asked myself: Given an
Last week I presented a SAS Talks session for SAS programmers using SAS Enterprise Guide 4.3. It was well attended, which pleased me. You never know how it's going to go with a webinar. People register and sign in, but they are at their desks in their offices/cubicles/homes where distractions
Editor's Note: The following question was recently asked of our statistical training instructors. Terry Woodfield, along with Bob Lucas, took the time to write this eloquent and easily digestible answer. Question: I'm trying to get a general – very general – understanding what the Bayes theorem is, and is used
As Cat Truxillo points out in her recent blog post, some SAS procedures require data to be in a "long" (as opposed to "wide") format. Cat uses a DATA step to convert the data from wide to long format. Although there is nothing wrong with this approach, I prefer to
I sing in the SAS-sponsored VocalMotion show choir. It's like an adult version of Glee, except we have more pregnancies and fewer slushie attacks. For many musical numbers, the choreographer arranges the 20 performers on stage in an orderly manner, such as four rows of five singers. But every once
I have recently had the great opportunity to be a part of a very special project called the North Carolina Bio-Preparedness Collaborative (NCB-Prepared). It is a public-private partnership that includes the University of North Carolina at Chapel Hill (UNC), North Carolina State University, and SAS, with support from the US
Last week I talked about how I volunteered to serve as a judge for a middle-school science fair. As I expected, I enjoyed the experience quite a bit, and I hope the students got something positive from me as well. I evaluated several really impressive projects at the 7th grade
A histogram displays the number of points that fall into a specified set of bins. This blog post shows how to efficiently compute a SAS/IML vector that contains those counts. I stress the word "efficiently" because, as is often the case, a SAS/IML programmer has a variety of ways to
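The counting step described in this excerpt can be sketched in Python as follows. This is not the SAS/IML code from the post; the function name `bin_counts`, the cut points, and the convention that bins are closed on the left (with the last bin also closed on the right) are assumptions made for illustration.

```python
from bisect import bisect_right

def bin_counts(data, cutpoints):
    """Count points per bin [c0, c1), [c1, c2), ..., [c_{k-1}, c_k].

    Assumption: cutpoints is sorted; points outside the range are ignored.
    """
    n_bins = len(cutpoints) - 1
    counts = [0] * n_bins
    for x in data:
        if x < cutpoints[0] or x > cutpoints[-1]:
            continue  # outside the binning range
        # bisect_right locates the bin; min() keeps the right endpoint in the last bin
        idx = min(bisect_right(cutpoints, x) - 1, n_bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical data and cut points.
data = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 7.0]
counts = bin_counts(data, [0, 1, 2, 3])
print(counts)
```

Using a binary search per point avoids comparing every point against every bin, which is one way to keep the count efficient.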
Have you ever wanted to compute the exact value of a really big number such as 200! = 200*199*...*2*1? You can do it—if you're willing to put forth some programming effort. This blog post shows you how. Jiangtang Hu's recent blog discusses his quest to compute large factorials in many programming languages.
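To give a flavor of the "programming effort" involved, here is one possible Python sketch that stores the number as a list of decimal digits and multiplies digit by digit. It is an illustrative technique, not necessarily the method used in the post; the name `big_factorial` is hypothetical. (Python integers are already arbitrary precision, so `math.factorial` is used here only as a cross-check.)

```python
import math

def big_factorial(n):
    """Return n! as a decimal string, using schoolbook digit-array multiplication."""
    digits = [1]  # least significant digit first
    for k in range(2, n + 1):
        carry = 0
        for i in range(len(digits)):
            # multiply each digit by k and propagate the carry
            carry, digits[i] = divmod(digits[i] * k + carry, 10)
        while carry:
            # append any remaining carry as new high-order digits
            carry, d = divmod(carry, 10)
            digits.append(d)
    return "".join(str(d) for d in reversed(digits))

result = big_factorial(200)
print(len(result), "digits")
print(result == str(math.factorial(200)))  # cross-check against the built-in
```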
Who doesn’t like bargains? I’m sure you will all agree that good quality at a next-to-nothing cost is irresistible. My recent Dollarama run had me ecstatic about the gloves that come in all colours, styles and sizes for just over a dollar. (Fact: big retail stores charge over 10 times
The other day I needed to check that a sequence of numerical values was in strictly increasing order. My first thought was to sort the values and compare the sorted and original values, but I quickly discarded that approach because it does not detect duplicate values in a monotonic (nondecreasing)
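One way to test for strictly increasing order without sorting is to compare each value with its successor. The Python sketch below illustrates that idea; it is not the SAS/IML solution from the post, and the function name `is_strictly_increasing` is hypothetical.

```python
def is_strictly_increasing(values):
    """True if every element is strictly less than the one that follows it."""
    return all(a < b for a, b in zip(values, values[1:]))

print(is_strictly_increasing([1, 2, 3, 5]))   # strictly increasing
print(is_strictly_increasing([1, 2, 2, 5]))   # nondecreasing, but has a duplicate
```

Unlike the sort-and-compare approach, the adjacent comparison rejects a sequence that contains duplicates.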
It has become routine. For the 14th straight time – which is every year since its first publication in 1998 – SAS has made the Fortune “100 Best Companies to Work For” list. This includes eight appearances in the top ten, and in 2011, for the second year in a
In a previous post, I described ways to create SAS/IML vectors that contain uniformly spaced values. The methods did not involve writing any loops. This post describes how to perform a similar operation: creating evenly spaced values on a two-dimensional grid. The DATA step solution is simple, but an efficient
I'm not even at work yet, but I've already learned that SAS has been ranked as the #1 workplace on the Fortune 100 list for 2011. SAS was also number 1 last year in 2010, and has been high on the list since its inception. I'm sure there will be
Tomorrow I'll be taking a few hours away from work to build something important: the self-esteems of a handful of middle-school-aged children. I'm volunteering as a judge in a middle-school science fair. And even though I'm not a scientist ("computer science" isn't a category), I understand enough about physical science
"What is the chance that two people in a room of 20 share initials?" This was the question posed to me by a colleague who had been taking notes at a meeting with 20 people. He recorded each person's initials next to their comments and, upon editing the notes, was
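A back-of-the-envelope version of this question can be sketched in Python with the classic birthday-problem calculation. The sketch assumes that all 26 × 26 = 676 two-letter initials are equally likely, which is certainly not true of real names, so treat the result as a rough lower bound rather than the answer from the post. The function name `prob_shared_initials` is hypothetical.

```python
def prob_shared_initials(n_people, n_initials=26 * 26):
    """Probability that at least two of n_people share initials.

    Assumption: every pair of initials is equally likely (uniform model).
    """
    p_distinct = 1.0
    for i in range(n_people):
        # the (i+1)-th person must avoid the i sets of initials already taken
        p_distinct *= (n_initials - i) / n_initials
    return 1.0 - p_distinct

p = prob_shared_initials(20)
print(round(p, 3))
```

Because real initials are far from uniformly distributed, the true chance of a match in a room of 20 is higher than this uniform-model estimate.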
A colleague posted some data on his internal SAS blog about key trends in the US Mobile phone industry, as reported by comScore. He graciously shared the data so that I could create a graph that visualizes the trends. The plot visualizes trends in the data: the Android phone is
When your data are in rows, but you need them in columns, use the matrix transpose function or operator. The same advice applies to data in columns that you want to be in rows. For example, the vectors created by the DO function and the index creation operator are row
A colleague related the following story: He was taking notes at a meeting that was attended by a fairly large group of people (about 20). As each person made a comment or presented information, he recorded the two-letter initials of the person who spoke. After the meeting was over, he
SAS/IML software is often used for sampling and simulation studies. For simulating data from univariate distributions, the RANDSEED and RANDGEN subroutines suffice to sample from a wide range of distributions. (I use the terms "sampling from a distribution" and "simulating data from a distribution" interchangeably.) For multivariate simulations, the IMLMLIB
AUTOEXEC.SAS wasn't enough for you. Yes, it's a sure-fire way to run SAS statements (such as LIBNAME assignments or macro definitions) whenever you start your SAS session, but you found it has limitations when used in configurations with lots of users who connect with SAS Enterprise Guide. Limitations such as:
It is often useful to create a vector with elements that follow an arithmetic sequence. For example, {1, 2, 3, 4} and {10, 30, 50, 70} are vectors with evenly spaced values. This post describes several ways to create vectors such as these. The SAS/IML language has two ways to
Computing probabilities can be tricky. And if you are a statistician and you get them wrong, you feel pretty foolish. That's why I like to run a quick simulation just to make sure that the numbers that I think are correct are, in fact, correct. My last post of 2010