I hope that the following statement is not too controversial...but here it goes: Microsoft Excel is not a database system. That is, I know that people do use it as a database, but it's not an application that supports the rigor and discipline of managing data in the same way
This article shows how to randomly access data in a SAS data set by using the READ POINT statement in SAS/IML software. I have previously discussed how to use the READ NEXT and READ CURRENT statements to sequentially access each observation in a SAS data set from PROC IML. Reading
Andrew Ratcliffe posted a fine article titled "Inadequate Mends" in which he extols the benefits of including the name of a macro on the %MEND statement. That is, if you create a macro function named foo, he recommends that you include the name in two places: %macro foo(x); /** define
A fundamental operation in data analysis is finding data that satisfy some criterion. How many people are older than 85? What are the phone numbers of the voters who are registered Democrats? These questions are examples of locating data with certain properties or characteristics. The SAS DATA step has a
For years I've been making presentations about SAS/IML software at conferences. Since 2008, I've always mentioned to SAS customers that they can call R from within SAS/IML software. (This feature was introduced in SAS/IML Studio 3.2 and was added to the IML procedure in SAS/IML 9.22.) I also included a
When Charlie H. posted an interesting article titled "Top 10 most powerful functions for PROC SQL," there was one item on his list that was unfamiliar: the COALESCE function. (Edit: Charlie's blog no longer exists. The article used to be available at http://www.sasanalysis.com/2011/01/top-10-most-powerful-functions-for-proc.html) Ever since I posted my first response,
Last week the Flowing Data blog published an excellent visualization of the flight patterns of major US airlines. On Friday, I sent the link to Robert Allison, my partner in the 2009 ASA Data Expo, which explored airline data. Robert had written a SAS program for the Expo that plots
It's a simple task to use SAS to compute the number of weekdays between two dates. You can use the INTCK function with the WEEKDAY interval to come up with that number. diff = intck('WEEKDAY', start_date, end_date); If you want to compute the number of working days between two dates,
When I was at the annual SAS Global Forum conference, I had the pleasure of discussing statistical programming and SAS/IML software with dozens of SAS customers. I was asked at least ten times, "How do I get started with SAS/IML software?" or "How can I learn PROC IML?" Here is
This blog post shows how to numerically integrate a one-dimensional function by using the QUAD subroutine in SAS/IML software. The name "quad" is short for quadrature, which means numerical integration. You can use the QUAD subroutine to numerically find the definite integral of a function on a finite, semi-infinite, or
More than a month ago I wrote a first article in response to an interesting article by Charlie H. titled "Top 10 most powerful functions for PROC SQL." In that article I described SAS/IML equivalents to the MONOTONIC, COUNT, N, FREQ, and NMISS functions in PROC SQL. In this article,
The Spring 2011 issue of Foresight is now available. Here is Editor Len Tashman's preview: For forecasters, “being wrong” is the expectation; the hope is that we’re not too wrong. But admitting to our failures is never easy. The Spring 2011 issue leads off with Marcus O’Connor’s book review of
SAS Enterprise Guide is best known as an interactive interface to SAS, but did you know that you can use it to run batch-style programs as well? SAS Enterprise Guide has always offered an automation object model, which allows you to use scripting languages (such as VBScript or Windows PowerShell)
The most common way to read observations from a SAS data set into SAS/IML matrices is to read all of the data at once by using the ALL clause in the READ statement. However, the READ statement also has options that do not require holding all of the observations in
Congratulations to Curt Hinrichs and Chuck Boiler! Their book, JMP Essentials: An Illustrated Step-by-Step Guide for New Users, has won an Award of Distinguished Technical Communication in this year’s International Summit Awards presented by the Society for Technical Communication. The award goes to a project that “applies the principles of
In last week's article on how to create a funnel plot in SAS, I wrote the following comment: I have not adjusted the control limits for multiple comparisons. I am doing nine comparisons of individual means to the overall mean, but the limits are based on the assumption that I'm
Greg Nelson and Neil Howard presented a lunchtime keynote talk at SAS Global Forum, and they produced this video, "Revenge of the Semi-Colon People", to go along with it. The video features many people from the SAS community, including customers and SAS employees. Watch it and see if you know
The log transformation is one of the most useful transformations in data analysis. It is used as a transformation to normality and as a variance stabilizing transformation. A log transformation is often used as part of exploratory data analysis in order to visualize (and later model) data that ranges over
I've been walking around the last few days with what looks like a dollop of chocolate syrup or grape jelly on my chin. Alas, it is just a bruise from getting elbowed in the mouth at basketball last Thursday night. (Church leagues may be the only dirtier place to play
One of the advantages of programming in the SAS/IML language is its ability to transform data vectors with a single statement. For example, in data analysis, the log and square-root functions are often used to transform data so that the transformed data have approximate normality. The following SAS/IML statements create
When I encounter an ERROR, WARNING, or NOTE in my SAS log that I don't understand, my first recourse is to ask my friend (we'll call him "Google") what it could mean. I copy the entire message (or at least 5 or 6 consecutive words from it) into the search
Last week I showed how to create a funnel plot in SAS. A funnel plot enables you to compare the mean values (or rates, or proportions) of many groups to some other value. The group means are often compared to the overall mean, but they could also be compared to
A major news item this week is the New York Department of Health's labeling of children's games like Kickball, Wiffleball, Freeze Tag, Red Rover, and Steal the Bacon as dangerous. (Apparently Spin the Bottle, Truth or Dare, and Doctor are still ok?) Is this the continuing wussification of American youth?
This is a guest post from Jodi Blomberg, a Principal Technical Architect at SAS. She has over 12 years of experience in data mining and mathematical modeling, and has developed analytic models for many government agencies including child support enforcement, insurance fraud, intelligence-led policing, supply chain logistics and adverse
In our last installment, we learned that some information is not really necessary. When facilities management dyed the toilet water purple to remind us it is non-potable, it didn't affect my earlier decision not to drink out of the toilet. Sometimes the information we receive as forecasters is not really
Last week I presented the GSR algorithm, a statistical model of a riffle shuffle. In the model, a deck of n cards is split into two parts according to the binomial distribution. Each piece has roughly n/2 cards. Then cards are dropped from the two stacks according to the number
In a previous post, I showed how to read data from a SAS data set into SAS/IML matrices or vectors. This article shows the converse: how to use the CREATE, APPEND, and CLOSE statements to create a SAS data set from data stored in a matrix or in vectors. Creating
Unless you’ve been living under a rock, you’ve heard about the budget problems running rampant across all levels of government. Federal, state, and local governments are all facing historic budget shortfalls due to the economic crisis and decreased tax receipts. This has led to a much closer examination of services
On March 28 I had the pleasure of moving to our new office building on the scenic SAS campus in Cary, NC. This aesthetic and functional structure houses the sales, marketing, and SAS executive management offices, as well as a generously appointed Executive Briefing Center for hosting our visiting customers.
In a previous blog post, I showed how you can use simulation to construct confidence intervals for ranks. This idea (from a paper by E. Marshall and D. Spiegelhalter) enables you to display a graph that compares the performance of several institutions, where "institutions" can mean schools, companies, airlines, or