In a previous post I showed how to implement Stewart's (1980) algorithm for generating random orthogonal matrices in SAS/IML software. By using the algorithm, it is easy to generate a random matrix that contains a specified set of eigenvalues. If D = diag(λ1, ..., λp) is a diagonal matrix and
The March 28 edition of APICS extra features an article by Fred Tolbert on "The Seven Deadly Sins of Sales Forecasting." Although I have some objection to his Deadly Sin #1: Using Shipment History (and will discuss the objection in a forthcoming guest-post on the Institute of Business Forecasting blog),
Even Rod Serling recognized that sometimes we can't forecast worth a darn. "The Rip Van Winkle Caper" is an episode from Season 2 of the television series The Twilight Zone, and it first aired in 1961. It involves four train robbers who steal a million dollars' worth of gold bars, hide
Because I am writing a new book about simulating data in SAS, I have been doing a lot of reading and research about how to simulate various quantities. Random integers? Check! Random univariate samples? Check! Random multivariate samples? Check! Recently I've been researching how to generate random matrices. I've blogged
With the publication of JSL Companion: Applications of the JMP Scripting Language, by Theresa Utlaut, Georgia Morgan, and Kevin Anderson, novice scripters now have a resource that helps them go beyond the basics of the JMP Scripting Language (JSL). Why JSL? The authors have the answers: 1. Easy to start
SAS Global Forum 2012 is right around the corner. If you will be in Orlando, too, be sure to say hello! If you have ideas for improving SAS/IML software or you would like to discuss my blog, please visit me during my hours at the SAS/IML booth in the Demo
The fundamental units in the SAS/IML language are matrices and vectors. Consequently, you might wonder about conditional expressions such as if v>0 then.... What does this expression mean when v contains more than a single element? Evaluating vector expressions: When you test a vector for some condition, expressions like v>0
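To make the question above concrete, here is a minimal sketch in Python rather than SAS/IML. It assumes the convention that, as I understand the SAS/IML documentation, an IF condition on a vector is true only when every element satisfies the condition. The helper names `all_positive` and `any_positive` are hypothetical and exist only for this illustration.

```python
def all_positive(values):
    """True only when every element is greater than zero (the "all" reading)."""
    return all(x > 0 for x in values)


def any_positive(values):
    """True when at least one element is greater than zero (the "any" reading)."""
    return any(x > 0 for x in values)


v = [1, -2, 3]
print(all_positive(v))  # the "all" reading of v>0
print(any_positive(v))  # the "any" reading of v>0
```

The two readings disagree whenever the vector mixes positive and nonpositive elements, which is why it helps to state explicitly which one you intend.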
Did that set off a trigger for you? It did for my SAS SQL 1: Essentials class, packed with SQL and SAS programmers alike. To clarify matters, I pulled up some examples that show the differences quickly. Set operators and Joins are similar in that they both combine multiple
Earlier this week I described a common programming pattern in the SAS macro language. The pattern sets up a loop for processing each distinct value of a classification variable. The program uses the PROC SQL SELECT INTO feature to populate SAS macro variables. The effect: you can roll your own
As public safety officials leaf through their favorite criminal justice periodical they are greeted with pages and pages of analytics advertisements. These ads are laden with promises of robust and scalable solutions, improved efficiencies and, yes, the promise of prediction. While reading the advertisements, the mental conversation may go something
After my post on detecting outliers in multivariate data in SAS by using the MCD method, Peter Flom commented "when there are a bunch of dimensions, every data point is an outlier" and remarked on the curse of dimensionality. What he meant is that most points in a high-dimensional cloud
Covariance, correlation, and distance matrices are a few examples of symmetric matrices that are frequently encountered in statistics. When you create a symmetric matrix, you only need to specify the lower triangular portion of the matrix. The VECH and SQRVECH functions, which were introduced in SAS/IML 9.3, are two functions
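The idea of storing only the lower triangle can be sketched as follows. This is Python, not SAS/IML, and the functions `vech` and `sqrvech` are hypothetical stand-ins written for this illustration; they are not the SAS/IML functions themselves. The sketch assumes the lower-triangular elements are stacked column by column.

```python
from math import isqrt


def vech(matrix):
    """Stack the lower-triangular elements of a square matrix, column by column."""
    n = len(matrix)
    return [matrix[i][j] for j in range(n) for i in range(j, n)]


def sqrvech(values):
    """Rebuild a symmetric matrix from its column-stacked lower triangle."""
    # A symmetric n x n matrix has n(n+1)/2 lower-triangular elements.
    n = (isqrt(8 * len(values) + 1) - 1) // 2
    if n * (n + 1) // 2 != len(values):
        raise ValueError("length must equal n(n+1)/2 for some integer n")
    sym = [[0] * n for _ in range(n)]
    k = 0
    for j in range(n):
        for i in range(j, n):
            sym[i][j] = sym[j][i] = values[k]  # mirror across the diagonal
            k += 1
    return sym


lower = [1, 2, 3, 4, 5, 6]
full = sqrvech(lower)
print(full)        # [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
print(vech(full))  # [1, 2, 3, 4, 5, 6]
```

Six numbers are enough to describe the full 3 x 3 symmetric matrix, and the round trip returns the original vector.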
How to write a SAS macro program to repeat your SAS processing for each value of a BY grouping variable.
Art Carpenter’s newest book, Carpenter’s Guide to Innovative SAS Techniques, offers advanced SAS programmers an all-in-one programming reference that includes advanced topics not easily found outside the depths of SAS documentation or more advanced training classes. No matter how you approach the use of SAS software, the techniques provided in
The SAS/IML language supports both row vectors and column vectors. This is useful for performing linear algebra, but it can cause headaches when you are writing a SAS/IML module. I want my modules to be able to handle both row vectors and column vectors. I don't want the user to
When the data is classified by multiple class variables, you can certainly create graphs using BY variables. This results in separate graphs, one for each level of the BY variable crossings. Each graph is scaled by its own data subset, and comparisons across BY levels are harder. When comparisons need to be
When you are constantly joining the same data tables before you can begin working on your reports or analysis, it might be time to consider creating permanent views. Then you can just add the view to the Enterprise Guide project rather than dealing with the joins in a Query Builder
A recent discussion on the SAS-L discussion forum concerned how to implement linear interpolation in SAS. Some people suggested using PROC EXPAND in SAS/ETS software, whereas others proposed a DATA step solution. For me, the SAS/IML language provides a natural programming environment to implement an interpolation scheme. It also provides
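For readers who want to see the basic scheme, here is a minimal sketch of piecewise linear interpolation. It is written in Python rather than SAS/IML or the DATA step, and it is not the approach from the SAS-L thread. The function name `linear_interp` is hypothetical, and the sketch assumes the x values are strictly increasing and that no extrapolation is wanted.

```python
from bisect import bisect_right


def linear_interp(xs, ys, x):
    """Linearly interpolate y at x from points (xs[i], ys[i]); xs strictly increasing."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the data range; no extrapolation")
    # Index of the right endpoint of the interval that contains x.
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    t = (x - x0) / (x1 - x0)  # fraction of the way from x0 to x1
    return y0 + t * (y1 - y0)


print(linear_interp([0, 1, 3], [0, 2, 6], 2))  # 4.0
```

The binary search keeps each lookup at logarithmic cost, which matters only when there are many data points.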
A well-formed WHERE statement or subsetting IF can narrow down the output of your SAS DATA step. The SAS log does a good job of telling you how many records were processed by the action. For example, let's look at this simple DATA step with my "poor man's random sample",
A few weeks ago, in Northern Virginia, a 30 foot highway sign fell onto I-66 and landed on a passing pickup truck. Fortunately, no one was hurt, but it drew media attention and caused motorists in the area to wonder about the safety of other signs and the transportation network
There is something that 90% of us admit to doing, and the other 10% will lie about. That, of course, is Googling yourself. As an avid follower of myself, and everything I do, I look forward to a weekly Google Alert that tells me all about what I've been up to.
After I was unwittingly drawn into a code vs. GUI discussion recently, another pro-GUI vote came in yesterday while I was presenting to a customer's internal user group. When creating and using prompts in SAS Enterprise Guide, it is a no-brainer to recommend leveraging the %_eg_WhereParam as it handles all the special
Most statistical programmers have seen a graph of a normal distribution that approximates a binomial distribution. The figure is often accompanied by a statement that gives guidelines for when the approximation is valid. For example, if the binomial distribution describes an experiment with n trials and the probability of success
Did you oversleep this morning? If you live in the United States of America, Monday morning seems to have arrived just a bit earlier, accompanied by a bit more "dark" than usual. That's because as good time-fearing citizens, we have all set our clocks ahead by one hour so as
SAS provides several ways to compute sample quantiles of data. The UNIVARIATE procedure can compute quantiles (also called percentiles), but you can also compute them in the SAS/IML language. Prior to SAS/IML 9.22 (released in 2010) statistical programmers could call a SAS/IML module that computes sample quantiles. With the release
In the United States, this upcoming weekend is when we turn our clocks forward one hour as we adopt daylight saving time. (Some people will also flip their mattresses this weekend!) Daylight saving time (DST) in the US begins on the second Sunday in March and ends on the first
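The "second Sunday in March" rule is easy to compute. The following is a minimal sketch in Python rather than SAS; the function name `second_sunday_in_march` is hypothetical and is used only for this illustration.

```python
from datetime import date, timedelta


def second_sunday_in_march(year):
    """Return the date of the second Sunday in March for the given year."""
    march_first = date(year, 3, 1)
    # date.weekday() returns Monday=0 through Sunday=6.
    days_to_first_sunday = (6 - march_first.weekday()) % 7
    return march_first + timedelta(days=days_to_first_sunday + 7)


print(second_sunday_in_march(2012))  # 2012-03-11
```

The modulo handles the case in which March 1 is itself a Sunday, so the first Sunday is found with an offset of zero days.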
Apparently the prolonged use of OxyContin will give you a pompous and surly demeanor, and make you say a lot of really ignorant things. So I implore you, dear readers, to withhold your use of such a substance, preserve your good attitude and brain cells, and participate in a research study
A few years ago I had the privilege of presenting the last technical paper at SAS Global Forum. This year, conference chair Andy Kuligowski asked me to go one better than that, and present a talk at the official Closing Session. What will I talk about? That's a mystery (maybe
During IFSUG yesterday, Sunil Gupta gave the attendees of his presentation a special homework assignment: look into the SAS Enterprise Guide task 'Characterize Data'. Sunil suggested that this was a simple approach to quickly getting a summary of all the variables within your data table. Of course, some programmers will use
I work with continuous distributions more often than with discrete distributions. Consequently, I am used to thinking of the quantile function as being an inverse cumulative distribution function (CDF). (These functions are described in my article, "Four essential functions for statistical programmers.") For discrete distributions, they are not. To quote
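A small numerical sketch shows why the quantile function is not a true inverse of the CDF for a discrete distribution. It is written in Python rather than SAS, and it assumes the common convention that the quantile for probability p is the smallest value whose CDF is at least p; the excerpt does not confirm which convention the article uses. The function names `binom_cdf` and `binom_quantile` are hypothetical.

```python
from math import comb


def binom_cdf(k, n, prob):
    """P(X <= k) for X ~ Binomial(n, prob)."""
    return sum(comb(n, i) * prob**i * (1 - prob) ** (n - i) for i in range(k + 1))


def binom_quantile(p, n, prob):
    """Smallest k with P(X <= k) >= p (assumed convention for a discrete quantile)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * prob**k * (1 - prob) ** (n - k)
        if cumulative >= p:
            return k
    return n  # guards against floating-point shortfall when p is 1


k = binom_quantile(0.5, n=2, prob=0.5)
print(k)                     # 1
print(binom_cdf(k, 2, 0.5))  # 0.75, not 0.5
```

Applying the CDF to the quantile returns 0.75 rather than the original 0.5, because the CDF jumps from 0.25 to 0.75 and never takes the value 0.5.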