One aspect of blogging that I enjoy is getting feedback from readers. Usually I get statistical or programming questions, but every so often I receive a comment from someone who stumbled across a blog post by way of an internet search. This morning I received the following delightful comment on
If you want to extract values from a SAS/IML vector, use the subscripting operation, such as in the following example: proc iml; x = {A B C D E}; y = x[{1 2 3}]; /* {A,B,C} */ The vector y contains the first three elements of x. However, did you
Mean Absolute Percent Error (MAPE) is the most commonly used forecasting performance metric and, for good reason, the most disparaged. When we compute the absolute percent error the usual way, as APE = | Forecast - Actual | / Actual, the result is undefined when Actual = 0. It can also lead to
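To make the zero-denominator problem concrete, here is a minimal Python sketch of the APE formula quoted above. The function names are hypothetical, and the choice to return NaN and average only the defined terms is an assumption made for illustration, not a recommendation from the post.

```python
import math

def absolute_percent_error(forecast, actual):
    # APE = |Forecast - Actual| / Actual, as in the text.
    # The ratio is undefined when Actual = 0, so NaN is returned (assumed policy).
    if actual == 0:
        return math.nan
    return abs(forecast - actual) / actual

def mape(forecasts, actuals):
    # Mean of the defined APEs only; periods with Actual = 0 are skipped.
    # Skipping is one of several possible policies and is an assumption here.
    apes = [absolute_percent_error(f, a) for f, a in zip(forecasts, actuals)]
    defined = [e for e in apes if not math.isnan(e)]
    return sum(defined) / len(defined) if defined else math.nan
```

Skipping the undefined periods keeps the average computable, but it also hides exactly the periods with zero demand, so the policy should be stated whenever such a figure is reported.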
You can still get a paper proposal in for SAS Global Forum 2012. All you need is an idea. And probably some data. And also, some techniques for analyzing that data. Oh, and some conclusions would be helpful as well. I know: you are a busy person! You might not
Sometimes you want to label only certain observations in a plot. This is useful in many ways, but one use is to label outliers on a scatter plot. In the SGPLOT procedure, the DATALABEL= option enables you to specify the name of a variable that is used to label observations.
The popular mailing list for the SAS user community hits a milestone this weekend by turning 25. 25 is often referred to as the "silver anniversary", but for a quarter century SAS users have found gold among the messages in this list, which feature everything from questions and answers about
Drill-through to detail is the ability to right click within a cell of a web report or OLAP viewer and request the detail source records that make up that specific cell's measure. The maximum number of records, by default, is set to 300,000. Feasibly the report user could download all
I was at the Wikipedia site the other day, looking up properties of the Chi-square distribution. I noticed that the formula for the median of the chi-square distribution with d degrees of freedom is given as ≈ d(1-2/(9d))^3. However, there is no mention of how well this formula approximates the
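One way to probe the quality of this approximation is to compare it with a reference value. The Python sketch below (standard library only, with hypothetical function names) does this in two ways: for d = 2 the chi-square distribution is exponential with mean 2, so the exact median is 2 ln 2, and for other d it estimates the median by simulation, using the fact that a chi-square variate with d degrees of freedom is a gamma variate with shape d/2 and scale 2. The sample size and seed are arbitrary assumptions.

```python
import math
import random
import statistics

def approx_chisq_median(d):
    # Approximation quoted in the text: d * (1 - 2/(9d))^3.
    return d * (1 - 2 / (9 * d)) ** 3

def simulated_chisq_median(d, n=100_000, seed=1):
    # Chi-square(d) is Gamma(shape=d/2, scale=2); take the sample median.
    rng = random.Random(seed)
    return statistics.median(rng.gammavariate(d / 2, 2) for _ in range(n))

# Exact median for d = 2 (exponential distribution with mean 2).
exact_median_d2 = 2 * math.log(2)
```

For d = 2 the approximation gives 1024/729, about 1.4047, against an exact value of about 1.3863. A simulated median carries its own sampling error, so it is only a rough yardstick for larger d.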
Celebrity fame (I'm told) is overrated. Do you really want hordes of people to recognize you in the shopping mall or while you wait at a red light? Of course you don't. And that's why I advise you to never win the American Idol competition (nor should you lose with
Sometimes you can't forecast worth a darn because something is just not forecastable. Being "unforecastable" doesn't mean you can't create a forecast, because you can always create a forecast. It just means there is so much instability or randomness in your demand patterns that even sophisticated forecasting methods don't help
JSM, Miami Beach, FL, July 31–August 3: Miami Beach in August is hot. Ridiculously hot. Almost as hot as our preview copies at this show. Conference goers were extremely excited about a number of our upcoming statistics titles, including Customer Segmentation and Clustering Using SAS® Enterprise Miner™, Second Edition, by
What do you call an interview on Twitter? A Tw-interview? A Twitter-view? Regardless of what you call it, I'm going to be involved in a "live chat" on Twitter this coming Thursday, 10NOV2011, 1:30–2:00pm ET. The hashtag is #saspress. Shelly Goodin (@SASPublishing) and SAS Press author recruiter Shelley Sessoms (@SSessoms)
Life Lesson from a Black Eye: I'm fond of arguing that Plato is the father of philosophy. Apparently that is the wrong argument to make, in a bar, with a stranger, when said stranger takes the opposing view. (And I thought politics, religion, and his mother were the only things never
Last week I showed how to use the UNIQUE-LOC technique to iterate over categories in a SAS/IML program. The observant reader might have noticed that the algorithm, although general, could be made more efficient if the data are sorted by categories. The UNIQUEBY Technique: Suppose that you want to compute
Welcome to this new blog on data visualization at SAS. Our goal is to engage with you on a discussion about analytical and business graphics for reporting and interactive applications. Our primary focus will be on ODS Graphics and related topics, but we look forward to a lively discussion on all things
Among the suitable-for-blog-publication-without-risking-my-job definitions of masochism is this: "A willingness or tendency to subject oneself to unpleasant or trying experiences." So to be a forecaster, must you also be a masochist? Few people enjoy the difficulties and degradation that go with being a forecaster, so few are willing to do it
On September 10, 2001, I was attending a law enforcement conference in Atlantic City, NJ. While I have attended hundreds of similar meetings, this conference stands out for several reasons. First, and most obvious, it was the eve of the day where most of our lives were indelibly altered. Second,
Being able to reshape data is a useful skill in data analysis. Most of the time you can use the TRANSPOSE procedure or the SAS DATA step to reshape your data. But the SAS/IML language can be handy, too. I only use PROC TRANSPOSE a few times per year, so
This blog post is a "mashup" of a couple of my previous posts, combining the lessons to create something brand new that I hope you will find useful. First, let's review what we know: SAS Enterprise Guide supports a scriptable object model, which allows you to write scripts or programs
A common mistake in bad or misused software is choosing a forecasting model based solely on the model’s “fit to history” (often referred to as “best fit” or “pick best” functionality). The software provides (or the forecaster builds) several competing models which are then evaluated against recent history. The model
I was recently asked the following question: I am using bootstrap simulations to compute critical values for a statistical test. Suppose I have a test statistic for which I want a p-value. How do I compute this? The answer to this question doesn't require knowing anything about bootstrap methods. An equivalent
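As a rough illustration of the question, one common convention is to estimate a p-value as the proportion of simulated statistics that are at least as extreme as the observed one. The Python sketch below assumes a right-tailed test and that the simulated values were generated under the null hypothesis. The function name is hypothetical, and this is not necessarily the answer the post goes on to give.

```python
def empirical_p_value(observed, simulated):
    # Right-tailed empirical p-value: the share of simulated statistics
    # that are greater than or equal to the observed statistic.
    # Assumes `simulated` is a non-empty sequence drawn under the null.
    count = sum(1 for s in simulated if s >= observed)
    return count / len(simulated)
```

For example, `empirical_p_value(2.5, [0.1, 1.0, 2.0, 3.0, 4.0])` counts two of five simulated values at or above 2.5. A left-tailed or two-sided test would need a different comparison.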
When you analyze data, you will occasionally have to deal with categorical variables. The typical situation is that you want to repeat an analysis or computation for each level (category) of a categorical variable. For example, you might want to analyze males separately from females. Unlike most other SAS procedures,
Of course, forecasting the stock market is not perfectly analogous to forecasting demand for a product. The asking price for a stock is largely "anchored" by the price of its most recent trades. While market values may appear to randomly drift up and down, or in a general direction, we generally
In this second of three flash reports from last week's Analytics 2011 conference, we hear about a favorite topic of mine: the relationship between demand volatility and forecastability. Rob Miller of Avantor Performance Materials, on Forecastability and Demand Volatility: The "comet chart," illustrating the relationship between demand volatility and forecast
We'll interrupt the series on Why Forecasts are Wrong, with a report from the inaugural Analytics 2011 conference, held last week in Orlando. A2011 drew over 1025 attendees (from 44 states and over 25 countries). The Analytics conference series features a wide range of topics (including forecasting, optimization, data mining, text
In SAS/IML 9.22 and beyond, you can call the R statistical programming language from within a SAS/IML program. The syntax is similar to the syntax for calling SAS from SAS/IML: You use a SUBMIT statement, but add the R option: SUBMIT / R. All statements in the program between the
This week's featured tip is from master SAS user Art Carpenter and his classic book Carpenter's Complete Guide to the SAS REPORT Procedure. In his review for the book, Rick Mitchell, senior systems analyst at Westat, said "I am green with envy for the newest generation of SAS programmers because I wish that I had had this book in
"I think that my data are exponentially distributed, but how can I check?" I get asked that question a lot. Well, not specifically that question. Sometimes the question is about the normal, lognormal, or gamma distribution. A related question is "Which distribution does my data have," which was recently discussed
Years ago and a seemingly far galaxy away, I wrote about how to modify 9.1.3 to start Enterprise Guide users in a different location for the File folder. By default, the user only can access their personal SAS Temporary File. Why change this? I would prefer to use a central
I was contacted by SAS Technical Support regarding a customer who was trying to use SAS/IML to compute quantiles of the folded normal distribution. I had heard of the distribution, but it is not built into SAS and I had never worked with it. Nevertheless, I set out to understand
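For readers who have not met the folded normal distribution: if X is normal with mean mu and standard deviation sigma, then |X| is folded normal, and its CDF at y ≥ 0 is Φ((y - mu)/sigma) - Φ((-y - mu)/sigma). The Python sketch below inverts that CDF by bisection to get quantiles. It is an illustrative stand-in, not the SAS/IML approach taken in the post, and the function names, bracketing rule, and iteration count are assumptions.

```python
from statistics import NormalDist

def folded_normal_cdf(y, mu=0.0, sigma=1.0):
    # P(|X| <= y) for X ~ N(mu, sigma); zero for y <= 0.
    if y <= 0:
        return 0.0
    nd = NormalDist(mu, sigma)
    return nd.cdf(y) - nd.cdf(-y)

def folded_normal_quantile(p, mu=0.0, sigma=1.0):
    # Invert the CDF by bisection (a generic root-finding choice).
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    lo, hi = 0.0, abs(mu) + sigma
    while folded_normal_cdf(hi, mu, sigma) < p:  # expand until the root is bracketed
        hi *= 2
    for _ in range(200):  # fixed iteration count, far more than needed
        mid = (lo + hi) / 2
        if folded_normal_cdf(mid, mu, sigma) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

With mu = 0 and sigma = 1 this reduces to the half-normal case, where the median equals the 0.75 quantile of the standard normal distribution, which gives a convenient sanity check.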