Edward Vonesh is a managing member of Vonesh Statistical Consulting, LLC, as well as a part-time employee of Northwestern University, where he supports research in his capacity as professor in the Department of Preventive Medicine. Needless to say, he knows a thing or two about generalized linear and nonlinear models.
The Perils Revisited
A few posts ago I warned of the perils of forecasting benchmarks, and why they should not be used to set your forecasting performance objectives: Can you trust the data? Is measurement consistent across the respondents? Is the comparison relevant? In addition to a general suspicion about
When I studied math in school, I learned that the expression a (mod q) is always an integer between 0 and q – 1 for integer values of a and q. It's a nice convention, but SAS and many other computer languages allow the result to be negative if a (or q) is
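To make the difference concrete, here is a minimal sketch of the behavior in a DATA step. The values -7 and 3 and the variable names are chosen only for illustration, and the "add the modulus and take MOD again" step is one common workaround, not necessarily the one the post goes on to recommend.

```sas
data _null_;
   a = -7;                            /* illustrative negative dividend */
   q = 3;                             /* illustrative positive modulus */
   r     = mod(a, q);                 /* SAS MOD keeps the sign of a, so r = -1 */
   r_pos = mod(mod(a, q) + q, q);     /* shift into the range 0..q-1, so r_pos = 2 */
   put r= r_pos=;
run;
```

The second expression assumes q is positive; it returns the school-math remainder regardless of the sign of a.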
This week's SAS tip is from A. John Bailer and his book Statistical Programming in SAS. A Fellow of the American Statistical Association, John has been using SAS for 30 years. He's also Distinguished Professor and Chair of the Department of Statistics at Miami University. To read a free chapter and user reviews
Leaving Las Vegas
Prince Harry, who recently gambled away a handful of the royal family jewels during a high-stakes billiards game, doesn't have to be the only person to leave Las Vegas with some important lessons learned. You can, too, by attending the Analytics2012 conference at Caesars Palace, October 8-9. Learnings
Regular expressions provide a powerful method to find patterns in a string of text. However, the syntax for regular expressions is somewhat cryptic and difficult to devise. This is why, by my reckoning, approximately 97% of the regular expressions used in code today were copied and pasted from somewhere else.
The other day I was using PROC SGPLOT to create a box plot and I ran a program that was similar to the following: proc sgplot data=sashelp.cars; title "Box Plot: Category = Origin"; vbox Horsepower / category=origin; run; An hour or so later I had a need for another box
Facebook has millions of users, and therefore when people share an interesting graph on Facebook it can "go viral" and millions of people might see it. Some of the graphs are obviously a bit biased - especially ones that are trying to sway your opinion one way or another on a topic
The SAS/IML language supports user-defined functions (also called modules). Many SAS/IML programmers know that you can use the RETURN function to return a value from a user-defined function. For example, the following function returns the sum of each column of a matrix: proc iml; start ColSum(M); return( M[+, ] ); /*
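Because the excerpt cuts off mid-program, here is a self-contained sketch of the same idea. The module body matches the fragment quoted in the teaser; the FINISH statement, the test matrix x, and the PRINT call are added for illustration and are not taken from the original post.

```sas
proc iml;
/* user-defined function module: returns one sum per column */
start ColSum(M);
   return( M[+, ] );      /* the + subscript reduction sums down the rows */
finish;

x = {1 2 3,
     4 5 6};              /* illustrative 2 x 3 matrix */
s = ColSum(x);            /* s is the row vector {5 7 9} */
print s;
quit;
```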
It is common to want to extract the lower or upper triangular elements of a matrix. For example, if you have a correlation matrix, the lower triangular elements are the nontrivial correlations between variables in your data. As I've written before, you can use the VECH function to extract the
The project that I'm currently working on requires several input data tables, and those tables must have a specific schema. That is, each input table must contain columns of a specific name, type, and length in order for the rest of the system to function correctly. The schema requirements aren't
Recently, there has been a lot of uproar and confusion about the Supreme Court ruling on the constitutionality of the Affordable Care Act. Many were surprised by the ruling, and others, while happy it was upheld, are concerned about the constitutional questions that arose due to the way the ruling was
I'm working on a SAS programming project with a large team. Each team member is responsible for a piece of the overall system, and the "contract" for how it all fits together is The Data. For example, I've got a piece that performs some data manipulation and produces several output
In the course of my job, I get to have a lot of conversations with authors about their books. One of the aspects of those conversations I enjoy most is learning about their areas of expertise and knowledge—that could be certain SAS software or programming techniques, particular fields of analytics,
Sometimes a small option can make a big difference. Last week I thought to myself, "I wish there were an option that prevents variable labels from appearing in a table or graph." Well, it turns out that there is! I was using PROC MEANS to display some summary statistics, and
Do you use SAS for analytics and Microsoft Excel for graphs? Why not use SAS for your graphs too?!? Then you could completely automate the entire process in one SAS program, with no manual steps! A lot of people use Excel to create their graphs because "it's what they know." What if somebody
I've seen analyses of Fisher's iris data so often that sometimes I feel like I can smell the flowers' scent. However, yesterday I stumbled upon an analysis that I hadn't seen before. The typical analysis is shown in the documentation for the CANDISC procedure in the SAS/STAT documentation. A (canonical)
When the Western Users of SAS Software gather in Long Beach, CA this September, I'll be proud to be counted among the WUSSers. (You can learn more about WUSS here; don't look here.) The WUSS organizers must have some serious clout, because the line-up of presenters reads like a "Who's
The SAS System provides users with the ability to create, store and access custom functions using the Function Compiler (FCMP) procedure. Once defined with PROC FCMP, a user-defined function can be used, or called, just like any other SAS function in the SAS System. This powerful capability gives users the
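As a rough sketch of that workflow, the following defines a function with PROC FCMP, stores it in a library, and then calls it from a DATA step. The function name c_to_f, the package path work.funcs.demo, and the temperature conversion itself are hypothetical choices made for this example only.

```sas
/* define and store a custom function (names are hypothetical) */
proc fcmp outlib=work.funcs.demo;
   function c_to_f(celsius);
      return (celsius * 9 / 5 + 32);   /* Celsius to Fahrenheit */
   endsub;
run;

/* tell SAS where to look for compiled functions */
options cmplib=work.funcs;

data _null_;
   f = c_to_f(100);                    /* called like a built-in function */
   put f=;                             /* writes f=212 to the log */
run;
```

The CMPLIB= system option is what makes the stored function visible to the DATA step; without it the call to c_to_f would not resolve.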
Everyone in the world has their attention turned towards the Olympics this week, so what better topic to tie in to a SAS/GRAPH blog than that?!?! I had seen a graph on the Guardian website that I thought was interesting, so I decided to try to create my own (slightly different)
A comment to last week's article on "How to get data values out of ODS graphics" indicated that the technique would be useful for changing the title on an ODS graph "without messing around with GTL." You can certainly use the technique for that purpose, but if you want to
Many SAS procedures can produce ODS statistical graphics as naturally as they produce tables. Did you know that it is possible to obtain the numbers underlying an ODS statistical graph? This post shows how. Suppose that a SAS procedure creates a graph that displays a curve and that you want
My sleep patterns are erratic (and somewhat torturous) – they range from sleeping solidly for eight hours a clip to me wandering aimlessly about the house at 3am. Unfortunately, the latter was the reality during the wee hours of Friday, July 20; I was up watching ESPN (my typical late
Hopefully you know that a gif animation can be used for more than just showing a cartoon animal doing cute tricks! Being a savvy data-meister, I'm sure you are also aware that you can use gif animations to see how data changes over time. But perhaps you didn't know you could
If you need to calculate the mean, sum, standard deviation, or frequency count for a variable, you'll find it pretty easy to accomplish in SAS Enterprise Guide. The corresponding tasks in the menus have names like "Summary Statistics" or "One-way Frequencies". Obvious, right? Often, researchers or students have a quest
I received the following question: In the DATA step I always use the ** operator to raise a value to a power, like this: x**2. But on your blog you use the ## operator to raise values to a power in SAS/IML programs. Does SAS/IML not support the **
Uncontrolled product proliferation can have bad consequences, and these are well recognized. There is certainly extra cost and complexity in managing more SKUs (rather than fewer SKUs). And it is unlikely that each new offering adds entirely incremental volume. Instead, the increased product overlap just leads to increased self-cannibalization. We
Fire department operations are very complex, with multi-faceted missions that include not only fire prevention and suppression, but emergency response and fire inspections. These must be coordinated with area growth and development decisions, and water system management decisions. When a fire or an emergency occurs, the right equipment, with the right people,
When working with "big data" you usually have too many points to view in a plot, and end up subsetting or summarizing the data. But now, in SAS 9.3, you have an alternative! For example, the following scatter plot of 10,000+ points is just a visual "blob": But using a new
Last week I wrote an article in which I pointed out that many SAS programmers write a simulation in SAS by writing a macro loop. This approach is extremely inefficient, so I presented a more efficient technique. Not only is the macro loop approach slow, but there are other undesirable