Jittering. To a statistician, it is more than what happens when you drink too much coffee. Jittering is the act of adding random noise to data in order to prevent overplotting in statistical graphs. Overplotting can occur when a continuous measurement is rounded to some convenient unit. This has the
Tag: SAS Programming
In my statistical analysis of coupons article, I presented a scatter plot that includes the identity line, y=x. This post describes how to write a general program that uses the SGPLOT procedure in SAS 9.2. By a "general program," I mean that the program produces the result based on the
This is Part 4 of my response to Charlie Huang's interesting article titled Top 10 most powerful functions for PROC SQL. As I did for eaerlier topics, I will examine one of the "powerful" SQL functions that Charlie mentions and show how to do the same computation in SAS/IML software.
When Charlie H. posted an interesting article titled "Top 10 most powerful functions for PROC SQL," there was one item on his list that was unfamiliar: the COALESCE function. (Edit: Charlie's blog no longer exists. The article used to be available at http://www.sasanalysis.com/2011/01/top-10-most-powerful-functions-for-proc.html) Ever since I posted my first response,
Last week I showed how to create a funnel plot in SAS. A funnel plot enables you to compare the mean values (or rates, or proportions) of many groups to some other value. The group means are often compared to the overall mean, but they could also be compared to
String comparisons in SAS software are case-sensitive. For example, the uppercase letter "F" and lowercase letter "f" are treated as unique characters. When these two letters represent the same condition (for example, a female patient), the strings need to be handled in a case-insensitive manner, and a SAS programmer might
The Flowing Data blog posted some data about how much TV actors get paid per episode. About a dozen folks have created various visualizations of the data (see the comments in the Flowing Data blog), several of them very glitzy and fancy. One variable in the data is a categorical
As Cat Truxillo points out in her recent blog post, some SAS procedures require data to be in a "long" (as opposed to "wide") format. Cat uses a DATA step to convert the data from wide to long format. Although there is nothing wrong with this approach, I prefer to
A colleague related the following story: He was taking notes at a meeting that was attended by a fairly large group of people (about 20). As each person made a comment or presented information, he recorded the two-letter initials of the person who spoke. After the meeting was over, he
The Junk Chart blog discusses problems with a chart which (poorly) presents statistics on the prevalence of shark attacks by different species. Here is the same data presented by overlaying two bar charts by using the SGPLOT procedure. I think this approach works well because the number of deaths is