When I read Robert Allison's article about the cost of a taxi ride in New York City, I was struck by the scatter plot that plots the tip amount against the total bill for 12 million taxi rides. The graph clearly reveals diagonal and
My previous post highlighted presentations at SAS Global Forum 2016 that heavily used SAS/IML software. Several of the authors clearly want to share their work with the wider SAS analytical community. They include their SAS/IML program in an appendix or mention a web site or email address from which the
Last week I attended SAS Global Forum 2016 in Las Vegas. More than 5,000 other attendees and I discussed and shared tips about data analysis and statistics. Naturally, I attended many presentations that featured using SAS/IML software to implement advanced analytical algorithms. Several speakers showed impressive mastery of SAS/IML programming
You can visualize missing data. It sounds like an oxymoron, but it is true. How can you draw graphs of something that is missing? In a previous article, I showed how you can use PROC MI in SAS/STAT software to create a table that shows patterns of missing data in
Missing data can be informative. Sometimes missing values in one variable are related to missing values in another variable. Other times missing values in one variable are independent of missing values in other variables. As part of the exploratory phase of data analysis, you should investigate whether there are patterns
I saw an interesting mathematical result in Wired magazine. The original article was about mathematical research into prime numbers, but the article included the following tantalizing fact: If Alice tosses a [fair] coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two
The SG procedures in SAS use aesthetically pleasing default colors, shapes, and styles, but sometimes it is necessary to override the default attributes. The MARKERATTRS= option enables you to override the default colors, symbols, and sizes of markers in scatter plots and other graphs. Similarly, the LINEATTRS= option enables you
Last week I showed how to generate random points uniformly inside a 2-D circular region. That article showed that the distance of a point to the circle's center cannot be distributed uniformly. Instead, you should use the square root of a uniform variate to generate 2-D distances to the origin.
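The square-root rule can be sketched in a few lines. The original posts use SAS/IML; the Python below is only an illustration of the same idea, and the function names `random_point_in_disk` and `sample_disk` are hypothetical names chosen for this sketch. It draws the angle uniformly on [0, 2π) and sets the distance to the radius times the square root of a uniform variate, which should give points that are uniform over the area of the disk.

```python
import math
import random


def random_point_in_disk(radius=1.0, rng=random):
    """Return one (x, y) point intended to be uniform inside a disk.

    Hypothetical helper for illustration; not the code from the original post.
    """
    theta = rng.uniform(0.0, 2.0 * math.pi)  # angle is uniform
    r = radius * math.sqrt(rng.random())     # distance uses sqrt of a uniform variate
    return r * math.cos(theta), r * math.sin(theta)


def sample_disk(n, radius=1.0, seed=None):
    """Return a list of n points in the disk, using a seeded generator."""
    rng = random.Random(seed)
    return [random_point_in_disk(radius, rng) for _ in range(n)]
```

One way to check the result: a disk of half the radius has one quarter of the area, so roughly 25% of the simulated points should fall within half the radius of the center. If the distance were drawn uniformly instead, about 50% would, which is the clustering near the center that the square root corrects.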
In SAS procedures, the WHERE clause is a useful way to filter observations so that the procedure receives only a subset of the data to analyze. The IML procedure supports the WHERE clause in two separate statements. On the USE statement, the WHERE clause acts as a global filter. The
It is easy to generate random points that are uniformly distributed inside a rectangle. You simply generate independent random uniform values for each coordinate. However, nonrectangular regions are more complicated. An instructive example is to simulate points uniformly inside the ball with a given radius. The two-dimensional case is to