John Tukey was an influential statistician who proposed many statistical concepts. In the 1960s and 70s, he was fundamental in the discovery and exposition of robust statistical methods, and he was an ardent proponent of exploratory data analysis (EDA). In his 1977 book, Exploratory Data Analysis, he discussed a small
On Twitter, I saw a tweet from @DataSciFact that read, "The sum of (x_i - x)^2 over a set of data points x_i is minimized when x is the sample mean." I (@RickWicklin) immediately tweeted out a reply: "And the sum of |x_i - x| is minimized by the sample
A SAS programmer asked for help on a discussion forum: "My SAS session will not display any tables or graphs! I try to use PROC PRINT and other procedures, but no output is displayed! What can I do?" The most common reasons why you might not see any output when
When I was writing Simulating Data with SAS (Wicklin, 2013), I read a lot of introductory textbooks about Monte Carlo simulation. One of my favorites is Sheldon Ross's book Simulation. (I read the 4th Edition (2006); the 5th Edition was published in 2013.) I love that the book brings together
I've previously shown how to use Monte Carlo simulation to estimate probabilities and areas. I illustrated the Monte Carlo method by estimating π ≈ 3.14159... by generating points uniformly at random in a unit square and computing the proportion of those points that were inside the unit circle. The previous
It isn't easy to draw the graph of a function when you don't know what the graph looks like. To draw the graph by using a computer, you need to know the domain of the function for the graph: the minimum value (xMin) and the maximum value (xMax) for plotting
A colleague was struggling to compute a right-tail probability for a distribution. Recall that the cumulative distribution function (CDF) is defined as a left-tail probability. For a continuous random variable, X, with density function f, the CDF at the value x is F(x) = Pr(X ≤ x) = ∫
A SAS programmer wanted to create a panel that contained two of the graphs side-by-side. The graphs were created by using calls to two different SAS procedures. This article shows how to select the graphs and arrange them side-by-side by using the ODS LAYOUT GRIDDED statement. The end of the
I previously wrote about partial leverage plots for regression diagnostics and why they are useful. You can generate a partial leverage plot in SAS by using the PLOTS=PARTIALPLOT option in PROC REG. One useful property of partial leverage plots is the ability to graphically represent the null hypothesis that a
Many people know that you can use "WHERE processing" in SAS to filter observations. A typical use is to process only observations that match some criterion. For example, the following WHERE statement processes only observations for male patients who have high blood pressure: WHERE Sex='Male' & Systolic > 140; In