Monotonic transformations occur frequently in math and statistics. Analysts use them to transform variable values, with Tukey's ladder of transformations and the Box-Cox transformations being familiar examples. Monotonic functions figure prominently in probability theory because the cumulative distribution function is monotonically nondecreasing. For a continuous distribution that is
Tag: Data Analysis
A SAS customer asked how to use the Box-Cox transformation to normalize a single variable. Recall that a normalizing transformation is a function that attempts to convert a set of data to be as nearly normal as possible. For positive-valued data, introductory statistics courses often mention the log transformation or
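As a rough illustration of the transformation family named in this excerpt, here is a minimal Python sketch of the standard one-parameter Box-Cox formula for a single positive value. It is not the SAS approach from the article; the function name `box_cox` and the parameter name `lam` are chosen here for illustration only.

```python
import math


def box_cox(x, lam):
    """Standard one-parameter Box-Cox transform of a positive value x.

    Hypothetical helper for illustration:
    (x**lam - 1) / lam when lam != 0, and log(x) when lam == 0.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam
```

With `lam = 1` the transform only shifts the data, and as `lam` approaches 0 the result approaches the log transformation, which is why the log is treated as the `lam = 0` member of the family.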
In the 1960s and '70s, before nonparametric regression methods became widely available, it was common to apply a nonlinear transformation to the dependent variable before fitting a linear regression model. This is still done today, with the most common transformation being a logarithmic transformation of the dependent variable, which fits
John Tukey was an influential statistician who proposed many statistical concepts. In the 1960s and '70s, he was instrumental in the discovery and exposition of robust statistical methods, and he was an ardent proponent of exploratory data analysis (EDA). In his 1977 book, Exploratory Data Analysis, he discussed a small
On Twitter, I saw a tweet from @DataSciFact that read, "The sum of (x_i - x)^2 over a set of data points x_i is minimized when x is the sample mean." I (@RickWicklin) immediately tweeted out a reply: "And the sum of |x_i - x| is minimized by the sample
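The first claim quoted in this excerpt, that the sum of squared deviations is smallest at the sample mean, can be checked numerically. The sketch below is only an illustration: the data values and the search grid are made up, and the helper name `sum_sq_dev` is hypothetical.

```python
from statistics import mean


def sum_sq_dev(data, x):
    """Sum of squared deviations of the data from a candidate center x."""
    return sum((xi - x) ** 2 for xi in data)


# Made-up sample, for illustration only
data = [2.0, 3.0, 5.0, 7.0, 11.0]
xbar = mean(data)

# Brute-force search over candidate centers 0.00, 0.01, ..., 15.00
grid = [i / 100 for i in range(0, 1501)]
best = min(grid, key=lambda x: sum_sq_dev(data, x))

print(f"sample mean = {xbar}, grid minimizer = {best}")
```

The grid search is deliberately naive. It relies on nothing but the definition of the objective, so agreement between `best` and `xbar` is a direct check of the claim for this sample rather than a proof.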
In categorical data analysis, it is common to analyze tables of counts. For example, a researcher might gather data for 18 boys and 12 girls who apply for a summer enrichment program. The researcher might be interested in whether the proportion of boys who are admitted is different from the
In The Essential Guide to Bootstrapping in SAS, I note that there are many SAS procedures that support bootstrap estimates without requiring the analyst to write a program. I have previously written about using bootstrap options in the TTEST procedure. This article discusses the NLIN procedure, which can fit nonlinear
When you have many correlated variables, principal component analysis (PCA) is a classical technique to reduce the dimensionality of the problem. The PCA finds a lower-dimensional linear subspace that explains most of the variability in the data. There are many statistical tools that help you decide how many principal
Recently, I showed how to use a heat map to visualize measurements over time for a set of patients in a longitudinal study. The visualization is sometimes called a lasagna plot because it presents an alternative to the usual spaghetti plot. A reader asked whether a similar visualization can be
What is McNemar's test? How do you run the McNemar test in SAS? Why might other statistical software report a value for McNemar's test that is different from the SAS value? SAS supports an exact version of the McNemar test, but when should you use it? This article answers these