SAS provides many built-in routines for data analysis. A previous article discusses polychoric correlation, which is a measure of association between two ordinal variables. In SAS, you can use PROC FREQ or PROC CORR to estimate the polychoric correlation, its standard error, and confidence intervals. Although SAS provides a built-in
Tag: Data Analysis
Correlation is a statistic that measures the association between two variables. When two variables are positively correlated, low values of one variable tend to be associated with low values of the other variable. Medium values and high values are similarly associated. For negative correlation, the association is flipped: low values
A previous article shows ways to perform efficient BY-group processing in the SAS IML language. BY-group processing is a SAS-ism for what other languages call group processing or subgroup processing. The main idea is that the data set contains several discrete variables such as sex, race, education level, and so
One thing I have learned about rank-based statistics over the years is "Be careful of tied values!" On multiple occasions, I have been asked, "Why doesn't the SAS result for [NAME] statistic agree with my hand calculation?" The answer is sometimes because of the way that tied values are handled.
Many SAS procedures support a BY statement that enables you to perform an analysis for each unique value of a BY-group variable. The SAS IML language does not support a BY statement, but you can program a loop that iterates over all BY groups. You can emulate BY-group processing by
There are many ways to model a set of raw data by using a continuous probability distribution. It can be challenging, however, to choose the distribution that best models the data. Are the data normal? Lognormal? Is there a theoretical reason to prefer one distribution over another? The SAS has
Does anyone write paper checks anymore? According to researchers at the Federal Reserve Bank of Atlanta (Greene, et al., 2020), the use of paper checks has declined 63% among US consumers since the year 2000. The researchers surveyed more than 3,000 consumers in 2017-2018 and discovered that only 7% of
In a previous article, I discussed the Wilcoxon signed rank test, which is a nonparametric test for the location of the median. The Wikipedia article about the signed rank test mentions a variation of the test due to Pratt (1959). Whereas the standard Wilcoxon test excludes values that equal μ0
Wilcoxon's signed rank test is a popular nonparametric alternative to a paired t test. In a paired t test, you analyze measurements for subjects before and after some treatment or intervention. You analyze the difference in the measurements for each subject, and test whether the mean difference is significantly different
A previous article discusses standardized coefficients in linear regression models and shows how to compute standardized regression coefficients in SAS by using the STB option on the MODEL statement in PROC REG. It also discusses how to interpret a standardized regression coefficient. Recently, a SAS user wanted to know how