The DO Loop
Statistical programming in SAS with an emphasis on SAS/IML programs
![The case of the missing blanks: Why SAS output might not show multiple blanks in strings](https://blogs.sas.com/content/iml/files/2021/06/blanks2a.png)
A SAS programmer noticed that his SAS output was not displaying multiple blanks in his strings. He had some strings with leading blanks, others with trailing blanks, and others with multiple blanks in the middle. Yet, every time he used SAS to print the strings to the HTML destination, something
![The geometry of the Iman-Conover transformation](https://blogs.sas.com/content/iml/files/2021/06/ImanConover5-678x336.png)
A previous article showed how to simulate multivariate correlated data by using the Iman-Conover transformation (Iman and Conover, 1982). The transformation preserves the marginal distributions of the original data but permutes the values (columnwise) to induce a new correlation among the variables. When I first read about the Iman-Conover transformation,
![Simulate correlated variables by using the Iman-Conover transformation](https://blogs.sas.com/content/iml/files/2021/06/ImanConover4-640x336.png)
Simulating univariate data is relatively easy. Simulating multivariate data is much harder. The main difficulty is to generate variables that have given univariate distributions but also are correlated with each other according to a specified correlation matrix. However, Iman and Conover (1982, "A distribution-free approach to inducing rank correlation among
![Rank-based scores and tied values](https://blogs.sas.com/content/iml/files/2021/06/NormalScores3-480x336.png)
Many nonparametric statistical methods use the ranks of observations to compute distribution-free statistics. In SAS, two procedures that use ranks are PROC NPAR1WAY and PROC CORR. Whereas the SPEARMAN option in PROC CORR (which computes rank correlation) uses only the "raw" tied ranks, PROC NPAR1WAY uses transformations of the ranks,
![Permutation tests and independent sorting of data](https://blogs.sas.com/content/iml/files/2021/06/permtest4-600x336.png)
For many univariate statistics (mean, median, standard deviation, etc.), the order of the data is unimportant. If you sort univariate data, the mean and standard deviation do not change. However, you cannot sort an individual variable (independently) if you want to preserve its relationship with other variables. This statement is
![The Hampel identifier: Robust outlier detection in a time series](https://blogs.sas.com/content/iml/files/2021/06/HampelID2-600x336.png)
It is well known that classical estimates of location and scale (for example, the mean and standard deviation) are influenced by outliers. In the 1960s, '70s, and '80s, researchers such as Tukey, Huber, Hampel, and Rousseeuw advocated analyzing data by using robust statistical estimates such as the median and the