The DO Loop
Statistical programming in SAS with an emphasis on SAS/IML programs
Finding nearest neighbors is an important step in many statistical computations such as local regression, clustering, and the analysis of spatial point patterns. Several SAS procedures find nearest neighbors as part of an analysis, including PROC LOESS, PROC CLUSTER, PROC MODECLUS, and PROC SPP. This article shows how to find
SAS formats are flexible, dynamic, and have many uses. For example, you can use formats to count missing values and to change the order of a categorical variable in a table or plot. Did you know that you can also use SAS formats to recode a variable or to bin
A kernel density estimate (KDE) is a nonparametric estimate for the density of a data sample. A KDE can help an analyst determine how to model the data: Does the KDE look like a normal curve? Like a mixture of normals? Is there evidence of outliers in the data? In
Last week I read an interesting paper by Bob Rodriguez: "Statistical Model Building for Large, Complex Data: Five New Directions in SAS/STAT Software." In it, Rodriguez summarizes five modern techniques for building predictive models and highlights recent SAS/STAT procedures that implement those techniques. The paper discusses the following high-performance (HP)
"I'm addicted to you. You're a hard habit to break. Such a hard habit to break." (Chicago, "Hard Habit To Break")
Habits are hard to break. For more than 20 years I've been putting semicolons at the end of programming statements in SAS, C/C++, and Java/JavaScript. But lately I've been
Two of my favorite string-manipulation functions in the SAS DATA step are the COUNTW function and the SCAN function. The COUNTW function counts the number of words in a long string of text. Here "word" means a substring that is delimited by special characters, such as a space character, a