The DO Loop
Statistical programming in SAS with an emphasis on SAS/IML programs
This article shows how to classify a set of high-dimensional data into orthants. An orthant is the d-dimensional generalization of a quadrant. For 2-D Euclidean space, there are four quadrants, often labeled by Roman numerals I-IV. The quadrants are open sets that are defined by the signs of each coordinate
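To make the sign-based definition concrete, here is a minimal sketch in Python (not SAS/IML, and not the code from the article). It assumes a hypothetical labeling in which each orthant is encoded as an integer whose bits record which coordinates are positive. Points that have a zero coordinate lie on a boundary, so they belong to no open orthant and are labeled `None`.

```python
from collections import Counter

def orthant_index(point):
    """Encode the sign pattern of a point as an integer in [0, 2**d).

    Bit value 1 means the coordinate is positive, 0 means negative.
    Returns None if any coordinate is exactly zero, because the
    orthants are open sets and exclude the coordinate hyperplanes.
    This labeling is an illustrative assumption, not a standard one.
    """
    index = 0
    for x in point:
        if x == 0:
            return None
        index = (index << 1) | (1 if x > 0 else 0)
    return index

def count_by_orthant(points):
    """Tally how many points fall into each orthant label."""
    return Counter(orthant_index(p) for p in points)
```

In 2-D, this labeling gives 3 for the quadrant where both coordinates are positive and 0 for the quadrant where both are negative. The same function works unchanged for any dimension d.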

This article shows how to use SAS to implement the ISO algorithm to generate (and validate) a Universal Loan Identifier (ULI). A ULI is a long string of numbers and letters that serves as a unique identifier for certain financial transactions. The ISO standard ensures that banks

On social media, a SAS user reported that SAS could not compute the modulo of an extremely large integer. In SAS, the modulo operation is usually performed by using the MOD function, which computes the remainder of dividing an integer, N, by another integer, d. (In symbols, the remainder is
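One general way to find the remainder of an integer that is too large to store exactly as a double-precision number is to process its decimal digits one at a time. The sketch below is in Python and is not necessarily the approach taken in the article. It assumes the large integer is supplied as a string of digits. Every intermediate value stays below 10*d, so the same idea carries over to languages that store numbers as doubles.

```python
def mod_of_digit_string(digits, d):
    """Remainder of a nonnegative integer, given as a string of decimal digits, divided by d.

    Uses Horner's rule: after each digit, only the running remainder is kept,
    so no intermediate value exceeds 10*d - 1.
    """
    if d <= 0:
        raise ValueError("divisor must be a positive integer")
    r = 0
    for ch in digits:
        r = (r * 10 + int(ch)) % d
    return r
```

Python integers have arbitrary precision, so the result can be compared directly with the built-in `%` operator as a check.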

A previous article discusses the Gini-Simpson diversity index and how to compute it in SAS. Suppose you have a sample that contains R classes. (Classes are also called groups or categories.) Intuitively, the sample exhibits "high diversity" if the class sizes are approximately equal. The sample shows "low diversity" if
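For reference, the Gini-Simpson index for R classes with proportions p_1, ..., p_R is 1 - sum(p_i^2). The following minimal Python sketch (not the SAS code from the article) computes it from class counts. The function name is hypothetical.

```python
def gini_simpson(counts):
    """Gini-Simpson diversity index, 1 - sum(p_i**2), from a list of class counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return 1.0 - sum((c / total) ** 2 for c in counts)
```

For four equal classes the index is 1 - 4(1/4)^2 = 0.75, which is the maximum for R = 4. If every observation is in a single class, the index is 0.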

An article by David Corliss in Amstat News (Corliss, D., 2025, "Quantifying Diversity: Calculating the Gini-Simpson Diversity Index") discusses a new statistical measure of diversity that was adopted by the US Census Bureau. The statistic is called the Gini-Simpson diversity index. The Census Bureau has published an article about how

When you use the bootstrap method in statistics, the most common resampling method is called case resampling. For data that has N observations, each bootstrap sample is created by sampling with replacement from the N observations (or "cases") in the data. However, if the data set includes categorical variables, it
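Case resampling as described here can be sketched in a few lines. The example is in Python rather than SAS and assumes the data are held as a list of observations; the function name and the seeded generator are illustrative choices.

```python
import random

def bootstrap_sample(cases, rng):
    """Draw one bootstrap sample: N draws with replacement from the N cases."""
    n = len(cases)
    return [cases[rng.randrange(n)] for _ in range(n)]

# Usage: each observation is a (category, value) pair.
data = [("A", 1.2), ("A", 0.7), ("B", 3.1), ("C", 2.4)]
rng = random.Random(12345)          # seeded for reproducibility
resample = bootstrap_sample(data, rng)
```

Each resample has the same number of observations as the original data, but some cases can appear more than once and others not at all.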