## Programming Tips

Strengthen your programming skills with tips and techniques from the experts

0
Two ways to specify how SAS displays missing values

In statistical tables in SAS, a dot (.) represents a numerical missing value. Although a dot is the default symbol in SAS, other languages use other symbols. The R language prints the symbol NA, which stands for "not available." The MATLAB language uses NaN ("Not a Number"). In Python, many

0
The location of ticks in statistical graphics

Modern software for statistical graphics automatically handles many details and graph defaults, such as the range of the axes and the placement of tick marks. In the days of yore, these details required tedious manual calculations. Think about what is required to place ticks on a scatter plot. On the

0
Is a value in a vector? Use the ELEMENT function

In SAS, DATA step programmers use the IN operator to determine whether a value is contained in a set of target values. Did you know that there is a similar functionality in the SAS IML language? The ELEMENT function in the SAS IML language is similar to the IN operator

0
Efficient recursion: Store values that will be be reused

A previous article shows how to implement recursive formulas in SAS. The article points out that you can often avoid recursion by using an iterative algorithm, which is more efficient. An example is the Fibonacci sequence, which is usually defined recursively as      F(n) = F(n-1) + F(n-2) for n

0
An exact formula for the probability distribution for the sum of n dice

Many well-known distributions become more and more "normal looking" for large values of a parameter. Famously, the binomial distribution, Binom(p, N), can be approximated by a normal distribution when N (the sample size) is large. Similarly, the Poisson(λ) distribution is well approximated by the normal distribution when λ is large.

0
How to define a SAS IML function that has no arguments

The SAS IML Language has a quirk with regards to functions that take no arguments. As discussed in the documentation, "modules with arguments are given a local symbol table." This is the usual behavior that programmers expect. However, the documentation goes on to state that "a module that has no

0
Implement five sampling methods in the SAS DATA step

In SAS, the easiest way to draw random sampling from data is to use PROC SURVEYSELECT or the SAMPLE function in SAS IML software. I have previously written about how to implement four common sampling schemes by using PROC SURVEYSELECT and the SAMPLE function. The DATA step in SAS is

0
A geometric solution to isotonic regression

A previous article shows that you can run a simple (one-variable) isotonic regression by using a quadratic programming (QP) formulation. While I was reading a book about computational geometry, I learned that there is a connection between isotonic regression and the convex hull of a certain set of points. Whaaaaat?

0
QPSOLVE: A new SAS IML function for quadratic optimization

Since the pandemic began in 2020, the SAS IML developers have added about 50 new functions and enhancements to the SAS IML language in SAS Viya. Among these functions are new modern methods for optimization that have a simplified syntax as compared to the older 'NLP' functions that are available

0
Isotonic regression: An application of quadratic optimization

Isotonic regression (also called monotonic regression) is a type of regression model that assumes that the response variable is a monotonic function of the explanatory variable(s). The model can be nondecreasing or nonincreasing. Certain physical and biological processes can be analyzed by using an isotonic regression model. For example, a

0
Display the largest values for each group

A previous article discusses the fact that there are often multiple ways in SAS to obtain the same result. This fact results in many vigorous discussions on online programming forums as people propose different (but equivalent) methods for solving someone's problem then argue why their preferred method is better than

0
Create an interpolating polynomial in SAS

While reviewing a book on numerical analysis, I was reminded of a classic interpolation problem. Suppose you have n pairs of points in the plane: (x1,y1), (x2,y2), ..., (xn,yn), where the first coordinates are distinct. Then you can construct a unique polynomial of degree (at most) n-1 that passes through

0
Teaching an AI assistant to read and write SAS IML vectors

One of the most exciting features of SAS Viya Workbench is that the code editor includes a generative AI component called SAS Viya Copilot. This feature was announced and demonstrated at SAS Innovate 2024. With the Copilot, you can specify a text prompt that generates SAS code. For example, you

0
The distribution of the R-square statistic

A SAS analyst ran a linear regression model and obtained an R-square statistic for the fit. However, he wanted a confidence interval, so he posted a question to a discussion forum asking how to obtain a confidence interval for the R-square parameter. Someone suggested a formula from a textbook (Cohen,

0
Find the label of a variable in SAS

Sometimes labels for variables get "dropped" during data preparation and cleaning. One example is when data are transposed from "wide form" to "long form." For example, suppose a data set has three variables, X, Y, and Z, each with labels. If you transpose the data to long form, the new

0
Create filled density plots in SAS

A SAS programmer wanted to visualize density estimate for some univariate data. The data had several groups, so he wanted to create a panel of density estimate, which you can easily do by using PROC SGPANEL in SAS. However, the programmer's boss wanted to see filled density estimates, such as

0
Rank, order, and sorting

A SAS programmer was trying to implement an algorithm in PROC IML in SAS based on some R code he had seen on the internet. The R code used the rank() and order() functions. This led the programmer to ask, "What is the different between the rank and the order?

0
The distribution of p-values under the null hypothesis

A SAS statistical programmer recently asked a theoretical question about statistics. "I've read that 'p-values are uniformly distributed under the null hypothesis,'" he began, "but what does that mean in practice? Is it important?" I think data simulation is a great way to discuss the conditions for which p-values are

0
Dice and the correctness of a simulation

At a recent conference in Las Vegas, a presenter simulated the sum of two dice and used it to simulate the game of craps. I write a lot of simulations, so I'd like to discuss two related topics: How to simulate the sum of two dice in SAS. This is

1 2 3 63