Strengthen your programming skills with tips and techniques from the experts
SAS has a programming language, but is that all it is? Nope, but it still ranks high as one of the most marketable programming skills.
Fitting a Gradient Boosting Model - Learn how to fit a gradient boosting model and use your model to score new data. In Part 6, Part 7, and Part 9 of this series, we fit a logistic regression, decision tree, and random forest model to the Home Equity data we
Modern software for statistical graphics automatically handles many details and graph defaults, such as the range of the axes and the placement of tick marks. In the days of yore, these details required tedious manual calculations. Think about what is required to place ticks on a scatter plot. On the
In SAS, DATA step programmers use the IN operator to determine whether a value is contained in a set of target values. Did you know that there is a similar functionality in the SAS IML language? The ELEMENT function in the SAS IML language is similar to the IN operator
A previous article shows how to implement recursive formulas in SAS. The article points out that you can often avoid recursion by using an iterative algorithm, which is more efficient. An example is the Fibonacci sequence, which is usually defined recursively as F(n) = F(n-1) + F(n-2) for n
Many well-known distributions become more and more "normal looking" for large values of a parameter. Famously, the binomial distribution, Binom(p, N), can be approximated by a normal distribution when N (the sample size) is large. Similarly, the Poisson(λ) distribution is well approximated by the normal distribution when λ is large.
There are two programming tools that I rarely use: the SAS macro language and recursion. The SAS macro language is a tool that enables you to generate SAS statements. I rarely use the SAS macro language because the SAS IML language supports all the functionality required to write complex programs,
The SAS IML language has a quirk with regard to functions that take no arguments. As discussed in the documentation, "modules with arguments are given a local symbol table." This is the usual behavior that programmers expect. However, the documentation goes on to state that "a module that has no
In SAS, the easiest way to draw a random sample from data is to use PROC SURVEYSELECT or the SAMPLE function in SAS IML software. I have previously written about how to implement four common sampling schemes by using PROC SURVEYSELECT and the SAMPLE function. The DATA step in SAS is
A previous article shows that you can run a simple (one-variable) isotonic regression by using a quadratic programming (QP) formulation. While I was reading a book about computational geometry, I learned that there is a connection between isotonic regression and the convex hull of a certain set of points. Whaaaaat?
Since the pandemic began in 2020, the SAS IML developers have added about 50 new functions and enhancements to the SAS IML language in SAS Viya. Among these functions are new modern methods for optimization that have a simplified syntax as compared to the older 'NLP' functions that are available
What's the difference between LENGTH and FORMAT in a SAS data set? This article shares the answer, with examples.
Just like the SAS DATA step, the SAS IML language supports both functions and subroutines. A function returns a value, so the calling syntax is familiar: y = func(x1, x2); /* the function returns one value, y */ In this syntax, the input arguments are x1 and x2. The
SAS' Kirby Thomas demystifies joins, one of the more complicated data-merging tasks for new coders.
Isotonic regression (also called monotonic regression) is a type of regression model that assumes that the response variable is a monotonic function of the explanatory variable(s). The model can be nondecreasing or nonincreasing. Certain physical and biological processes can be analyzed by using an isotonic regression model. For example, a
Emojis are showing up in our data. Here's what you need to know when working with emojis in your SAS code.
A previous article discusses the fact that there are often multiple ways in SAS to obtain the same result. This fact results in many vigorous discussions on online programming forums as people propose different (but equivalent) methods for solving someone's problem then argue why their preferred method is better than
SAS' Kirby Thomas introduces a helpful coding shortcut for SAS users of all experience levels.
While reviewing a book on numerical analysis, I was reminded of a classic interpolation problem. Suppose you have n pairs of points in the plane: (x1,y1), (x2,y2), ..., (xn,yn), where the first coordinates are distinct. Then you can construct a unique polynomial of degree (at most) n-1 that passes through
One of the most exciting features of SAS Viya Workbench is that the code editor includes a generative AI component called SAS Viya Copilot. This feature was announced and demonstrated at SAS Innovate 2024. With the Copilot, you can specify a text prompt that generates SAS code. For example, you
Learn how to fit a random forest and use your model to score new data. In Part 6 and Part 7 of this series, we fit a logistic regression and decision tree to the Home Equity data we saved in Part 4. In this post we will fit a Random
A SAS analyst ran a linear regression model and obtained an R-square statistic for the fit. However, he wanted a confidence interval, so he posted a question to a discussion forum asking how to obtain a confidence interval for the R-square parameter. Someone suggested a formula from a textbook (Cohen,
Sometimes labels for variables get "dropped" during data preparation and cleaning. One example is when data are transposed from "wide form" to "long form." For example, suppose a data set has three variables, X, Y, and Z, each with labels. If you transpose the data to long form, the new
A SAS programmer wanted to visualize density estimates for some univariate data. The data had several groups, so he wanted to create a panel of density estimates, which you can easily do by using PROC SGPANEL in SAS. However, the programmer's boss wanted to see filled density estimates, such as
A SAS programmer was trying to implement an algorithm in PROC IML in SAS based on some R code he had seen on the internet. The R code used the rank() and order() functions. This led the programmer to ask, "What is the difference between the rank and the order?
A SAS statistical programmer recently asked a theoretical question about statistics. "I've read that 'p-values are uniformly distributed under the null hypothesis,'" he began, "but what does that mean in practice? Is it important?" I think data simulation is a great way to discuss the conditions for which p-values are
At a recent conference in Las Vegas, a presenter simulated the sum of two dice and used it to simulate the game of craps. I write a lot of simulations, so I'd like to discuss two related topics: How to simulate the sum of two dice in SAS. This is
Years ago, I wrote an article that showed how to visualize patterns of missing data. During a recent data visualization talk, I discussed the program, which used a small number of SAS IML statements. An audience member asked whether it is possible to construct the same visualization by using only
In a recent article, I graphed the PDF of a few Beta distributions that had a variety of skewness and kurtosis values. I thought that I had chosen the parameter values to represent a wide variety of Beta shapes. However, I was surprised to see that the distributions were all