This article introduces the iml action, which is available in SAS Viya 3.5. The iml action supports most of the same syntax and functionality as the SAS/IML matrix language, which is implemented in PROC IML. With minimal changes, most programs that run in PROC IML also run in the iml action. In addition, the iml action supports new programming features for parallel programming.
Most actions in SAS Viya perform a specific task, but the iml action is different. The iml action provides a set of general programming tools that you can use to implement a custom parallel algorithm. The programmer can control many aspects of the computation, including how the computation is distributed among nodes and threads on a cluster of machines (or threads on a single machine).
Future articles will address the parallel programming capabilities of the iml action. This article provides an overview of the iml action. What is it? How do you get access to it? How is it similar to and different from PROC IML?
What is the iml action?
Recall that the SAS/IML language is a matrix-vector programming language that supports a rich library of functions in statistics, data analysis, matrix computations, numerical analysis, simulation, and optimization. In SAS 9 ("traditional SAS"), you can access the SAS/IML language by licensing the SAS/IML product and calling the IML procedure. PROC IML is also available in the SAS University Edition.
In SAS Viya 3.5. you get access to the SAS/IML language by licensing the SAS IML product. (Notice that there is no “slash” in the product name.) The SAS IML product gives you access to the iml action and to the IML procedure. Thus, in Viya, you can run all existing PROC IML programs, and you can also write new programs that run in the iml action and use SAS Cloud Analytic Services (CAS).
The iml action belongs to the iml action set. In addition to supporting most of the statements and functions in the SAS/IML language, the iml action supports new functionality that enables you to take advantage of the distributed computational resources in SAS Viya. In particular, you can use the iml action to implement custom parallel algorithms that use multiple nodes and threads on a cluster of machines. Even on one machine, you can run custom parallel programs on a multicore processor.
How is the iml action similar to PROC IML?
The iml action and the IML procedure share a common syntax. The mathematical and statistical function library is essentially the same in the action and in the procedure. Both environments support arithmetic and linear algebraic operations on matrices, operations to subset and query matrices, and programming features such as writing loops and using IF-THEN/ELSE logic.
Of the 300 functions and statements in the SAS/IML run-time library, only a handful of statements are not supported in the iml action. Most differences are related to the difference between the SAS 9 and SAS Viya environments. PROC IML interacts with traditional SAS constructs (such as data sets and catalogs) and supports calling SAS procedures and interacting with files on your local computer. The iml action interacts with analogous constructs in the Viya environment. It can read and write CAS tables, write analytic stores (astores), and can call other Viya actions.
Why use the iml action?
The iml action runs on a CAS server. Why might you choose to use the iml action instead of PROC IML? Or convert an existing PROC IML program into the iml action? There are two main reasons:
- You want to use the SAS/IML language as part of a sequence of actions that analyze data that are in CAS tables. By using the iml action, you can read and write CAS tables directly. If you use PROC IML, you need to pull the data from CAS into a SAS data set, run the analysis in PROC IML, and then push the results to a CAS table.
- You want to take advantage of the capabilities of the CAS server to perform parallel processing. You can use the iml action to create custom parallel computations.
How is the iml action different from PROC IML?
As mentioned previously, the iml action does not support every function and statement that PROC IML supports. The unsupported functions and statements are primarily in four areas:
- Base SAS functions that are not supported in the CAS DATA step. For example, the old random number generator functions RANUNI and RANNOR are not supported in CAS because they cannot generate independent streams of random number in parallel.
- Statements that read or write SAS data sets or text files.
- Functions that create graphics. CAS is for computations. You can download the results of the computation to whatever language you are using to call the CAS actions. For example, I download the results to SAS and create SAS graphs, but you could also use Python or R.
- SAS/IML functions that were deprecated in earlier releases of SAS.
See the documentation for the iml action for a complete list of the PROC IML functions and statements that are not supported in the iml action.
Will my program run faster in the iml action than PROC IML?
If you take an existing PROC IML program and run it in the iml action, it will take about the same amount of time to run. Sure, it might run a little faster if the machines in your CAS cluster are newer and more powerful than the SAS server or your PC, but that speedup is due to hardware. An existing program does not automatically run faster in the iml action because it runs serially until it encounters a programming statement that can be executed in parallel. There are two main sets of statements that run in parallel:
- Reading and writing data from CAS tables.
- Functions that distribute computations. You, the programmer, need to call these functions to write parallel programs.
So, yes, you can get certain programs to run faster in the iml action, but it doesn't happen automatically. You have to add new input/output statements or call functions that execute tasks in parallel.
Should you convert from PROC IML to the iml action?
What does this mean for the SAS/IML programmer whose company is changing from SAS 9 to Viya? Do you need to convert hundreds of existing PROC IML programs to run in the iml action? No, absolutely not. As mentioned previously, when you license SAS IML on Viya, you get both PROC IML and the iml action. The existing programs that you wrote in SAS 9 will continue to run in PROC IML in Viya.
The Viya platform provides an opportunity to use the iml action but does not require it. Under what circumstances might you want to convert a program from PROC IML to the iml action? Or write a new program in the iml action instead of using PROC IML? In my opinion, it comes down to two issues: workflow and performance.
- Workflow: If your company is using CAS actions and CAS-enabled procedures, and if your data are stored in CAS tables, then it makes sense to use the iml action instead of the IML procedure. The SAS/IML language is often used to pre- or post-process data for other procedures or actions. The iml action can read and write CAS tables that are created by other CAS actions or that will be consumed by other CAS actions.
- Performance: Suppose that you have a computation that takes a long time to process in PROC IML, but the computation is "embarrassingly parallel." An embarrassingly parallel problem, is one that consists of many identical independent subtasks. The iml action supports several functions for distributing a computation to multiple threads. Examples of embarrassingly parallel computations in statistics and machine learning include Monte Carlo simulation, resampling methods such as the bootstrap, ensemble models, and many "brute force" computations.
I continue to introduce the iml action in subsequent articles. Related articles include:
- A Getting Started example that shows how to call the iml action and discusses how the action is similar to and different from PROC IML.
- How to use the iml action to read CAS tables into a matrix.
- How to write CAS tables from a matrix.
- The MAPREDUCE function, which enables you to distribute a computation across threads and nodes. A statistical application is a parallel implementation of a Monte Carlo simulation.
- The PARTASKS function, which enables you to distribute multiple independent computations across threads and nodes. A statistical application is estimating a power curve for multiple parameters in parallel.
- The SCORE function, which enables you to evaluate a function in parallel on every row of a CAS table.
For more information and examples, see Wicklin and Banadaki (2020), "Write Custom Parallel Programs by Using the iml Action," which is the basis for these blog posts. Another source is the SAS IML Programming Guide, which includes documentation and examples for the iml action.