Extending SAS: How to define new functions in PROC FCMP and SAS/IML software


SAS software provides many run-time functions that you can call from your SAS/IML or DATA step programs. The SAS/IML language has several hundred built-in statistical functions, and Base SAS software contains hundreds more. However, it is common for statistical programmers to extend the run-time library to include special user-defined functions.

In a previous blog post, I discussed two ways to apply a log transformation when your data might contain missing values and negative values. I'll use the log transformation example to show how to define and call user-defined functions in SAS/IML software and in Base SAS software.

A "safe" log transformation in the SAS/IML language

In the SAS/IML language, it is easy to write user-defined functions (called modules) that extend the functionality of the language. If you need a function that safely takes the natural logarithm and handles missing and negative values, you can easily use the ideas from my previous blog post to create the following SAS/IML function:

proc iml;
/* if Y>0, return the natural log of Y
   otherwise return a missing value  */
start SafeLog(Y);
   logY = j(nrow(Y),ncol(Y),.); /* allocate missing */
   idx = loc(Y > 0);            /* find indices where Y > 0 */
   if ncol(idx) > 0 then logY[idx] = log(Y[idx]);
   return(logY);
finish;
 
Y = {-3,1,2,.,5,10,100}; 
LogY = SafeLog(Y);
print Y LogY;

The program is explained in my previous post, but essentially it allocates a vector of missing values and then computes the logarithm for the positive data values. The START and FINISH statements are used to define the SafeLog function, which you can then call on a vector or matrix of values.

In this example, the function is defined only for the current PROC IML session. However, you can store the function and load it later if you want to reuse it.
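As a minimal sketch of storing and reloading the module, the STORE and LOAD statements can be used as follows. The libref MyLib, the directory path, and the catalog name MyModules are placeholders chosen for illustration; they do not appear in the original post.

```sas
/* ASSUMPTION: MyLib, the path, and MyModules are placeholders */
libname MyLib "C:\MySASFiles";

proc iml;
start SafeLog(Y);
   logY = j(nrow(Y),ncol(Y),.); /* allocate missing */
   idx = loc(Y > 0);            /* find indices where Y > 0 */
   if ncol(idx) > 0 then logY[idx] = log(Y[idx]);
   return(logY);
finish;

reset storage=MyLib.MyModules;  /* use a permanent storage catalog */
store module=SafeLog;           /* save the module definition */
quit;

/* later, possibly in a new SAS session (reassign the libref first) */
proc iml;
reset storage=MyLib.MyModules;
load module=SafeLog;            /* SafeLog is available again */
LogY = SafeLog({-3,1,2,.,5});
print LogY;
quit;
```

If you omit the RESET STORAGE= statement, the module goes to the default storage catalog. That is enough for reuse within one session, but it might not persist after you exit SAS.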

Defining a "safe" log transformation by using PROC FCMP

You can also extend the Base SAS library of run-time functions. The FCMP procedure enables you to define your own functions that can be called from the DATA step and from other SAS procedures. (The documentation for the MCMC procedure has an example of calling a user-defined function from a SAS/STAT procedure.) If you have never used the FCMP procedure before, I recommend Peter Eberhardt's 2009 paper on defining functions in PROC FCMP. For a more comprehensive treatment, see Jason Secosky's 2007 paper.

Technically, you don't need to do anything special in the DATA step if you want a SAS missing value to represent the logarithm of a negative number: the DATA step does this automatically. However, the DATA step also generates some scary-looking notes in the SAS LOG:

NOTE: Invalid argument to function LOG at line 72 column 5.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+
74         -3 1 2 . 5 10 100
x=-3 y=. _ERROR_=1 _N_=1
NOTE: Missing values were generated as a result of performing an operation on missing values.
NOTE: Mathematical operations could not be performed at the following places. The results of
      the operations have been set to missing values.
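For reference, a DATA step along the following lines produces notes like these. This is a reconstruction based on the values that appear in the log, not the exact program that generated it, so the line and column numbers in your log will differ.

```sas
/* Calling LOG directly on nonpositive values: y is set to missing,
   but the log reports an invalid argument for each such value */
data A;
input x @@;
y = log(x);
datalines;
-3 1 2 . 5 10 100
;
```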

I prefer my programs to run with a clean, healthy-looking SAS LOG, so I will use PROC FCMP to define a SafeLog function that has the same behavior (and name!) as my SAS/IML function:

proc fcmp outlib=work.funcs.MathFuncs;
function SafeLog(y);
   if y>0 then return( log(y) );
   return( . );
endsub;
quit;

The function returns a missing value when its argument is nonpositive or missing. The definition is stored in a package named MathFuncs inside a data set named WORK.FUNCS. Because the data set is in the WORK library, it will vanish when you exit SAS. However, you can create the definition in a permanent location if you want to call the function in a later SAS session.

In order to call the function from the DATA step, use the CMPLIB= system option to tell SAS which data set contains the function, as shown in the following example:

options cmplib=work.funcs;  /* define location of SafeLog function */
data A;
input x @@;
y = SafeLog(x); /* test the function */
datalines;
-3 1 2 . 5 10 100 
;

The result is not shown, but it is identical to the output from the SAS/IML program.
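To keep the function across SAS sessions, one possible approach is to write the definition to a permanent library instead of WORK. The following sketch assumes a libref named MyLib that points to a directory you can write to. Both the libref and the path are placeholders that are not part of the original example.

```sas
/* ASSUMPTION: MyLib and the directory path are placeholders */
libname MyLib "C:\MySASFiles";

proc fcmp outlib=MyLib.funcs.MathFuncs;
function SafeLog(y);
   if y>0 then return( log(y) );
   return( . );
endsub;
quit;

/* In a later session, reassign the libref and point CMPLIB= at the data set */
libname MyLib "C:\MySASFiles";
options cmplib=MyLib.funcs;

data A;
input x @@;
y = SafeLog(x);
datalines;
-3 1 2 . 5 10 100
;

proc print data=A; run;   /* display x and y */
```

The PROC PRINT step is included only so that you can inspect the values of y.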

You might not need the SafeLog function, but it is very useful to know how to write your own functions in SAS/IML software and in Base SAS software. SAS/IML modules and PROC FCMP functions make it easy to extend the built-in functionality of SAS software.


About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

12 Comments

  1. charlie huang on

    The other way --
    proc fcmp outlib=work.funcs.MathFuncs;
    function SafeLog(y);
    return( ifn(y>0, log(abs(y)), .) );
    endsub;
    quit;

  2. Hi, Rick. I have a quick question and I have been struggling for several days. I hope you could help me. Many thanks.

    I hope to design a vector as follows: suppose there are two given vector a=[2 3 5], b=[1 2 3]
    Here b indicates the frequency vector. I hope to generate a vector c=[2 3 3 5 5 5]. I am trying to use the repeat function in IML, but there is always something wrong.

    Thanks

  3. As I understand it the method outlined above only allows SAS to be extended using SAS languages. Using this method allows users to write PROCs in SAS and have them automatically load as part of their environment so that they don't have to cut and paste their favorite PROCs ending up with multiple versions scattered across multiple files.

    Is there an easy way to extend SAS from other programming languages like c++ or C#? I am involved in a project where the data sources and destinations will be using a proprietary streaming messaging service similar to JMS or MQ. The messaging service will expose APIs for C# and c++. The use case is that the SAS analyst will be able to pull data in that originated remotely using a pub/sub or request/response messaging. I as the developer need to provide a SAS friendly interface that is callable from base SAS that the analyst can use to pull in the data and then transform it in their SAS program and then push the transformed data back out using another. We only need to support this capability on Windows.

    I imagined being able to write a Windows DLL or service in either c++ or C# and maybe using some of the methods above that DLL registers itself and is automatically loaded into the user's SAS environment. Have you heard of anyone getting something like this to work? If so do you have any links describing the process?

    I have been reading about the SAS Integrated Object Model IOM (activeX/COM) support in SAS v9 and the SAS MetaData Server support but these appear geared towards exposing SAS functionality towards Windows clients programs. From the little reading I have done so far this appears to be the opposite of what I am trying to achieve. Any pointers in the right direction would be greatly appreciated, thank you.

    • Chris Hemedinger on

      Frank, there are several ways that you can call 3rd party applications or modules from within SAS.

      SAS can call modules at the DLL level with CALL MODULE. You might be able to find good practical examples in SAS conference proceedings.

      Also, SAS can call Web services via PROC SOAP or PROC HTTP. If you can implement a service layer (assuming it makes sense for your application architecture), this might be a workable approach.

      SAS offers integration with JMS and MSMQ. But if your service is proprietary and doesn't adhere to these standards, this approach might not be appropriate.

      Finally, you could use a hybrid approach where you build a client app that "connects" SAS with the service you're consuming. For example, you could use SAS Integration Technologies APIs and the message-service APIs from a custom application, where your custom application acts as a go-between -- fetching data and prepping it for SAS processing, then taking SAS output and piping it back to the message-service. Depending on the volume of data, that might not be the most efficient approach. I have an example of a C# app that uses the SAS Integration Technologies APIs here.

  4. I am an experienced developer just starting to learn SAS. I find it odd that there needs to be a blog post to describe how to create a function. That seems like absolute day 1 material that every SAS programmer should know. But perhaps I just don't understand the SAS community/environment yet.

    • Rick Wicklin

      Welcome to SAS! Thanks for the comment. I sometimes post about elementary topics because everyone is a beginner at some point. With regard to this post, the FCMP procedure was not introduced until SAS 9.2. Some experienced programmers have been programming in SAS for 25 years or more, but are slow to adopt newer techniques.

  5. Pingback: Trap and cap: Avoid division-by-zero and domain errors when evaluating functions - The DO Loop

  6. Pingback: The Babylonian method for finding square roots by hand - The DO Loop

  7. Pingback: The IFN function versus the IF-THEN/ELSE statement in SAS - The DO Loop

  8. Pingback: The arithmetic-geometric mean - The DO Loop

  9. Pingback: Implement the Gumbel distribution in SAS - The DO Loop
