Calling R from SAS/IML software

For years I've been making presentations about SAS/IML software at conferences. Since 2008, I've always mentioned to SAS customers that they can call R from within SAS/IML software. (This feature was introduced in SAS/IML Studio 3.2 and was added to the IML procedure in SAS/IML 9.22.) I also included a chapter on calling R in my book, Statistical Programming with SAS/IML Software.

However, I've never blogged about it until today. Why? Frankly, I don't usually have a reason to call R from SAS/IML software. Both R and SAS/IML are high-level languages with a rich run-time library of functions. They both enable you to extend the language by writing and sharing user-defined functions. They both enable you to use matrix computations to compactly represent and solve statistical problems and to analyze data. I use SAS/IML software for all of my day-to-day computational needs.

Sometimes, though, I hear about a technique that someone has implemented in R and I want to try it too. If I'm going to use the technique many times, I'll consider implementing it myself in the SAS/IML language. But if I'm only going to use it once (or I'm not sure how often I'll use it), I'll save myself time and call R.

Earlier this week, I showed some plots of airline routes that a colleague created in SAS, which are based on similar plots (created in R) that appeared on the Flowing Data blog. In my blog post I said:

I don't have the time right now to implement a great circle algorithm in SAS/IML. ... I could also generate the great arcs by using the same R package that Flowing Data uses.

Basically, I was being lazy. However, after Flowing Data posted the R code to plot great arcs, I no longer had an excuse not to use great arcs. This is a situation in which calling R will save me time: I want to compute arcs of great circles, I don't know of a comparable function already written in SAS, and I'll rarely use this functionality in the future. So, I wrote a short SAS/IML program that calls an R function to compute the arcs.

Read on if you want to learn how to call R from SAS. If you just want to see the final result, here it is:

How to Call R from a SAS/IML Program

To call R, install R 2.11 or earlier (see footnote 1) on the same computer that runs SAS and install the R package that contains the function you want. The function that I want to call is the gcIntermediate function in the geosphere package, so I installed the package as described on the Flowing Data blog.
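Before writing the full program, it can help to confirm that SAS is able to reach R at all. The following is a minimal sketch, not part of the original program. It assumes that SAS was started with the RLANG system option enabled, which PROC IML needs in order to call R, and that the geosphere package is already installed.

```sas
/** sketch: check whether this SAS session permits calls to R **/
proc options option=RLANG;
run;

/** sketch: ask R to report its version and load the package **/
proc iml;
submit / R;
  R.version.string
  library(geosphere)
endsubmit;
quit;
```

If the log reports NORLANG, the SUBMIT statement with the R option is expected to fail, and SAS would need to be restarted with the option turned on.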

In general, a SAS/IML program that calls R contains four steps:

  1. Transfer data from a SAS data set or a SAS/IML matrix into a similar data structure in R.
  2. Call R by using the SUBMIT statement with the R option.
  3. Transfer results from R into a SAS data set or a SAS/IML matrix.
  4. Use the results in SAS.

I'll discuss each step in turn.

Step 1: Transfer Data from SAS to R

For ease of presentation, assume that there is a data set called DeltaFlights that contains airline routes for Delta airlines. (The general case of multiple airlines is handled similarly.) The data contains the following variables:

  • Origin_Long and Origin_Lat contain the longitude and latitude of the origin airport.
  • Dest_Long and Dest_Lat contain the longitude and latitude of the destination airport.

I can use the ExportDataSetToR subroutine to create an R data frame from a SAS data set, or I can use the ExportMatrixToR subroutine to transfer data from a SAS/IML matrix into an R matrix.
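For comparison, the first option might look something like the sketch below. The R name flights is a hypothetical choice, the MyLib libref is assumed to be assigned already, and the column names used on the R side are assumed to match the case of the variable names stored in the SAS data set.

```sas
/** sketch: send the whole data set to R as a data frame **/
proc iml;
run ExportDataSetToR("MyLib.DeltaFlights", "flights");

submit / R;
  # show the structure of the data frame that R received
  str(flights)
  # build the two coordinate matrices on the R side
  # (adjust the names if str() shows a different case)
  orig <- as.matrix(flights[, c("Origin_Long", "Origin_Lat")])
  dest <- as.matrix(flights[, c("Dest_Long", "Dest_Lat")])
endsubmit;
```

This route keeps all of the data handling in R, at the cost of leaving no copy of the coordinates in SAS/IML matrices.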

Because I like to work in the SAS/IML environment, I'll choose the second option. The following statements read the data into SAS/IML vectors or matrices:

/** requires SAS/IML 9.22 or SAS/IML Studio 3.2 **/ 
libname MyLib "C:\Users\...\MyData";
proc iml;
use MyLib.DeltaFlights; /** 376 routes **/
  read all var {origin_long origin_lat} into mOrig;
  read all var {dest_long dest_lat} into mDest;
close MyLib.DeltaFlights;

The matrices mOrig and mDest contain 376 rows and two columns. The following statements transfer data from the two matrices into R matrices of the same dimensions:

/** copy SAS/IML matrices to R **/
run ExportMatrixToR(mOrig, "orig");
run ExportMatrixToR(mDest, "dest");

The result is two R matrices named orig and dest. Each row of orig and each row of dest contains the longitude and latitude of an airport. (The statements also start R if it is not already running.)
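A quick way to confirm that the transfer worked is to ask R for the dimensions of the new objects. This is a small sketch that continues the same PROC IML session; it only uses base R functions.

```sas
/** sketch: confirm that R received both matrices **/
submit / R;
  dim(orig)       # rows and columns of the origin matrix
  dim(dest)       # rows and columns of the destination matrix
  head(orig, 3)   # first few longitude/latitude pairs
endsubmit;
```

Both calls to dim should agree with the dimensions of mOrig and mDest in SAS/IML.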

Step 2: Call R to Generate the Great Arcs

The call to R is straightforward:

/** get points on great arc between airports **/
submit / R;
library(geosphere)
dist <- gcIntermediate(orig, dest)
endsubmit;

The resulting R object, named dist, is a list of 376 matrices. Each matrix has 50 rows and two columns. The ith matrix represents the longitude and latitude of 50 points along a great arc that connects the ith row of orig and the ith row of dest.
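The structure of the result can be checked from the same session. The sketch below also shows the n argument of gcIntermediate, which sets the number of intermediate points per arc (50 is the default). The name arcs20 is hypothetical and is not used elsewhere in this article.

```sas
/** sketch: inspect the list of arcs that R created **/
submit / R;
  length(dist)          # number of arcs, one per route
  dim(dist[[1]])        # points and coordinates for the first arc
  head(dist[[1]], 3)    # first few points on the first arc
  # fewer points per arc, in case 50 is more than the plot needs
  arcs20 <- gcIntermediate(orig, dest, n=20)
endsubmit;
```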

Step 3: Transfer the Results

You can use the ImportMatrixFromR subroutine to copy the data from dist into a SAS/IML matrix named distance:

/** get arcs back from R **/
run ImportMatrixFromR(distance, "dist");

The distance matrix has 50 rows and 2 x 376 = 752 columns. The first two columns correspond to dist[[1]], the next two to dist[[2]], and so forth.
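Given that layout, the arc for a single route can be recovered by selecting a pair of adjacent columns. A minimal sketch, in which the route index i = 3 is an arbitrary choice for illustration:

```sas
/** sketch: extract the i_th arc from the wide matrix **/
i = 3;                                /** hypothetical route index **/
arc = distance[, (2*i-1):(2*i)];      /** columns: longitude, latitude **/
print (arc[1:5,])[colname={"Lon" "Lat"}];
```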

Step 4: Use the Results in SAS

These results are intended to be overlaid on a map. To visualize the flight paths in SAS/IML Studio, I can use the following IMLPlus statements, which are similar to the mapping examples in my 2008 SAS Global Forum Paper:

/** create map in SAS/IML Studio **/
a = shape(distance[1,], 0, 2); /** airports **/
declare ScatterPlot p;
p = ScatterPlot.Create("flights", a[,1], a[,2]);
p.DrawUseDataCoordinates();
p.DrawSetPenColor(GREY);
do i = 1 to ncol(distance)/2;
   p.DrawLine(distance[,2*i-1], distance[,2*i]);
end;
 
/** draw map in background (see SAS Global Forum
    Paper) then zoom in on US **/

However, this particular computation was for my colleague, Robert Allison, who uses SAS/GRAPH software to visualize the flight paths. Therefore, I wrote the arcs to a SAS data set and let him use PROC GMAP with the ANNOTATE= option to create the image seen earlier in this article.
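The step of writing the arcs to a data set is not shown above. One way it might be done is to stack the arcs into long form, with one row per point and an ID variable that identifies the route. This is a sketch only: the data set name Arcs and the variable names ID, Lon, and Lat are hypothetical, and the layout that PROC GMAP annotation needs may differ.

```sas
/** sketch: stack the arcs into long form and save them **/
nPts  = nrow(distance);               /** points per arc **/
nArcs = ncol(distance)/2;             /** number of routes **/
oddCols  = do(1, 2*nArcs-1, 2);       /** longitude columns **/
evenCols = do(2, 2*nArcs, 2);         /** latitude columns **/

/** transpose so that each row is one arc, then unroll row by row **/
Lon = colvec( T(distance[, oddCols]) );
Lat = colvec( T(distance[, evenCols]) );
ID  = colvec( repeat(T(1:nArcs), 1, nPts) );

create Arcs var {"ID" "Lon" "Lat"};
append;
close Arcs;
```

Because SAS/IML unrolls a matrix row by row, the transpose ensures that all points of the first arc come first, followed by all points of the second arc, and so on.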


1. As I've said elsewhere, R changed its directory structure between versions 2.11 and 2.12. Consequently, SAS 9.22 (which shipped before R 2.12 was released) looks for certain DLLs in directories that no longer exist. The workaround is to use R 2.11 with SAS 9.22. This workaround is not required for SAS 9.3.

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of PROC IML and SAS/IML Studio. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
