For years I've been making presentations about SAS/IML software at conferences. Since 2008, I've always mentioned to SAS customers that they can call R from within SAS/IML software. (This feature was introduced in SAS/IML Studio 3.2 and was added to the IML procedure in SAS/IML 9.22.) I also included a chapter on calling R in my book, Statistical Programming with SAS/IML Software.
However, I've never blogged about it until today. Why? Frankly, I don't usually have a reason to call R from SAS/IML software. Both R and SAS/IML are high-level languages with a rich run-time library of functions. They both enable you to extend the language by writing and sharing user-defined functions. They both enable you to use matrix computations to compactly represent and solve statistical problems and to analyze data. I use SAS/IML software for all of my day-to-day computational needs.
Sometimes, though, I hear about a technique that someone has implemented in R, and I want to try it too. If I expect to use the technique many times, I'll consider implementing it myself in the SAS/IML language. But if I'm going to use it only once (or I'm not sure how often I'll use it), I'll save myself time and call R.
Earlier this week, I showed some plots of airline routes that a colleague created in SAS, which are based on similar plots (created in R) that appeared on the Flowing Data blog. In my blog post I said:
"I don't have the time right now to implement a great circle algorithm in SAS/IML. ... I could also generate the great arcs by using the same R package that Flowing Data uses."
Basically, I was being lazy. However, after Flowing Data posted the R code to plot great arcs, I no longer had an excuse not to use great arcs. This is a situation in which calling R will save me time: I want to compute arcs of great circles, I don't know of a comparable function already written in SAS, and I'll rarely use this functionality in the future. So, I wrote a short SAS/IML program that calls an R function to compute the arcs.
Read on if you want to learn how to call R from SAS. If you just want to see the final result, here it is:
How to Call R from a SAS/IML Program
To call R, install R 2.11 or earlier^{1} on the same computer that runs SAS and install the R package that contains the function you want. The function that I want to call is the gcIntermediate function in the geosphere package, so I installed the package as described on the Flowing Data blog.
In general, a SAS/IML program that calls R contains four steps:
- Transfer data from a SAS data set or a SAS/IML matrix into a similar data structure in R.
- Call R by using the SUBMIT statement with the R option.
- Transfer results from R into a SAS data set or a SAS/IML matrix.
- Use the results in SAS.
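As a rough end-to-end illustration of these four steps, here is a minimal sketch that sends a small matrix to R, transposes it there, and brings the result back. The names x, m, tm, and y are made up for this example. The sketch assumes that R is installed and that SAS was started with the RLANG system option, which is required before SAS will call R.

```sas
proc iml;
x = {1 2, 3 4};                     /* a small SAS/IML matrix (illustrative data) */

/* Step 1: copy the SAS/IML matrix x into an R matrix named m */
run ExportMatrixToR(x, "m");

/* Step 2: run R statements; everything up to ENDSUBMIT is sent to R */
submit / R;
   tm <- t(m)                       # transpose the matrix in R
endsubmit;

/* Step 3: copy the R matrix tm into the SAS/IML matrix y */
run ImportMatrixFromR(y, "tm");

/* Step 4: use the result in SAS */
print y;
quit;
```

If everything is configured correctly, y should be the transpose of x.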
I'll discuss each step in turn.
Step 1: Transfer Data from SAS to R
For ease of presentation, assume that there is a data set called DeltaFlights that contains airline routes for Delta Air Lines. (The general case of multiple airlines is handled similarly.) The data set contains the following variables:
- Origin_Long and Origin_Lat contain the longitude and latitude of the origin airport.
- Dest_Long and Dest_Lat contain the longitude and latitude of the destination airport.
I can use the ExportDataSetToR subroutine to create an R data frame from a SAS data set, or I can use the ExportMatrixToR subroutine to transfer data from a SAS/IML matrix into an R matrix.
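For comparison, the first option might look something like the following sketch. It assumes that the MyLib libref is already assigned and that PROC IML is running; the R data frame name, flights, is a name I chose for illustration.

```sas
/* copy the SAS data set into an R data frame named "flights" (hypothetical name) */
run ExportDataSetToR("MyLib.DeltaFlights", "flights");

submit / R;
   str(flights)                     # display the structure of the data frame in R
endsubmit;
```

With this approach, the SAS variables become columns of the R data frame, so the R code would refer to them by name rather than by column position.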
Because I like to work in the SAS/IML environment, I'll choose the second option. The following statements read the data into SAS/IML vectors or matrices:
    /** requires SAS/IML 9.22 or SAS/IML Studio 3.2 **/
    libname MyLib "C:\Users\...\MyData";
    proc iml;
    use MyLib.DeltaFlights;   /** 376 routes **/
    read all var {origin_long origin_lat} into mOrig;
    read all var {dest_long dest_lat} into mDest;
    close MyLib.DeltaFlights;
The matrices mOrig and mDest contain 376 rows and two columns. The following statements transfer data from the two matrices into R matrices of the same dimensions:
    /** copy SAS/IML matrices to R **/
    run ExportMatrixToR(mOrig, "orig");
    run ExportMatrixToR(mDest, "dest");
The result is two R matrices named orig and dest. Each row of orig and each row of dest contains the longitude and latitude of an airport. (The statements also start R if it is not already running.)
Step 2: Call R to Generate the Great Arcs
The call to R is straightforward:
    /** get points on great arc between airports **/
    submit / R;
       library(geosphere)
       dist <- gcIntermediate(orig, dest)
    endsubmit;
The resulting R object, named dist, is a list of 376 matrices. Each matrix has 50 rows and two columns. The ith matrix represents the longitude and latitude of 50 points along a great arc that connects the ith row of orig and the ith row of dest.
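If you want to confirm that structure for yourself, a few R statements along the following lines should do it. This is only a sketch: it assumes that the dist object from the previous call is still in the R session, and the object name dist20 is made up. The last statement spells out the n= argument of gcIntermediate, which I believe defaults to 50 (hence the 50 rows); check the geosphere documentation for your version.

```sas
submit / R;
   print(length(dist))              # number of arcs (one per route)
   print(dim(dist[[1]]))            # rows and columns of the first arc
   print(head(dist[[1]], 3))        # first few (longitude, latitude) points

   # assumption: n= sets the number of intermediate points on each arc
   dist20 <- gcIntermediate(orig, dest, n=20)
endsubmit;
```

The R output is displayed in the SAS output window, so there is no need to transfer anything back to SAS just to look at it.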
Step 3: Transfer the Results
You can use the ImportMatrixFromR subroutine to copy the data from dist into a SAS/IML matrix named distance:
    /** get arcs back from R **/
    run ImportMatrixFromR(distance, "dist");
The distance matrix has 50 rows and 2 x 376 columns. The first two columns correspond to dist[[1]], the next two to dist[[2]], and so forth.
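Given that column layout, pulling out a single arc is a matter of subscripting. The following sketch extracts the points for the ith arc; the names i and arc, and the column labels, are chosen only for illustration.

```sas
i = 1;                                  /* which arc to extract */
arc = distance[ , (2*i-1):(2*i)];       /* columns 2i-1 and 2i: longitude and latitude */
print (arc[1:3, ])[colname={"Long" "Lat"}];   /* first three points on the arc */
```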
Step 4: Use the Results in SAS
These results are intended to be overlaid on a map. To visualize the flight paths in SAS/IML Studio, I can use the following IMLPlus statements, which are similar to the mapping examples in my 2008 SAS Global Forum Paper:
    /** create map in SAS/IML Studio **/
    a = shape(distance[1,], 0, 2);   /** airports **/
    declare ScatterPlot p;
    p = ScatterPlot.Create("flights", a[,1], a[,2]);
    p.DrawUseDataCoordinates();
    p.DrawSetPenColor(GREY);
    do i = 1 to ncol(distance)/2;
       p.DrawLine(distance[,2*i-1], distance[,2*i]);
    end;
    /** draw map in background (see SAS Global Forum Paper) then zoom in on US **/
However, this particular computation was for my colleague, Robert Allison, who uses SAS/GRAPH software to visualize the flight paths. Therefore, I wrote the arcs to a SAS data set and let him use PROC GMAP with the ANNOTATE= option to create the image seen earlier in this article.
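Writing the arcs to a data set can be done in several ways. The following is one possible sketch, not necessarily the exact code I used: it stacks the arcs into "long" form, with one observation per point and an ID variable that identifies the arc. The data set name (Arcs) and the variable names (ID, Long, Lat) are hypothetical.

```sas
nArcs = ncol(distance)/2;               /* number of arcs */
nPts  = nrow(distance);                 /* points per arc */

/* COLVEC stacks a matrix in row-major order, so transpose first to keep
   the points of each arc together: arc 1, then arc 2, and so on */
ID   = colvec(repeat(T(1:nArcs), 1, nPts));          /* 1,...,1,2,...,2,... */
Long = colvec(T(distance[ , do(1, 2*nArcs-1, 2)]));  /* odd columns  */
Lat  = colvec(T(distance[ , do(2, 2*nArcs, 2)]));    /* even columns */

create Arcs var {ID Long Lat};          /* WORK.Arcs, one row per point */
append;
close Arcs;
```

This data set holds only the raw coordinates. An annotate data set for PROC GMAP needs additional variables, which are best added on the SAS/GRAPH side.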
1. As I've said elsewhere, R changed its directory structure between versions 2.11 and 2.12. Consequently, SAS 9.22 (which shipped before R 2.12 was released) looks for certain DLLs in directories that no longer exist. The workaround is to use R 2.11 with SAS 9.22. This workaround is not required for SAS 9.3.