Thanks to a new open source project from SAS, Python coders can now bring the power of SAS into their Python scripts. The project is SASPy, and it's available on the SAS Software GitHub. It works with SAS 9.4 and higher, and requires Python 3.x.
I spoke with Jared Dean about the SASPy project. Jared is a Principal Data Scientist at SAS and one of the lead developers on SASPy and a related project called Pipefitter. Here's a video of our conversation, which includes an interactive demo. Jared is obviously pretty excited about the whole thing.
Use SAS like a Python coder
SASPy brings a "Pythonic" sensibility to working with SAS. That means that all of your access to SAS data and methods is surfaced through objects and syntax that are familiar to Python users. This includes the ability to exchange data via pandas, the ubiquitous Python data analysis framework. And even the native SAS objects are accessed in a very "pandas-like" way.
import saspy
import pandas as pd

sas = saspy.SASsession(cfgname='winlocal')
cars = sas.sasdata("CARS","SASHELP")
cars.describe()
The output is what you expect from pandas, but with the statistics that SAS users are accustomed to. PROC MEANS, anyone?
In: cars.describe()
Out:
      Variable  Label    N  NMiss   Median          Mean        StdDev  \
0         MSRP      .  428      0  27635.0  32774.855140  19431.716674
1      Invoice      .  428      0  25294.5  30014.700935  17642.117750
2   EngineSize      .  428      0      3.0      3.196729      1.108595
3    Cylinders      .  426      2      6.0      5.807512      1.558443
4   Horsepower      .  428      0    210.0    215.885514     71.836032
5     MPG_City      .  428      0     19.0     20.060748      5.238218
6  MPG_Highway      .  428      0     26.0     26.843458      5.741201
7       Weight      .  428      0   3474.5   3577.953271    758.983215
8    Wheelbase      .  428      0    107.0    108.154206      8.311813
9       Length      .  428      0    187.0    186.362150     14.357991

       Min       P25      P50      P75       Max
0  10280.0  20329.50  27635.0  39215.0  192465.0
1   9875.0  18851.00  25294.5  35732.5  173560.0
2      1.3      2.35      3.0      3.9       8.3
3      3.0      4.00      6.0      6.0      12.0
4     73.0    165.00    210.0    255.0     500.0
5     10.0     17.00     19.0     21.5      60.0
6     12.0     24.00     26.0     29.0      66.0
7   1850.0   3103.00   3474.5   3978.5    7190.0
8     89.0    103.00    107.0    112.0     144.0
9    143.0    178.00    187.0    194.0     238.0
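To make the pandas exchange a bit more concrete, here is a minimal sketch of a round trip: pull a SAS data set into a pandas DataFrame, reshape it with ordinary pandas code, and push the result back to SAS. It relies on to_df() and df2sd(), which are the SASPy calls I understand to handle the copy in each direction. The helper name round_trip, its parameters, and the output table name are my own inventions for illustration, so treat this as a starting point and check the SASPy API documentation before relying on it.

```python
def round_trip(sas, table, libref, transform, out_table):
    """Copy a SAS data set to pandas, transform it, and send it back.

    `sas` is assumed to be an open saspy.SASsession. `transform` is any
    function that takes a DataFrame and returns a DataFrame. All names
    here are hypothetical; only sasdata(), to_df() and df2sd() come
    from SASPy itself.
    """
    sas_data = sas.sasdata(table, libref)   # handle to the SAS data set
    df = sas_data.to_df()                   # copy the rows into pandas
    result = transform(df)                  # plain pandas from here on
    # Write the transformed frame back to SAS under the given table name.
    return sas.df2sd(result, table=out_table)


# Example call, assuming the `sas` session created earlier:
# efficient = round_trip(
#     sas, "CARS", "SASHELP",
#     lambda df: df[df["MPG_City"] >= 30],
#     "efficient_cars",
# )
# efficient.describe()
```

Passing the transformation in as a function keeps the SAS-specific calls in one place, so the pandas logic can be written and tested without a SAS connection.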
SASPy also provides high-level Python objects for the most popular and powerful SAS procedures. These are organized by SAS product, such as SAS/STAT, SAS/ETS and so on. To explore, call dir() on your SAS session object. In this example, I've created a sasstat object and used dot<TAB> completion to list the available SAS analyses.
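If you aren't working in an environment with tab completion, the same exploration can be done with a few lines around dir(). The sketch below is just one way to do it: public_methods is a hypothetical helper of mine, and StandInStat is a stand-in class so the snippet runs without a SAS connection. With a live session you would pass in the object returned by sas.sasstat() instead.

```python
def public_methods(obj):
    """Return the sorted names of an object's public, callable attributes."""
    return sorted(
        name for name in dir(obj)
        if not name.startswith("_") and callable(getattr(obj, name))
    )


class StandInStat:
    """Hypothetical stand-in for the object returned by sas.sasstat()."""
    def glm(self): ...
    def hpsplit(self): ...
    def reg(self): ...


print(public_methods(StandInStat()))  # ['glm', 'hpsplit', 'reg']

# With a real session (assumption: `sas` is an open saspy.SASsession):
# stat = sas.sasstat()
# print(public_methods(stat))
```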
SASPy provides Python access to all of the features that your SAS license allows. The SAS Pipefitter project extends the SASPy project by providing a high-level API for building analytical pipelines. With SAS Pipefitter, you can easily create repeatable workflows that feature advanced analytics and machine learning algorithms. In our video interview, Jared presents a cool example of a decision tree applied to the passenger survival factors on the Titanic. It's powered by PROC HPSPLIT behind the scenes, but Python users don't need to know all of that "inside baseball."
Installing SASPy and getting started
Like most things Python, installing the SASPy package is simple. You can use pip, the Python package installer, to fetch the latest version:
pip install saspy
The configuration steps will vary depending on your SAS environment. The connectivity options support an impressively diverse set of SAS configs: Windows, Unix, SAS Grid Computing, and even SAS on the mainframe! All of this is documented in the "Installation and Configuration" section of the project documentation.
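As a rough illustration of what that configuration looks like, here is a sketch of a personal config file for a local Windows install, matching the cfgname='winlocal' used in the earlier example. SASPy reads its connection definitions from a Python file; the file name sascfg_personal.py and the java and encoding keys reflect my understanding of the documented conventions, and the values are assumptions about a typical setup. Your environment may need different or additional keys, so defer to the project documentation.

```python
# sascfg_personal.py: a hypothetical minimal configuration sketch

# Names of the connection definitions available in this file.
SAS_config_names = ["winlocal"]

# Local SAS on Windows. Values below are assumptions for illustration.
winlocal = {
    "java": "java",              # assumes the java executable is on PATH
    "encoding": "windows-1252",  # assumes a Western European SAS session
}
```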
If you're new to Python but well-versed in SAS, I have a recommendation for you. Two SAS and Python enthusiasts, Isaiah Lankham and Matthew Slaughter, have created a tutorial that shows how to use SAS (via SASPy) in Python applications. Isaiah and Matthew explain some of the Python basics and relate them to SAS concepts, then they show how to put it all together.
Download, comment, contribute
SASPy is an open source project, and all of the Python code is available for your inspection and improvement. The developers at SAS welcome you to give it a try and enter issues when you see something that needs to be improved. And if you're a hotshot Python coder, feel free to fork the project and issue a pull request with your suggested changes!