Good news learners! SAS University Edition has gone back to school and learned some new tricks.
With the December 2017 update, SAS University Edition now includes the SASPy package, available in its Jupyter Notebook interface. If you're keeping track, you know that SAS University Edition has long had support for Jupyter Notebook. With that, you can write and run SAS programs in a notebook-style environment. But until now, you could not use that Jupyter Notebook to run Python programs. With the latest update, you can -- and you can use the SASPy library to drive SAS features like a Python coder.
Oh, and there's another new trick that you'll find in this version: you can now use SAS (and Python) to access data from HTTPS websites -- that is, sites that use SSL encryption. Previous releases of SAS University Edition did not include the components that are needed to support these encrypted connections. That's going to make downloading web data much easier, not to mention using REST APIs. I'll show one HTTPS-enabled example in this post.
How to create a Python notebook in SAS University Edition
When you first access SAS University Edition in your web browser, you'll see a colorful "Welcome" window. From here, you can (A) start SAS Studio or (B) start Jupyter Notebook. For this article, I'll assume that you select choice (B). However, if you want to learn to use SAS and all of its capabilities, SAS Studio remains the best method for doing that in SAS University Edition.
When you start the notebook interface, you're brought into the Jupyter Home page. To get started with Python, select New->Python 3 from the menu on the right. You'll get a new empty Untitled notebook. I'm going to assume that you know how to work with the notebook interface and that you want to use those skills in a new way...with SAS. That is why you're reading this, right?
Move data from a pandas data frame to SAS
pandas is the standard for Python programmers who work with data. The pandas module is included in SAS University Edition -- you can use it to read and manipulate data frames (which you can think of like a table). Here's an example of retrieving a data file from GitHub and loading it into a data frame. (Read more about this particular file in this article. Note that GitHub uses HTTPS -- now possible to access in SAS University Edition!)
import saspy import pandas as pd df = pd.read_csv('https://raw.githubusercontent.com/zonination/perceptions/master/probly.csv') df.describe()
Here's the result. This is all straight Python stuff; we haven't started using any SAS yet.
Before we can use SAS features with this data, we need to move the data into a SAS data set. SASPy provides a dataframe2sasdata() method (shorter alias: df2sd) that can import your Python pandas data frame into a SAS library and data set. The method returns a SASdata object. This example copies the data into WORK.PROBLY in the SAS session:
sas = saspy.SASsession() probly = sas.df2sd(df,'PROBLY') probly.describe()
The SASdata object also includes a describe() method that yields a result that's similar to what you get from pandas:
Drive SAS procedures with Python
SASPy includes a collection of built-in objects and methods that provide APIs to the most commonly used SAS procedures. The APIs present a simple "Python-ic" style approach to the work you're trying to accomplish. For example, to create a SAS-based histogram for a variable in a data set, simply use the hist() method.
SASPy offers dozens of simple API methods that represent statistics, machine learning, time series, and more. You can find them documented on the GitHub project page. Note that since SAS University Edition does not include all SAS products, some of these API methods might not work for you. For example, the SASml.forest() method (representing the HPFOREST procedure) works only when you have SAS Enterprise Miner. (And no, that's not included in SAS University Edition.)
"Teach me SAS": SAS code from Python code
In SASPy, all methods generate SAS program code behind the scenes. If you like the results you see and want to learn the SAS code that was used, you can flip on the "teach me SAS" mode in SASPy.
Here's what SASPy reveals about the describe() and hist() methods we've already seen:
If you want to experiment with SAS statements that you've learned, you don't need to leave the current notebook and start over. There's also a built-in %%SAS "magic command" that you can use to try out a few of these SAS statements.
%%SAS proc means data=sashelp.cars stackodsoutput n nmiss median mean std min p25 p50 p75 max;run;
Python limitations in SAS University Edition
SAS University Edition includes over 300 Python modules to support your work in Jupyter Notebook. To see a complete list, run the help('modules') command from within a Python notebook. This list includes the common Python packages required to work with data, such as pandas and NumPy. However, it does not include any of the popular Python-based machine learning modules, nor any modules to support data visualization. Of course, SASPy has support for most of this within its APIs, so why would you need anything else...right?
Because SAS University Edition is packaged in a virtual machine that you cannot alter, you don't have the option of installing additional Python modules. You also don't have access to the Jupyter terminal, which would allow you to control the system from a shell-like interface. All of this is possible (and encouraged) when you have your own SAS installation with your own instance of SASPy. It's all waiting for you when you've outgrown the learning environment of SAS University Edition and you're ready to apply your SAS skills and tech to your official work!
- How to code in Python with SAS 9.4 - Chevell Parker (SAS Tech Support Rock Star) shares the basic "How to" for getting started with SASPy.
- Introducing SASPy: Using Python code to access SAS - Includes a video demo from Jared Dean (SASPy developer) that shows off some of the cool capabilities in SASPy.