Bringing the power of SAS to your Python scripts can be a game changer. An easy way to do that is by using SASPy, a Python interface to SAS allowing Python developers to use SAS® procedures within Python. However, not all SAS procedures are included in the SASPy library. So, what do you do if you want to use those excluded procedures? Easy! The SASPy library contains functionality enabling you to add SAS procedures to the SASPy library. In this post, I'll explain the process.
The basics for adding procedures are covered in the Contributing new methods section in the SASPy documentation. To further assist you, this post expands upon the steps, providing step-by-step details for adding the STDIZE procedure to SASPy. For a hands-on application of the use case, refer to the blog post Machine Learning with SASPy: Exploring and Preparing your data - Part 3.
This is your chance to contribute to the project! While you can follow the steps below as a one-off solution, you can also share your work and incorporate it into the SASPy repository.
Prerequisites
Before you add a procedure to SASPy, you need to perform these prerequisite steps:
- Identify the SAS product associated with the procedure you want to add, e.g. SAS/STAT, SAS/ETS, SAS Enterprise Miner, etc.
- Locate the SASPy file (for example, sasstat.py, sasets.py, and so on) corresponding to that product.
- Ensure you have a current license for the SAS product in question.
Adding a SAS procedure to SASPy
SASPy uses Python decorators to generate the code for adding SAS procedures. Roughly, the process is:
- define the procedure
- generate the code to add
- add the code to the proper SASPy file
- (optional) create a pull request to add the procedure to the SASPy repository
Below we'll walk through each step in detail.
Create a set of valid statements
Start a new Python session with Jupyter and create a set of valid arguments (statements) for the chosen procedure. You determine the arguments for the procedure by searching for your procedure in the appropriate SAS documentation. For example, the PROC STDIZE arguments are documented in the SAS/STAT® 15.1 User's Guide, in the section The STDIZE Procedure.
For example, I submitted the following command to create a set of valid arguments for PROC STDIZE:
lset = {'STDIZE', 'BY', 'FREQ', 'LOCATION', 'SCALE', 'VAR', 'WEIGHT'}
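If you type the statement names by hand from the documentation, a small helper can keep the set consistent. The sketch below is illustrative only: make_statement_set is a hypothetical helper, not part of SASPy. It strips whitespace, upper-cases each name, and drops blanks and duplicates.

```python
def make_statement_set(names):
    """Return a set of upper-cased statement names, ignoring blank entries.

    Hypothetical helper for illustration; it is not part of SASPy.
    """
    return {name.strip().upper() for name in names if name.strip()}


# Statement names as they might be typed from the PROC STDIZE documentation,
# with inconsistent case, stray whitespace, and one duplicate.
raw_names = ['stdize', 'by', 'Freq', 'location', ' scale', 'var', 'weight', 'BY']
lset = make_statement_set(raw_names)

print(sorted(lset))
```

Because the result is a set, repeated names collapse automatically, and the order in which you list them does not matter.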
Call the doc_convert method
The doc_convert method takes two arguments: the set of valid statements (lset) and the procedure name ('STDIZE'). It returns a dictionary; the method_stmt entry holds the method signature and the markup_stmt entry holds the docstring markup.
import saspy
print(saspy.sasdecorator.procDecorator.doc_convert(lset, 'STDIZE')['method_stmt'])
print(saspy.sasdecorator.procDecorator.doc_convert(lset, 'STDIZE')['markup_stmt'])
The command generates the method call and the docstring markup like the following:
def STDIZE(self, data: ['SASdata', str] = None,
by: [str, list] = None,
freq: str = None,
location: str = None,
scale: str = None,
stdize: str = None,
var: str = None,
weight: str = None,
procopts: str = None,
stmtpassthrough: str = None,
**kwargs: dict) -> 'SASresults':

Python method to call the STDIZE procedure.

Documentation link:

:param data: SASdata object or string. This parameter is required.
:parm by: The by variable can be a string or list type.
:parm freq: The freq variable can only be a string type.
:parm location: The location variable can only be a string type.
:parm scale: The scale variable can only be a string type.
:parm stdize: The stdize variable can be a string type.
:parm var: The var variable can only be a string type.
:parm weight: The weight variable can be a string type.
:parm procopts: The procopts variable is a generic option available for advanced use. It can only be a string type.
:parm stmtpassthrough: The stmtpassthrough variable is a generic option available for advanced use. It can only be a string type.
:return: SAS Result Object
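To make the shape of that output less mysterious, here is a minimal sketch of how such a generator could work. It is a simplified stand-in written for this post, assuming only what the output above shows; sketch_doc_convert is a hypothetical name, and SASPy's actual doc_convert may be implemented differently.

```python
def sketch_doc_convert(statements, proc):
    """Build a method signature and docstring markup from statement names.

    Simplified stand-in for illustration; SASPy's doc_convert may differ.
    """
    names = sorted(s.lower() for s in statements)
    generic = ['procopts', 'stmtpassthrough']  # appended last for every procedure

    # One line per parameter, starting with the fixed 'data' parameter.
    params = ["def {}(self, data: ['SASdata', str] = None,".format(proc)]
    params += ['{}: str = None,'.format(n) for n in names + generic]
    params.append("**kwargs: dict) -> 'SASresults':")

    # Docstring markup; the ':parm' field spelling follows the output shown above.
    markup = ['Python method to call the {} procedure.'.format(proc.upper()),
              ':param data: SASdata object or string. This parameter is required.']
    markup += [':parm {0}: The {0} variable can only be a string type.'.format(n)
               for n in names + generic]
    markup.append(':return: SAS Result Object')

    return {'method_stmt': '\n'.join(params), 'markup_stmt': '\n'.join(markup)}


result = sketch_doc_convert({'STDIZE', 'BY', 'VAR'}, 'STDIZE')
print(result['method_stmt'])
print(result['markup_stmt'])
```

Sorting the names keeps the generated signature stable even though a set has no defined order.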
Update SASPy product file
We'll take the output and add it to the appropriate product file (sasstat.py in this case). When you open this file, be sure to open it with administrative privileges so you can save the changes. Prior to adding the code to the product file, perform the following tasks:
- add @procDecorator.proc_decorator({}) before the function definition
- add the proper documentation link from the SAS Programming Documentation site
- add triple quotes (""") around the second section of output (the markup) so it becomes the method's docstring
- include any additional details others might find helpful
The following output shows the final code to add to the sasstat.py file:
@procDecorator.proc_decorator({})
def STDIZE(self, data: ['SASdata', str] = None,
           by: [str, list] = None,
           freq: str = None,
           location: str = None,
           scale: str = None,
           stdize: str = None,
           var: str = None,
           weight: str = None,
           procopts: str = None,
           stmtpassthrough: str = None,
           **kwargs: dict) -> 'SASresults':
    """
    Python method to call the STDIZE procedure.

    Documentation link:
    https://go.documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=statug&docsetTarget=statug_stdize_toc.htm&locale=en

    :param data: SASdata object or string. This parameter is required.
    :parm by: The by variable can be a string or list type.
    :parm freq: The freq variable can only be a string type.
    :parm location: The location variable can only be a string type.
    :parm scale: The scale variable can only be a string type.
    :parm stdize: The stdize variable can be a string type.
    :parm var: The var variable can only be a string type.
    :parm weight: The weight variable can be a string type.
    :parm procopts: The procopts variable is a generic option available for advanced use. It can only be a string type.
    :parm stmtpassthrough: The stmtpassthrough variable is a generic option available for advanced use. It can only be a string type.
    :return: SAS Result Object
    """
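Notice that the method has no body beyond its docstring: the decorator supplies the behavior. The sketch below shows that general pattern in plain Python. It is a hypothetical stand-in, not SASPy's proc_decorator: sketch_proc_decorator and SketchStat are names invented for illustration, and the wrapper only collects arguments instead of generating and submitting SAS code.

```python
import functools


def sketch_proc_decorator(required):
    """Decorator factory that turns an empty method into a 'procedure call'.

    Hypothetical stand-in for illustration; it only collects the arguments.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, **kwargs):
            # Keep only the statements the caller actually supplied.
            supplied = {k: v for k, v in kwargs.items() if v is not None}
            missing = set(required) - set(supplied)
            if missing:
                raise ValueError(
                    'missing required statements: {}'.format(sorted(missing)))
            # A real implementation would build and submit SAS code here.
            return {'proc': func.__name__.lower(), 'statements': supplied}
        return wrapper
    return decorator


class SketchStat:
    @sketch_proc_decorator({})  # empty: no statements are required
    def STDIZE(self, data=None, by=None, var=None, procopts=None, **kwargs):
        """Body intentionally empty; the decorator does the work."""


call = SketchStat().STDIZE(data='sashelp.class', var='height weight')
print(call)
```

In this sketch, the argument passed to the decorator plays the role of a set of required statements, which is why an empty value is enough when nothing is mandatory.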
Update sasdecorator file with the new method
Edit the sasdecorator.py file, adding 'stdize' to the list on line 29, as shown below.
if proc in ['hplogistic', 'hpreg', 'stdize']:
Important: Updating the sasdecorator file is required only when you add a procedure with no plot options. The sasstat.py library assumes all procedures produce plots, but PROC STDIZE does not. This will more than likely change in a future release, so please follow the GitHub page for any updates.
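As a rough illustration of that membership check, the sketch below assumes the procedure name is compared in lower case against the list. The names NO_PLOT_PROCS and produces_plots are hypothetical; only the list contents come from the line shown above, and what SASPy does with the result is not covered here.

```python
# Procedures treated as having no plot options (list contents from the post).
NO_PLOT_PROCS = ['hplogistic', 'hpreg', 'stdize']


def produces_plots(proc_name):
    """Return False for procedures listed as having no plot options."""
    return proc_name.lower() not in NO_PLOT_PROCS


print(produces_plots('STDIZE'))  # False: 'stdize' is in the list
print(produces_plots('reg'))     # True: 'reg' is not in the list
```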
Document a test for your function
Make sure you write at least one test for the procedure. Then, add the test to the appropriate testing file.
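A real test would open a SAS session and call the new method against sample data. As a lighter starting point that runs without SAS, the sketch below checks that every statement in the set appears as a parameter of the method. StubStat is a hypothetical stand-in for the product class, so treat this as a pattern to adapt rather than a test from the SASPy test suite.

```python
import inspect
import unittest

# Statement names used for PROC STDIZE, lower-cased to match parameter names.
EXPECTED = {'stdize', 'by', 'freq', 'location', 'scale', 'var', 'weight'}


class StubStat:
    """Stand-in for the product class; a real test would use a SAS session."""

    def STDIZE(self, data=None, by=None, freq=None, location=None, scale=None,
               stdize=None, var=None, weight=None, procopts=None,
               stmtpassthrough=None, **kwargs):
        """Empty stub mirroring the generated signature."""


class TestStdizeSignature(unittest.TestCase):
    def test_every_statement_is_a_parameter(self):
        params = set(inspect.signature(StubStat.STDIZE).parameters)
        self.assertTrue(EXPECTED <= params)

    def test_generic_options_present(self):
        params = inspect.signature(StubStat.STDIZE).parameters
        self.assertIn('procopts', params)
        self.assertIn('stmtpassthrough', params)


# Run the two tests and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestStdizeSignature)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

A check like this catches a statement that was listed in the set but accidentally left out of the method signature.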
Finally
Congratulations! All done. You now have the knowledge to add even more procedures in the future.
After you add your procedure, I highly recommend you contribute your procedure to the SASPy GitHub library. To contribute, follow the outlined instructions on the Contributing Rules GitHub page.