We've just celebrated Earth Day, but I'm here to talk about Jupyter -- and the SAS open source project that opens the door for more learning. With this new project on the github.com/sassoftware page, SAS contributes new support for running SAS from within Jupyter Notebooks -- a popular browser-based environment used by professors and data scientists.
My colleague Amy Peters announced this during a SAS Tech Talk show at SAS Global Forum 2016. If you want to learn more about Jupyter and see the SAS support in action, then you can watch the video here.
Visit the project on GitHub: sas_kernel by sassoftware
Within Jupyter, the sas_kernel provides multiple ways to access SAS programming methods. The most natural method is to create a new SAS notebook, available from the New menu in the Jupyter Home window and from the File menu in an active notebook:
From a SAS notebook, you can enter and run SAS code directly from a cell:
There is even a Notebook extension (./nbextensions/showSASLog) that can show you the SAS log.
The second way that you can run SAS code is by using special Jupyter "magics" supported by the sas_kernel. These magic commands look almost just like SAS macro calls (imagine that!). From within a Python language notebook, you can inject your SAS program code and pull in SAS results. This allows you to move easily between Python and SAS in a single environment. Here's a simple example:
%%SAS
proc means data=sashelp.cars;
run;
ods graphics / height=500 width=800;
proc sgplot data=sashelp.cars;
histogram msrp;
run;
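If you would rather call SAS from ordinary Python code than from a cell magic, something similar can be sketched with SASPy, the package underneath sas_kernel. This is a minimal sketch, not how the kernel itself is implemented: it assumes SASPy is installed and already configured to find your SAS installation, and `run_in_sas` is a hypothetical helper name used only for illustration.

```python
import importlib.util

# The same SAS program used in the %%SAS example.
SAS_CODE = """\
proc means data=sashelp.cars;
run;
ods graphics / height=500 width=800;
proc sgplot data=sashelp.cars;
  histogram msrp;
run;
"""


def run_in_sas(code):
    """Submit SAS code through SASPy and return (log, listing).

    Hypothetical helper. Returns None when SASPy is not installed,
    so this sketch can be loaded on a machine without it.
    """
    if importlib.util.find_spec("saspy") is None:
        return None
    import saspy

    sas = saspy.SASsession()  # assumes a working SASPy configuration
    try:
        result = sas.submit(code)  # dict with 'LOG' and 'LST' entries
        return result["LOG"], result["LST"]
    finally:
        sas.endsas()  # always shut down the SAS process
```

In a notebook you could then unpack the pair, for example `log, listing = run_in_sas(SAS_CODE)`, and print the log or display the listing as you see fit.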
How to get started
Here's what you need to run SAS with Jupyter:
- SAS 9.4 or later running on Windows, Linux, or even z/OS (see support for SASPy, the underlying package)
- Python 3 installed on the same machine (most Linux distributions already include it)
- Access to the OS shell to install/configure the Jupyter Notebook infrastructure and the sas_kernel.
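Before installing anything, it can help to see which Python-side pieces are already present. The sketch below is only a convenience check under a few assumptions: it looks for the importable module names `saspy`, `sas_kernel`, and `notebook`, and it says nothing about whether SAS itself is installed or licensed. `check_prerequisites` is a made-up name for this example.

```python
import importlib.util
import sys


def check_prerequisites():
    """Return a dict describing which Python-side pieces are present."""
    return {
        "python3": sys.version_info.major >= 3,
        "saspy": importlib.util.find_spec("saspy") is not None,
        "sas_kernel": importlib.util.find_spec("sas_kernel") is not None,
        "notebook": importlib.util.find_spec("notebook") is not None,
    }


if __name__ == "__main__":
    # Print a short found/missing report, one line per component.
    for name, ok in check_prerequisites().items():
        print(f"{name:12s} {'found' if ok else 'missing'}")
```

Anything reported as missing still needs to be installed by following the instructions in the GitHub project.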
End users of Jupyter Notebook do not need special privileges. And you can access it from a browser on any system: Windows, Mac, Linux...whatever. In the SAS Tech Talk video with Amy, we were running on my Windows laptop using Chrome, connecting to a Linux instance of Jupyter and SAS. The GitHub project has all of the doc and step-by-step instructions for installation.
What's next for SAS and Jupyter?
This is just the start for SAS in the Jupyter world. Amy says that she has already received lots of interest and feedback. Stay tuned!
45 Comments
One of the best features to come out for SAS recently! Looking forward to being able to try this.
Great to hear!
Please let me know how your experience goes.
Best,
Jared
I wonder if there is any possibility (some sort of add-in or upgrade) of doing the same job with SAS 9.3.
I think that the current sas_kernel relies on a few features that are in SAS 9.4 only. I don't know if there are any plans to back-port to 9.3.
@Mahdi,
The reason for the SAS 9.4 requirement is that the sas_kernel returns results to Jupyter using the ODS HTML5 engine (http://support.sas.com/documentation/cdl/en/odsug/67921/HTML/default/viewer.htm#p0hcv8gpxqebnpn1is52we2enltx.htm)
Someone would have to investigate how to render results using an engine that shipped with SAS 9.3. You can enter an issue on GitHub (https://github.com/sassoftware/sas_kernel/issues), but I think many SAS users would encourage you to upgrade to SAS 9.4 so that you can take advantage of all the features and improvements over the last 5+ years.
Hi,
Great news, but does it support Windows (or when will it)?
Cheers!
I think we heard this request several times at SAS Global Forum. I know that Windows support is under consideration; I don't know the timetable.
@Snorre,
As Chris said we are evaluating this request but I don't have a timetable as of yet.
Jared
I want to second the motion to make this work on Windows. On a lark I googled "sas kernal jupyter notebook" and found this recent announcement. I am very excited. I use python + jupyter to memorialize my work ("ask the data questions and write down the answers") and have been wishing for the same thing with SAS.
Hi, it's almost 6 months since this was announced. Any news on support for Windows?
Hi Will, nothing to share yet. I did have a conversation with the developers last week, and they were looking into an approach...but it's premature to say when it will happen.
The Linux requirement is a bit odd, since the Jupyter kernel could control SAS on Windows via the .NET interface using pywin32 or a similar module. I'll have to look at the source to see how the kernel connects to SAS.
I think the team worked on Linux first, as that's a popular multi-user SAS platform where Jupyter can have a positive impact on teams. And also the SAS University Edition is delivered with a packaged Linux VM, so this will be a natural fit...eventually. I think Windows support is inevitable, just wasn't the focus right out of the gate. I suspect the mechanism will need to be a little different for Windows.
Wondering if it could be implemented in SDD as well...
Bjorn
Hi Bjorn, I'd guess the answer is Yes, on a technical level. However, SDD has some pretty strict auditing/authentication/process requirements, so I'm not sure if it's fit for the purpose. If you're working with a SAS-hosted SDD environment, it might be worth asking your SAS rep about this possibility.
How does this work if I have SAS grid and Jupyter is on remote server? In short, if I have server based environments as opposed to SAS and Jupyter software on my pc?
Hi Kevin, great question. I think that the way this works right now it doesn't play into the Grid environment. That is, you have Jupyter hosted on a single node where you have SAS available, and it launches/communicates with a single SAS process. I'll ping the developer (Jared) to see if he has anything to add about this.
Kevin,
Currently the SAS Kernel can't take advantage of SAS Grid. Other organizations have groups connecting to grid nodes directly through SSH which allows a centralized Jupyter server that is located apart from SAS. You can see setup details from the inline comments here: https://github.com/sassoftware/saspy/blob/master/saspy/sascfg.py
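To give a rough idea of what such an SSH setup looks like, here is a sketch of a SASPy configuration file. Treat it as an assumption-laden illustration: the key names follow the SSH example in sascfg.py as best I recall, and every path and host name below is a placeholder rather than a value from a real site. The inline comments in sascfg.py remain the authoritative guide.

```python
# sascfg_personal.py (sketch): all values are placeholders.

# Name of the configuration definition SASPy should offer.
SAS_config_names = ["ssh"]

ssh = {
    # Path to the SAS startup script on the remote SAS host (placeholder).
    "saspath": "/path/to/SASFoundation/9.4/bin/sas_u8",
    # Local ssh client used to reach that host.
    "ssh": "/usr/bin/ssh",
    # Remote node that runs SAS (placeholder host name).
    "host": "sas-node.example.com",
    # Extra SAS command-line options, if any.
    "options": ["-fullstimer"],
}
```

Passwordless SSH between the Jupyter server and the SAS node is typically needed for this kind of connection to start cleanly.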
If SAS Grid and Jupyter interaction is a strong business need, I would encourage you to enter an issue at https://github.com/sassoftware/sas_kernel/issues
Chris, I currently teach a Statistical Computing course using SAS, I migrated myself and my students over to using SAS EG as a programming interface, but I am wondering if this is the best way to go? Do you have a sense of what is being used as the programming interface in industry?
Should I be switching to a notebook approach, or is SAS studio a better way to go?
I have multiple courses I teach and I will be developing a first course for a new Data Science masters that uses SAS and I wonder if I should be using SAS-EG, SAS Studio or learning these notebooks? I have noticed that workflow is something students need to learn, instead of figuring it out over the years and that the programming interface has an impact on learning and I wanted to get a sense of which method to use as I go forward.
Thanks.
Laura
Laura, these are great questions. A comprehensive program might use all of these (and more) for various aspects of data science (including data acquisition and prep, analysis, and reporting/publishing). But rather than hear it from me, I'd encourage you to work with one of our curriculum consultants. We have a brand-new resource page with more information for educators.
You might also want to check out the new SAS Data Science Academy -- see how that's put together. (I tested part of it -- it is crazy comprehensive.)
Chris, any updates on support for Windows?
Yes! It's there now with the latest updates for SASPy. I've been able to run this with my SAS and Anaconda Python installed on Windows.
No SAS libraries or datasets to be explored (don't make mistakes), no syntax highlighting (don't make mistakes), the log will appear in a squished box with a scroll bar (don't make mistakes). Another tiny scroll box for results (don't worry, I'm sure your results are fine).
Just ONE of the incredible replacements for Enterprise Guide coming to a SAS shop near you!
Sarcasm duly noted ;)
Well, look, Jupyter Notebook is not for everyone, and for some of us who are accustomed to more sophisticated, interactive environments -- it does seem like a step backward into the console-only days. I think of this as the hipster coding environment. And if there is one thing that I know about hipsters, it's that they don't like it when tasks are too easy or straightforward. There's no school like the old school, after all.
However, Jupyter Notebook does have support for dozens of popular coding languages, and it treats them all in a similar (albeit simple) way. So it's one more option for Jupyter Notebook users who find themselves needing SAS analytics.
Chris,
Could we get a copy of that example notebook Amy Peters shows in the video (around 2:35 in)?
Thanks!
Ooh, a deep cut there. I found a copy of the notebook code and placed it in a public gist. It's not complete -- no output and missing a referenced image. But hope it works for your purpose.
Thanks for this! You do not seem to be able to use macros though. Is this correct? Cannot use this code:
%macro SomeExcitingMacro(OriginName);
PROC SGPLOT DATA=SASHELP.cars(where=(Origin = "&OriginName."));
VBOX Invoice
/ category = DriveTrain nooutliers;
yaxis min=0 max=100000;
RUN;
%mend SomeExcitingMacro;
proc sql;
create table LoopData as
select
distinct Origin
from SASHELP.cars
;
quit;
data _null_;
set LoopData;
call execute('%nrstr(%SomeExcitingMacro('||Origin||'))');
run;
Any SAS code should work, including macros. The % and %% chars have special meaning in Jupyter, so something might need to be escaped there. I'll have to test this particular program -- but in general, it should work.
I have found that inserting "data _null_; run;" ahead of a new macro definition will eliminate the above issue.
Hi, I have a problem in my Jupyter setup that I cannot solve.
It happens when I submit this code:
%%SAS
proc print data=sashelp.class;
run;
data work.a;
set sashelp.cars;
run;
The result shows:
SAS ϵͳ
Obs Name Sex Age Height Weight
1 °¢¶û¸¥À׵ ÄÐ 14 69.0 112.5
2 °®ÀöË¿ Å® 13 56.5 84.0
3 °Å°ÅÀ Å® 13 65.3 98.0
4 ¿Â¶ Å® 14 62.8 102.5
5 ºàÀû ÄÐ 14 63.5 102.5
6 ղķ˹ ÄÐ 12 57.3 83.0
7 ¼ò Å® 12 59.8 84.5
8 ÑÅÄÝÌØ Å® 15 62.5 112.5
This looks like you might be using a Chinese version of SAS, but running with an encoding that doesn't match with Jupyter. The best practice is to use UTF8 Encoding options when you launch SAS, which you can configure in your SAS startup command or in your sasv9.cfg.
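To see how this kind of mismatch produces garbage, here is a small Python sketch. It rests on assumptions: that SAS wrote the text in a Chinese double-byte encoding (GBK is used here as a stand-in) and that the viewer decoded the bytes as Latin-1. The helper names `garble` and `repair` are made up for the example. The values in the garbled Sex column look like what this produces for the Chinese words for "male" and "female".

```python
def garble(text, actual="gbk", assumed="latin-1"):
    """Encode text as SAS might (actual), then decode it as the viewer might (assumed)."""
    return text.encode(actual).decode(assumed)


def repair(garbled, actual="gbk", assumed="latin-1"):
    """Undo garble(): reverse the wrong decode, then decode with the right encoding."""
    return garbled.encode(assumed).decode(actual)


original = "男"  # Chinese for "male", a plausible localized value for Sex
shown = garble(original)

print(repr(shown))                # two accented Latin characters instead of one Chinese one
print(repair(shown) == original)  # prints True
```

Running SAS with UTF-8 encoding sidesteps the problem, because both sides then agree on how the bytes should be read.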
But the data is English.
Name Sex Age Height Weight
Alfred M 14 69 112.5
Alice F 13 56.5 84
Barbara F 13 65.3 98
Carol F 14 62.8 102.5
Henry M 14 63.5 102.5
James M 12 57.3 83
Jane F 12 59.8 84.5
Janet F 15 62.5 112.5
Jeffrey M 13 62.5 84
John M 12 59 99.5
Joyce F 11 51.3 50.5
Judy F 14 64.3 90
Louise F 12 56.3 77
Mary F 15 66.5 112
Philip M 16 72 150
Robert M 12 64.8 128
Ronald M 15 67 133
Thomas M 11 57.5 85
William M 15 66.5 112
I tried replacing the zh sasv9.cfg with my utf8 sasv9.cfg, but the problem still exists.
I only find the sasv9.cfg in "C:\Program Files\SASHome\SASFoundation\9.4\nls".
Is this the file location?
Some sample data that ships with SAS is localized, and can be different if using a different locale. Check the PROC CONTENTS for the SASHELP.CLASS to verify that the file is coming from an English directory. Use PROC OPTIONS to see where your config file is being read from. The sasv9.cfg is either in the same path as SAS.exe (and that will be a small file that simply references a locale-specific config), or in a larger SAS deployment this will be in your Config/Lev1 folder.
Thank you for your help. After I modified my cfg, the problem disappeared. I can see the English data.
But I have another problem, sorry.
%%SAS
ods html5 style = statistical;
ods graphics /width =500 height =400;
proc means data=sashelp.cars;run;
The title doesn't display correctly in Jupyter; it looks
like ��
æ ‡ç¾
I cannot send my screenshot to you..
Try posting this question in the SAS Programming board on SAS Support Communities. You can include your screenshot.
Thank you for your help. I think my SAS account has a problem. I will write to SASProfileHelp@sas.com for help and then ask this question on SAS Support Communities. Thank you.
Thanks so much for this post. I have gotten SAS to run in Jupyter notebooks. One thing that is confusing me - I ran a series of linear mixed models (proc mixed) followed by multiple testing correction (proc multtest) in both SAS (Windows) and in SAS in Jupyter notebooks and while the results are similar, the raw p-values are slightly different (e.g. 0.0103 vs. 0.0111) and the FDR q-values are quite different (0.26926 vs. 0.11848). I'm confused by why I would be getting different results when running the same code through Jupyter notebook compared with through SAS directly.
Whenever you see different results in different SAS environments, first thing to check is system options. You can run PROC OPTIONS in both environments and compare -- but that's a big list to look at. Some frequent culprits are options related to missing values, variable names (validvarname = ANY vs V7).
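Since the full list is long, a small script can do the comparing for you. The sketch below assumes you have saved the option listing from each environment as text and reduced it to one `NAME=VALUE` (or bare `NAME`) entry per line. Real PROC OPTIONS output has headers and descriptions, so some cleanup would be needed first. The function names and the sample values are invented for illustration.

```python
def parse_options(listing):
    """Parse lines like 'NAME=VALUE' or bare 'NAME' into a dict (simplified format)."""
    options = {}
    for line in listing.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name, _, value = parts[0].partition("=")
        options[name.upper()] = value
    return options


def diff_options(left, right):
    """Return {name: (left_value, right_value)} for options that differ."""
    a, b = parse_options(left), parse_options(right)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Made-up sample listings, not real output.
windows_sas = "VALIDVARNAME=V7\nMISSING=.\nNOCENTER"
jupyter_sas = "VALIDVARNAME=ANY\nMISSING=.\nNOCENTER"

print(diff_options(windows_sas, jupyter_sas))
# {'VALIDVARNAME': ('V7', 'ANY')}
```

Options that appear in only one listing show up with `None` on the other side, which is often a clue in itself.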
Will the SAS Log NBExtension work when using the %%SAS approach in the Python notebook? It does not appear to be the case, but I just wanted to make sure that I wasn't missing something (it works fine for me when programming within a SAS notebook).