This article and the accompanying technical white paper are written to help SAS 9 users run their existing SAS 9 code multi-threaded in SAS Viya 3.3.
Read the full paper, Getting Your SAS 9 Code to Run Multi-Threaded in SAS Viya 3.3.
The Future is Multi-threaded Processing Using SAS® Viya®
When I first began researching how to run SAS 9 code as multi-threaded in SAS Viya, I decided to compile all the relevant information and detail the internal processing rules that guide how the SAS Cloud Analytic Services (CAS) server handles code and data. What I found was that there is a simple set of guidelines which, if followed, help ensure that most of your existing SAS 9 code will run multi-threaded. I have packed a lot of great information into a single white paper that is available today called “Getting Your SAS 9 Code to Run Multi-threaded in SAS Viya”.
Understanding the basic distinctions between single-threaded and parallel processing is key to understanding why and how some of the parallel processing changes have been implemented. You see, SAS Viya technology constructs code so that everything runs in pooled memory using multiple processors. Redefining SAS for this parallel processing paradigm has led to large reductions in program run times, as well as concomitant increases in accuracy for a variety of machine learning techniques. SAS Viya products help revolutionize how we think about undertaking large-scale work, because we can now complete many more tasks in a fraction of the time it took before.
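To make the single versus parallel distinction concrete, here is a minimal sketch in plain Python. It is not SAS Viya or CAS code, and the function names (partial_sum, single_threaded, multi_threaded) are hypothetical names chosen for this illustration. It only shows the general pattern of splitting data into chunks, processing each chunk on its own worker thread, and combining the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Sum of squares for one chunk of the data."""
    return sum(x * x for x in chunk)


def single_threaded(values):
    """Process the whole data set in one pass on one thread."""
    return partial_sum(values)


def multi_threaded(values, workers=4):
    """Split the data into chunks, process each chunk on a worker
    thread, then combine the partial results."""
    # Ceiling division so that every value lands in exactly one chunk.
    size = max(1, -(-len(values) // workers))
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    data = list(range(1_000_000))
    # Both approaches must agree; only the execution strategy differs.
    assert single_threaded(data) == multi_threaded(data)
```

Note that in standard CPython, threads running pure Python arithmetic like this generally do not run faster than the single-threaded version, so treat the sketch as an illustration of the split, process, and combine pattern rather than a benchmark.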
The new SAS Viya products bring a ton of value compared to other choices you might have in the analytics marketplace. Unfortunately, most open source libraries and packages, especially those developed for use in Python and R, are limited to single-threading. SAS Viya offers a way forward: you can code in these languages using an alternative set of SAS objects that run as parallel, multi-threaded, distributed processes. The real difference is in the shared memory architecture, which is not the same as the parallel, distributed processing claimed by most Hadoop and cloud vendors. Even though parallel, distributed processing is faster than single-threading, it hits a performance wall far below what pooled and persisted data provides when using multi-threaded techniques and code.
For these reasons, I believe that SAS Viya is the future of data/decision science, with shared memory running against hundreds if not thousands of processors and returning results almost instantaneously. And it’s not for just a handful of statistical techniques. I’m talking about running every task in the analytics lifecycle as a multi-threaded process, meaning end-to-end processing through data staging, model development and deployment, all potentially accomplished through a point-and-click interface if you choose. Give SAS Viya a try and you will see what I am talking about. Oh, and don’t forget to read my technical white paper, which provides a checklist of all the things that you may need to consider when transitioning your SAS 9 code to run multi-threaded in SAS Viya!
Questions or Comments?
If you have any questions about the details found in this paper, feel free to leave them in the comments field below.
1 Comment
I highly appreciate your blog and paper for their deeper insight into SAS 9 and SAS Viya programming review and best practices.
The programming differences, changes, replacements, and execution are well explained and helpful for understanding the migration from old SAS to the new world of SAS multi-threaded, in-memory program execution.
Regarding R & Python, Anaconda for cluster management provides resource management tools to deploy Anaconda across a cluster. It can manage multiple conda environments and packages (including Python and R) on bare-metal or cloud-based clusters. Supported platforms include Amazon virtual machines. It is not CAS, but open source now has something besides Hadoop.
I feel comfortable knowing that we can run our old SAS code and macros in the SAS Viya Program Run-time Environment (VPRTE).
Coming from a life science background, because of compliance & governance, we don't change our code (it is locked) in case we have to reproduce the same results.
Like I said, it is helpful to know there is a fallback. Is VPRTE similar to the workspace server (for backward compatibility), or is it going to sit between the present and the future?
The reason for the question above: you mentioned the need for a separate workspace server other than CAS & VPRTE on page 9 of your paper.
Thanks,
Pritesh