There are many compelling reasons existing SAS users might want to start integrating SAS Viya into their SAS 9 programs and applications. For me, it comes down to ease of use, speed, and faster time-to-value. With the ability to traverse the (necessarily iterative) analytics lifecycle faster than before, we can now generate output more quickly, which better supports vital decision-making in a reduced timeframe. In addition to the positive impact this can have on productivity, it can also change the way we look at current business challenges and how we design possible solutions.
Earlier this year I wrote about how SAS Viya provides a robust analytics environment to handle all of your big data processing needs. Since then, I’ve been involved in testing the new SAS Viya 3.3 software that will be released near the end of 2017, and I found some additional advantages that I think warrant attention. In this article, I rank the main advantages of SAS Viya processing and the new capabilities coming to SAS Viya 3.3 products. While the list of new SAS Viya features is too long to cover item by item, I’ve put together the top reasons why you might want to start taking advantage of the SAS Viya capabilities of the SAS platform.
1. Multi-threaded everything, including the venerable DATA step
In SAS Viya, everything that can run multi-threaded does. This is the single most important aspect of the SAS Viya architecture for existing SAS customers. As part of this new holistic approach to data processing, SAS has enabled the highly flexible DATA step to run multi-threaded, requiring very little modification of code to begin taking advantage of this significant new capability (more on that in a soon-to-be-released blog post). Migrating to SAS Viya is especially important where long-running jobs contain very long DATA steps that act as processing bottlenecks because of older single-threaded configurations.
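To make the execution model concrete, here is a minimal Python sketch (not SAS code) of the general idea: the same row-level program runs on every thread, each against its own slice of the table. The function names, column names, and slicing scheme are assumptions made for illustration only; CAS decides for itself how rows are distributed across threads.

```python
from concurrent.futures import ThreadPoolExecutor


def row_logic(row):
    """Hypothetical row-level program: derive one new column per row."""
    return {**row, "total": row["price"] * row["qty"]}


def run_step(rows, n_threads=4):
    """Run the same row logic on every thread, one slice of rows per thread."""
    # Assumption for illustration: deal the rows out round-robin.
    chunks = [rows[i::n_threads] for i in range(n_threads)]

    def process(chunk):
        return [row_logic(r) for r in chunk]

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(process, chunks))

    # Concatenate per-thread output; row order now follows the slices,
    # not the original input order.
    return [row for chunk in results for row in chunk]


orders = [{"price": 2.0, "qty": q} for q in range(1, 6)]
for row in run_step(orders, n_threads=2):
    print(row)
```

Python threads share one interpreter lock, so this sketch shows the structure rather than a real speedup. It also shows why logic that depends on the order of rows needs a second look when a step runs on several threads: in this sketch the output rows come back grouped by slice.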
2. No sorting necessary!
While the heading is not 100% true, most sort routines can be removed from your existing SAS programs. Ask yourself the question: “What portion of my runtime is due strictly to sorting?” The answer is likely around 10-25%, maybe more. In general, the concept of sorting goes away with in-memory processing. SAS Viya does its own internal memory shuffling as a replacement. The SAS Viya CAS engine takes care of partitioning and organizing the data so you don’t have to. So, take those sorts out of your existing code!
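As a rough illustration of why partitioning can stand in for sorting, the Python sketch below (not SAS code) totals a value per group by hashing each row's group key to a partition. Rows with the same key always land in the same partition, so each partition can be summarized independently and no sort is ever run. The function names and the sample data are hypothetical, and the sketch makes no claim about how CAS organizes data internally.

```python
from collections import defaultdict


def partition_by_key(rows, key, n_partitions=4):
    """Route each row to a partition chosen by hashing its group key."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        partitions[hash(row[key]) % n_partitions].append(row)
    return partitions


def sum_by_group(rows, key, value):
    """Total `value` per group without sorting the rows at any point."""
    totals = {}
    for part in partition_by_key(rows, key):
        local = defaultdict(float)
        for row in part:
            local[row[key]] += row[value]
        # Safe to merge directly: a given key lives in exactly one partition.
        totals.update(local)
    return totals


sales = [
    {"region": "West", "amount": 10.0},
    {"region": "East", "amount": 5.0},
    {"region": "West", "amount": 2.5},
    {"region": "East", "amount": 1.0},
]
print(sum_by_group(sales, "region", "amount"))
```

Because each partition is independent, the per-partition loop is also the natural unit of work to hand to separate threads or workers.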
3. VARCHAR informat (plus other “variable-blocking” informats/formats)
Not available in SAS 9.4, the VARCHAR informat/format allows you to store character values without having to allocate room for blank spaces. Because the storage used for a column's values varies by row, you can save an enormous amount of blank space, which is especially important if you are using expensive (fast) disk storage. This represents a huge potential reduction in data storage size.
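The arithmetic behind the saving is easy to sketch. The Python snippet below (an illustration, not SAS code) compares the bytes needed when every value is padded to a fixed width with the bytes needed when each value stores only its own length. The sample names and the 20-byte width are made up, and the sketch ignores the small per-value bookkeeping overhead that real variable-width storage carries.

```python
def fixed_width_bytes(values, width):
    """Bytes used when every value is padded to `width` bytes."""
    return len(values) * width


def variable_width_bytes(values, encoding="utf-8"):
    """Bytes used when each value stores only its own encoded length."""
    return sum(len(v.encode(encoding)) for v in values)


names = ["Al", "Barbara", "Christopher"]
fixed = fixed_width_bytes(names, width=20)   # 3 values * 20 bytes = 60
variable = variable_width_bytes(names)       # 2 + 7 + 11 = 20
print(fixed, variable, fixed - variable)     # prints: 60 20 40
```

The wider the declared column and the shorter the typical value, the larger the gap between the two numbers.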
4. Reduced I/O in the form of data reads and writes from Hive/HDFS and Teradata to CAS memory
SAS Viya can leverage Hive/HDFS and Teradata platforms by loading (lifting) data up and writing data back down in parallel using CAS pooled memory. Data I/O, namely reading data from disk and converting it into the SAS binary format needed for processing, is the single most limiting factor of SAS 9.4. Once you speed up your data loading, especially for extremely large data sets, you will get results faster for all analyses and projects.
5. Persisted data can stay in memory to support multiple users or processing steps
Similar to SAS LASR, CAS can be structured to persist large data sets in memory indefinitely. This allows users to access the same data at the same time and eliminates redundant, repetitive I/O, potentially saving valuable compute cycles. Essentially, you load the data once, and then any number of people (or processing steps) can reuse it as many times as needed.
6. State-of-the-art Machine Learning (ML) techniques (including Gradient Boosting, Random Forest, Support Vector Machines, Factorization Machines, Deep Learning and NLP analytics)
All the most popular ML techniques are represented, giving you the flexibility to customize model tournaments to include the techniques most appropriate for your data and problem set. We also provide assessment capabilities that save you valuable time in getting the information you need to make valid model comparisons (like ROC charts, lift charts, etc.) and pick your champion models. We do not have extreme Gradient Boosting, Factorization Machines, or a specific Assessment procedure in SAS 9.4. Also, GPU processing is supported in SAS Viya 3.3 for Deep Neural Networks and Convolutional Neural Networks (this has not been available previously).
7. In-memory TRANSPOSE
The task of transposing data amounts to about 80% of any model building exercise, since predictive analytics requires a specialized data set called a ‘one-row-per-subject’ Analytic Base Table (ABT). SAS Viya allows you to transpose data, and so develop the critical ABT outputs, in a fraction of the time it used to take. This phenomenal time-saver of a procedure now runs entirely multi-threaded, in-memory.
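For readers who have not built an ABT before, the following Python sketch (not SAS code) shows what the transpose does logically: rows in "long" form, one per subject and measurement, are pivoted into one row per subject with one column per measurement. The function name, column names, and sample data are hypothetical.

```python
def transpose_to_abt(long_rows, id_col, name_col, value_col):
    """Pivot long rows into one dictionary (row) per subject."""
    abt = {}
    for row in long_rows:
        # Create the subject's row on first sight, then add one column.
        subject = abt.setdefault(row[id_col], {id_col: row[id_col]})
        subject[row[name_col]] = row[value_col]
    return list(abt.values())


transactions = [
    {"customer": 1, "month": "jan", "spend": 40},
    {"customer": 1, "month": "feb", "spend": 55},
    {"customer": 2, "month": "jan", "spend": 10},
]
abt = transpose_to_abt(transactions, "customer", "month", "spend")
for row in abt:
    print(row)
```

Customer 2 has no "feb" column in the result; a production ABT would typically fill such gaps with missing values so every row has the same columns.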
8. Native language bindings for Lua, Java, Python and R
The ability to code from external interfaces gives coders the flexibility they need in today’s fast-moving programming world. SAS Viya supports native language bindings for Lua, Java, Python and R. This means, for example, that you can launch SAS processes from a Jupyter Notebook while staying within a Python coding environment. SAS also provides a REST API for use in data science and IT departments.
9. Improved model build and deployment options
The core SAS Viya machine learning techniques support auto-tuning. SAS has the most effective hyper-parameter search and optimization routines, allowing data scientists to arrive at the correct algorithm settings with higher probability and speed, giving them better answers with less effort. And because ML scoring code output is significantly more complex, SAS Viya Data Mining and Machine Learning allows you to deploy compact binary score files (called Astore files) into databases to help facilitate scoring. These binary files do not require compilation and can be pushed to ESP-supported edge analytics. Additionally, training within event streams is being examined for a future release.
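To show what a hyper-parameter search automates, here is a minimal Python sketch of plain random search. This is a deliberately simple stand-in and not the search or optimization method SAS uses for auto-tuning, which this article does not describe. The objective function, the parameter names, and their ranges are all hypothetical; in practice the objective would train a model and return its validation error.

```python
import random


def validation_error(learning_rate, depth):
    """Hypothetical stand-in for 'train a model, measure validation error'."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def random_search(objective, n_trials=50, seed=0):
    """Try random settings and keep the one with the lowest error."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.5),
            "depth": rng.randint(2, 12),
        }
        score = objective(**params)
        if best is None or score < best[0]:
            best = (score, params)
    return best


best_score, best_params = random_search(validation_error)
print(best_score, best_params)
```

Each trial here is cheap, but with a real model every evaluation means a full training run, which is why an automated and efficient search saves so much effort.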
10. Tons of new SAS visual interface advantages
A. Less coding – SAS Viya acts as a code generator, producing batch code for repeatability and score code for easier deployment. Both batch code and score code can be produced in a variety of formats, including SAS, Java, and Python.
B. Improved data integration between SAS Viya visual analytics products – you can now edit your data in-memory and pass it effortlessly through to reporting, modeling, text, and forecasting applications (new tabs in a single application interface).
C. Ability to compare modeling pipelines – now data scientists can compare champion models from any number of pipelines (think of SAS 9 EM projects or data flows) they’ve created.
D. Best practices and white box templates – once only available as part of SAS 9 Rapid Predictive Modeler, Model Studio now gives you easy access to basic, intermediate and advanced model templates.
E. Reusable components – Users can save their best work (including pipelines and individual nodes) and share it with others. Collaborating is easier than ever.
11. Data flexibility
You can load big data even when not all of that data fits into memory. Previously, with the HPA or LASR engines, the memory environment had to be sized to fit all of the data. CAS technology removes that requirement, which is a really nice feature.
12. Overall consolidation and consistency
SAS Viya seeks to standardize on common algorithms across its analytic techniques so that you don’t get different answers when you attempt the same task using alternate procedures or methods. For instance, our implementation of Stochastic Gradient Descent is now the same in every technique that uses that method. Consistency also applies to the interfaces, as SAS Viya attempts to standardize the look-and-feel of its various interfaces to reduce your learning curve when using a new capability.
The net result of these Top 12 advantages is that you have access to state-of-the-art technology, jobs finish faster, and you ultimately get faster time-to-value. While this idea has been articulated in some of the above points, it is worth re-emphasizing because the SAS Viya benefits, when added together, result in higher throughput of work, greater flexibility in terms of options, and the ability to keep running when other systems would have failed. You simply work at a much higher level of efficiency and productivity with SAS Viya than before. So why not use it?