If you're a SAS programmer who now uses SAS Viya and CAS, it's worth your time to optimize your existing programs to take advantage of the new environment. This post is a continuation of my SAS Global Forum 2020 paper, "Best Practices for Converting SAS® Code to Leverage SAS® Cloud Analytic Services," and my SGF 2020 Super Demo.
The best approach for refactoring SAS code for SAS Viya has a few steps:
- First, "lift and shift" your existing code to run successfully in the compute server for SAS Viya.
- Next, create CASLIB statements for all of your data sources, e.g., sas7bdat files, CSV files, Parquet files, relational databases, cloud data sources, etc.
- Finally, identify the longest-running steps so you know where your biggest opportunities are. For example, look at steps where the "real time" is 30 minutes or longer, as well as steps that are CPU bound. CPU-bound steps are steps where the CPU time is equal to or greater than the real time for that step.
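The CASLIB step in the list above can be sketched as follows. This is a minimal example for a path-based data source, not a complete setup: the session name `mySess`, the caslib name `mydata`, the directory path, and the file and table names are all hypothetical placeholders, and relational or cloud sources would need a different `srctype` and connection options.

```sas
/* Start a CAS session (session name is a placeholder) */
cas mySess;

/* Path-based caslib pointing at a directory that holds sas7bdat, CSV,
   or Parquet files. Caslib name and path are placeholders. */
caslib mydata datasource=(srctype="path") path="/path/to/data";

/* List the files that the caslib can see */
proc casutil;
   list files incaslib="mydata";
quit;

/* Load one file into memory as a CAS table
   (file name and table name are hypothetical) */
proc casutil;
   load casdata="sales.sas7bdat" incaslib="mydata"
        casout="sales" outcaslib="mydata";
quit;

/* Optional: assign a libref so that the DATA step and procedures
   can reference the in-memory tables */
libname mycas cas caslib=mydata;
```

Once a table is loaded and a CAS libref is assigned, existing code can refer to it as `mycas.sales` in the same way it refers to any other two-level name.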
To help us identify those steps, we can leverage a new utility that analyzes SAS logs and creates reports showing the Real Time and CPU Time for each step. Read on to learn more about this final step in the code refactoring process.
The application reads SAS Batch Server, SAS Stored Process Server, and SAS Workspace Server logs, as well as SAS 9 logs that end users save to disk. It creates a descending Real Time (Clock Time) report and a step frequency report. Viewing the reports delivered with the application requires SAS Viya. In addition to the reports, the data sets that the reports are based on are in the .../assessment/datamart/summarizeSASLogSteps directory. If you do not have SAS Viya, you can process sumlogs_results.sas7bdat using SAS 9. Here is the SAS code to accomplish that:
```sas
/* Assign only the LIBNAME that matches your operating system */

/* Windows LIBNAME sample */
libname logs "C:\path\to\SAS 9 Content Assessment Applications\v2021.2.5\assessment\datamart\summarizelogs";

/* Linux or AIX LIBNAME sample */
libname logs "/path/to/SAS 9 Content Assessment Applications/v2021.2.5/assessment/datamart/summarizelogs";

/* Descending real time report: longest-running steps first */
proc sort data=logs.sumlogs_results out=work.logs;
   by descending real_time;
run;

proc print data=work.logs;
run;

/* Step frequency report */
proc freq data=work.logs;
   tables step;
run;
```
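To narrow the data down to the candidates described earlier (steps that run 30 minutes or longer, or that are CPU bound), a filter along these lines might help. It is a sketch under two assumptions that the application documentation should confirm: that `real_time` is numeric and stored in seconds (a SAS time value would also satisfy this), and that a numeric column named `cpu_time` exists in the same data set. The `cpu_time` name and the `work.candidates` output table are hypothetical.

```sas
/* Assumptions, not confirmed by the application documentation:
   - real_time is numeric and measured in seconds
   - a numeric cpu_time column exists (hypothetical name) */
data work.candidates;
   set logs.sumlogs_results;
   long_running = (real_time >= 30 * 60);    /* 30 minutes or longer */
   cpu_bound    = (cpu_time >= real_time);   /* CPU time equals or exceeds real time */
   if long_running or cpu_bound;             /* keep only candidate steps */
run;

/* Longest-running candidates first */
proc sort data=work.candidates;
   by descending real_time;
run;

proc print data=work.candidates;
run;
```

Keeping the two flags as separate columns makes it easy to see whether a step qualified because of its duration, its CPU usage, or both.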
The reports are derived by picking up on SAS log entries like this:
```
NOTE: PROCEDURE SGPLOT used (Total process time):
      real time           2.79 seconds
      cpu time            0.08 seconds

NOTE: The SAS System used:
      real time           1:08.86
      cpu time            1:18.18
```
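To make the idea concrete, here is a rough DATA step sketch of how entries like these could be pulled out of a single log. It is not the application's implementation, only an illustration. The log path and the output table `work.log_steps` are hypothetical, and the sketch assumes the default timing notes shown above (a `real time` line followed by a `cpu time` line), not the longer FULLSTIMER layout.

```sas
/* Hypothetical path to one saved SAS log */
filename saslog "/path/to/Sample3.log";

data work.log_steps (keep=step real_time cpu_time);
   infile saslog truncover;
   length line $256 step $64 value $32;
   retain step real_time;
   input line $char256.;
   line = left(line);

   /* A timing note looks like "NOTE: <step> used ..." */
   if line =: "NOTE:" and index(line, " used") > 7 then do;
      /* Text between "NOTE: " and " used", e.g. "PROCEDURE SGPLOT" */
      step = substr(line, 7, index(line, " used") - 7);
      real_time = .;
   end;
   else if line =: "real time" and not missing(step) then do;
      value = scan(line, 3, " ");
      link to_seconds;
      real_time = seconds;
   end;
   else if line =: "cpu time" and not missing(step) then do;
      value = scan(line, 3, " ");
      link to_seconds;
      cpu_time = seconds;
      output;            /* one row per timing note */
      step = " ";
   end;
   return;

to_seconds:
   /* Converts ss.ss, m:ss.ss, or h:mm:ss.ss to seconds */
   seconds = 0;
   do i = 1 to countw(value, ":");
      seconds = seconds * 60 + input(scan(value, i, ":"), best32.);
   end;
   return;
run;
```

The match on `NOTE:` plus ` used` is deliberately loose, so other notes that happen to contain the word "used" may need to be excluded when working with real logs.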
Descending Real Time Report
Figure 1 contains an example of the descending real time report. The Step column shows that the longest-running step is a PROC LOGISTIC that takes over 14 hours (Real Time column) and comes from the SAS log called Sample3.log (File Name column). The best way to use this report is to focus on steps that take longer than 30 minutes. In our case, that is 9 steps from 3 SAS logs. With that list in hand, we can review the details of each step and then benchmark whether it would run faster in SAS® Cloud Analytic Services (CAS). Note that for CAS to process data, all of the data must be in CAS tables and the step must be coded using CAS-enabled steps.
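A simple way to benchmark a step is to run it once against a compute server table and once against the same data loaded into CAS, then compare the real time and CPU time that the log reports for each run. The sketch below uses a DATA step for illustration. The session name, the table `work.big_table`, and the columns `amount` and `quantity` are hypothetical, and `casuser` is assumed to be available as a caslib for the session.

```sas
/* Report more detailed timing information in the log */
options fullstimer;

/* CAS session and libref (session name is a placeholder) */
cas mySess;
libname mycas cas caslib=casuser;

/* Load the input into CAS once. The load time is a separate cost
   from the time of the step being benchmarked. */
proc casutil;
   load data=work.big_table outcaslib="casuser" casout="big_table" replace;
quit;

/* Run 1: compute server (input and output are in WORK) */
data work.scored;
   set work.big_table;
   total = amount * quantity;   /* hypothetical columns */
run;

/* Run 2: CAS (input and output are both CAS tables) */
data mycas.scored;
   set mycas.big_table;
   total = amount * quantity;
run;
```

After both runs, compare the timing notes for the two DATA steps, and check the log of the second run to confirm that the step was actually processed in CAS and was not pulled back to the compute server.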
How to get the Application
The application is delivered starting with SAS 9 Content Assessment v2021.2.5. To download the application click here. The name of the application is: summarizeSASLogSteps.
Documentation on how to configure and run the application is in the ..../assessment/doc folder, in the file SASContentAssessment.pdf.
To understand which steps are good candidates for CAS, the in-memory engine, we must first understand the real time and CPU time of each step. Then we can benchmark which engine in SAS Viya is appropriate for that step: the compute server or CAS.