Adaptive SAS programming for the Software Development Life Cycle

16

Chameleons are examples of adaptability to environments for they change their colors to better blend into their surroundingThe SAS applications development process typically includes the following three phases of the software development life cycle (SDLC): Development (Dev), Testing (Test) and Production (Prod).

In order to protect the Production (or Operation) environment from unforeseen disruptions, these stages are usually implemented in their own separate environments. Sometimes, Development and Testing environments are combined into a single Development environment, which just treats Testing as an integral part of the Development stage.

(For brevity of the further narrative, we will use the two-stage process/environment paradigm, which can be easily expanded as needed.)

During the software development life cycle, when our programming modules are designed and built (developed), tested, validated and have proved to reliably produce the desired results, we promote (transition) them from Dev to the Prod environment.

The software promotions are often managed by change control procedures adopted by your organization.

Software promotion limitations

However, no matter how strict the change control procedures are or how scrupulously we conduct testing and promotion from Dev to Prod, we must recognize the limitations of this paradigm. The alternative would be a fallacy of dreams in “wishful thinking” territory.

Let’s look into two highly desirable, but rarely attainable, “dreams”.

Dream 1

If a software is validated and reliably works in Dev, it is guaranteed that it will work just fine in Prod and therefore no testing in Prod is required.

Reality 1

Although a thorough testing in Dev will keep failures in Prod to a minimum, it cannot completely eliminate them or guarantee your code’s success when running in Prod. Even if a promoted module does not require even a single change and you just copy it from Dev to Prod, there are still many possibilities that it will not work properly in Prod. Consider potential environmental differences such as data sources (structures and access), file system permissions and SAS metadata groups and permissions.

You will need to account for these environmental differences and still conduct some testing in Prod to make sure that the environment itself does not break your code/software.

Dream 2

When software is developed, tested and validated in Dev, promoting it to Prod is just a matter of copying it from Dev to Prod without any changes/adjustments.

Reality 2

However, in actuality, again environmental differences may require adjusting your code for the new environment. For example, data sources and data targets may have different library names or library references, or different directories or files names. If your application automatically sends out an email to a distribution list of your application users, you most likely will have different distribution lists in Dev vs. Prod.

Of course, you may capture all these differences into a configuration file, which will keep all the differences between Dev and Prod environments out of your code. However, in this case, your application becomes data-driven (with the configuration file being a data driver) and, as such, the configuration file itself becomes part of your software. Obviously, these configuration files simply cannot be copied from Dev to Prod during the promotion since they are inherently different.

Therefore, you cannot say that for promotion from Dev to Prod you can just copy your software. At least that is not true for that configuration file.

This reality makes the promotion process quite sensitive and susceptible to errors. Besides, we must maintain two versions of the application (Dev and Prod) and make efforts not to mix them up and not overwrite each other.

Adaptive programming

Adaptability is the ability to adjust to different environments, much like a chameleon changes its color to better blend into its surroundings.

If we can develop a program that automatically adjusts itself to run in either the Development or Production environment, it would mean that we have found a solution making Dream 2 a reality.

You can copy such a program between the environments back and forth and you don’t need to change a thing in it. Because it is identical in both, Dev and Prod.

You can even store a single version of this adaptive program in a centralized shared location and access it to run by either the Development or Production server. This holds true even though these environments may use different data sources, different data targets, different email distribution lists…

The key for such an implementation is the program’s self-awareness when it knows which environment it is running in, and based on that knowledge adapts to the environment by adjusting its configuration variables.

But these are just words of make-believe promises. Let’s get to the proof.

Identifying the environment your SAS program runs in

Suppose you have SAS software installed in the following two environments:

  • Development, where the SAS application server runs on computer DEVSRV.YOURDOMAIN.COM
  • Production, where the SAS application server runs on computer PRODSRV.YOURDOMAIN.COM

If you have SAS® Enterprise BI Server or SAS® BI Server installed you may have your metadata servers either on the same device/computer/machine/server as the application servers or on separate machines. Let’s say these are separate machines: DEVMETASRV.YOURDOMAIN.COM and PRODMETASRV.YOURDOMAIN.COM.

When we run a program in a particular environment (server), we usually know in which environment we run this program. However, in order to build an adaptive program we need to make sure the program itself knows which environment it is running in.

The key here is the SYSHOSTNAME automatic macro variable that the SAS System conveniently makes available within any SAS session. It contains the host name of the computer that is running the SAS program. In our adaptive program, &SYSHOSTNAME macro variable reference will be equal to either DEVSRV or PRODSRV depending on which environment our program runs in.

Making a SAS program adapt to the Development or Production environment

Now, when our program knows its running environment, we can make it adjust to that environment.

Let’s consider a simplified scenario where our program needs to do the following different things depending on the running environment.

Adaptability use case

1. Read a source data table SALES from a metadata-bound library:

  • In DEV: library named ‘SQL CORP DEV’;
  • In PROD: library named ‘SQL CORP’.

2. Produce target Excel file, report.xlsx and save it a specified directory on a non-SAS machine:

  • In DEV: \\DEVOUTSRV\outpath
  • In PROD: \\PRODOUTSRV\outpath

3. Send out an email to a distribution list (contact group) of users when the report is ready:

Adaptive code implementation

The following SAS code illustrates this adaptive technique:

/* ----- Configuration section ----- */
 
%if "&syshostname"="DEVSRV" %then
%do; /* Development server */
   %let metasrv = DEVMETASRV;
   %let lname = SQL CORP DEV;
   %let outsrv = DEVOUTSRV;
   %let tolist = "developers@yourdomain.com";
   options symbolgen mprint mlogic fullstimer;
%end;
%else
%if "&syshostname"="PRODSRV" %then
%do; /* Production server */
   %let metasrv = PRODMETASRV;
   %let lname = SQL CORP;
   %let outsrv = PRODOUTSRV;
   %let tolist = "businessusers@yourdomain.com" "developers@yourdomain.com";
   options nosymbolgen nomprint nomlogic nofullstimer;
 
   /* adjust email distribution list based on test_trigger_YYYYMMDD.txt file existence */
   %let yyyymmdd = %sysfunc(date(),yymmddn8.);
   %if %sysfunc(fileexist(&projpath\test_trigger_&yyyymmdd..txt) %then
      %let tolist = "testers@yourdomain.com";
%end;
%else
%do; /* Unauthorized server */
   %put ERROR: This program is not designed to run on &syshostname server;
   %put ERROR: SAS session has been terminated.;
   endsas;
%end;
 
options 
   metaserver="&metasrv" metaport=8561 metarepository=Foundation metaprotocol=bridge
   metauser='service-accout-ID' metapass="encrypted-sas-password";
 
libname SRCLIB meta "&lname";
 
%let outpath = \\&outsrv\outpath;
%let outname = Report.xlsx;
 
/* ----- End of Configuration section ----- */
 
/* Produce Excel report */
ods excel file="&outpath\&outname";
proc print data=SRCLIB.SOMETABLE;
   where Product='Loan';
run;
ods excel close;
 
/* Send out email */
filename fm email to=(&tolist) from='sender@mydomain.com' subject='Your subject';
 
data _null_;
   file fm;
   put '*** THIS IS AUTOMATICALLY GENERATED EMAIL ***' //
       "Loan report job completed successfully on &syshostname server." /
       "The following Excel file has been generated: &outpath\&outname" //
       'Sincerely,' / 'Your automation team';
run;

Code highlights

As you can see, in the configuration section we conditionally set various macro variables (including &tolist distribution list) and global options depending on where this code runs, in Dev or Prod, determined by the &SYSHOSTNAME macro variable. For unauthorized &SYSHOSTNAME values, we write a relevant message to the SAS log and terminate the SAS session.

Then we establish connection to the metadata server, assign source data library SRCLIB as well as output directory (&outpath) and output file name (&outname).

Based on these conditionally defined macro variables and libref, in the remaining sections, we:

You can wrap this configuration section into a macro and store it as a separate file; then the main program would just invoke this macro. In either case, the code achieves total environmental awareness and naturally, logically adapts to the environment in which we designed it to function.

This adaptive coding approach ensures 100% equality between Dev and Prod code versions. In fact, the two versions of the code are identical. They are self-contained, self-reliant, and do not require any adjustments during promotion throughout the development life cycle. You can copy them from one environment to another (and vice versa) without fear of accidentally wiping out and replacing the right file with the wrong one.

Can we say that this implementation makes Dream 2 come true?

Testing in the Production environment

Notice a little section dubbed /* adjust email distribution list based on test_trigger_YYYYMMDD.txt file existence */ within the Production Server logic. It allows for triggering a test run, in this case by limiting the scope of email distribution in Prod. Just plant / create a file named test_trigger_YYYYMMDD.txt in a specified location &projpath and run your code on a desired test date YYYYMMDD.  You can delete this file afterwards if a full-scale run is scheduled for the same date or otherwise keep it for future reference (it becomes “harmless” and irrelevant for any other dates). You can use the same trigger file tactic to modify other parts of your code as needed in both Prod and Dev.

Even though this feature does not make Dream 1 come true, it does alleviate its Reality.

Questions? Thoughts? Comments?

Do you find this blog post useful? Do you have questions, concerns, suggestions, or comments? Please share with us below in the Comments section.

Additional Resources

Share

About Author

Leonid Batkhan

Leonid Batkhan is a long-time SAS consultant and blogger. Currently, he is a Lead Applications Developer at F.N.B. Corporation. He holds a Ph.D. in Computer Science and Automatic Control Systems and has been a SAS user for more than 25 years. From 1995 to 2021 he worked as a Data Management and Business Intelligence consultant at SAS Institute. During his career, Leonid has successfully implemented dozens of SAS applications and projects in various industries. All posts by Leonid Batkhan >>>

16 Comments

  1. Carlo Diovanni on

    I love the simplicity of BASE's code and how easy it is to demonstrate complex concepts. Some SAS solutions, seeking greater agility precisely on this "agile treadmill" have implemented very useful features in this context: SAS Intelligent Decisioning, for example, has, from version 5.6 onwards, global variables that can be updated by developers/administrators that are immediately reflected in the decision models published , dynamic reference tables that allow you to add/update "key value" to reference for example to versioned APIs for example or allow blue/green treadmill applications. Anyway, very good article and makes us think about how SAS provides such a wide range for all audiences.

  2. I think this is what a lot of larger companies already achieve in their set up macros. What's usually missing is some sort of training document to let programmers know why they are getting the in built errors. The error messages must be written so that the users understand what is happening. Can't tell you how much of my time I wasted taking down vague error messages only to find out they are macro driven.

    • Leonid Batkhan

      Thank you, Tania, for your input. Kudos to those companies that already achieve this in their set up macros. My goal was to present and popularize this technique for a broader audience.

      I wholeheartedly agree with you on the necessity of embedding into macros comprehensive and descriptive ERROR/ALERT/WARNING messages. I routinely use the following generalized technique, e.g.:

      %macro ABC(indata=);
         ...
         %if not %sysfunc(fileexist(&indata)) %then
         %do;
            %put ***MACRO &SYSMACRONAME: Source data &indata does not exist. Macro stops its execution.;
            %return;
         %end;
         ...
      %mend ABC;
      

      But this is an entirely different topic...

  3. Thank you, Leonid, as always appreciate your nice & succinct style of exposition!

    And a good reminder re the `FULLSTIMER' - a must-have option I often find missing from action....another option I find useful while adapting to an unknown/new SAS code base at an early development stage - NOFMTERR - helps to get an initial `bird's-eye' view of the code quicker.

    • Leonid Batkhan

      Good call, Véronique. However, whether you store them in a dataset, a macro or a text/configuration file, if you enter them incorrectly your program will still fail. In other words, data-driven design is not a panacea since the data-driver (your dataset with email addresses) itself becomes an integral part of your program.

      • Michael O'Neil on

        Leonid,

        regardless of whether the email addresses are embedded in code (bad practice) or stored in a configuration file (or sas dataset) it should be one of your tests to verify that this part of the code runs without error. If you have a send_email macro, then you can test it independently and use it in any program that needs to send emails.
        regards
        Mike O'Neil

        • Leonid Batkhan

          Mike, I don't understand your point. Of course, you need to test each part of your code, whether it is a macro or not (I did not wrap everything in macro for illustrative purpose only). With send_email macro one can make sure that it's perfect, but still can introduce an error while invoking it (e.g. passing in incorrect parameters or parameter values such as email addresses). My point was to encourage SAS developers to build their code as a single version that works in all desired SDLC environments. The choice of configuration format (macro, text file, data set) is a matter of preference. I disagree with your characterization of the email addresses embedded in configuration code as "bad practice" - try inserting a misspelled email address into non-code configuration file, still your code will choke on it.

  4. Excellent topic and this kind of topic is relevant and useful for beginners as this kind of gives awareness about how things work in an organization.

Leave A Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Back to Top