If Necessity is the mother of Invention, then, perhaps, the father of Automation is Laziness. Automation is all about convenience, comfort, and productivity. Why do it yourself if you can devise something to do it for you!
See also: How to conditionally terminate a SAS batch flow process in UNIX/Linux
In my previous post, Running SAS programs in batch under Unix/Linux, we learned that in order to automate SAS job submissions, SAS programs are often run in batch mode. We also learned that batch scripts are a convenient way to run SAS programs in batch. To create a unique SAS log file with each batch submission, a typical batch script may look as follows:
#!/usr/bin/sh
dtstamp=$(date +%Y.%m.%d_%H.%M.%S)
pgmname="/sas/code/project1/program1.sas"
logname="/sas/code/project1/program1_$dtstamp.log"
/sas/SASHome/SASFoundation/9.4/sas $pgmname -log $logname
It will allow you to submit your SAS program /sas/code/project1/program1.sas in batch, and also capture SAS log file with a convenient date-time suffix in the same directory.
SAS program to write batch scripts
But what if we are deploying multiple SAS programs? Well, then we would need to create a batch script for each of them. They will all look similar to each other, and that is when most human errors occur: when we do something similar, monotonously, over and over again. Besides, I find that working with the Unix visual editor (“vi editor”) is not quite a 21st-century experience.
What would a normal SAS programmer do in such a situation? That’s right – we would write a SAS program to write a batch script file! Let’s do it. Let’s automate the automation.
In its simplest form, to replicate the above batch script example our SAS program would look like this:
filename b '/sas/code/project1/program1.sh';
data _null_;
  file b;
  input;
  put _infile_;
  datalines;
#!/usr/bin/sh
dtstamp=$(date +%Y.%m.%d_%H.%M.%S)
pgmname="/sas/code/project1/program1.sas"
logname="/sas/code/project1/program1_$dtstamp.log"
/sas/SASHome/SASFoundation/9.4/sas $pgmname -log $logname
;
Setting up batch file permissions
As we already know from my previous post, we need to assign certain permissions to our batch file in order to make it executable. For example, if you want to give yourself (Owner) and your Group execute permission, then your script file permissions can be set to:
-rwxr-x---, or 750 in octal representation.
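To see what mode 750 means in practice, here is a minimal shell sketch. It assumes a POSIX shell with mktemp available and works on a throwaway temporary file rather than a real batch script; the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: apply mode 750 to a scratch file and read the result back.
tmpfile=$(mktemp)                       # throwaway file, not a real script

chmod 750 "$tmpfile"                    # octal form
perms_octal=$(ls -l "$tmpfile" | cut -c1-10)

chmod u=rwx,g=rx,o= "$tmpfile"          # equivalent symbolic form
perms_symbolic=$(ls -l "$tmpfile" | cut -c1-10)

echo "$perms_octal"                     # prints -rwxr-x---
echo "$perms_symbolic"                  # prints -rwxr-x---
rm -f "$tmpfile"
```

Both forms produce the same permission string: read, write, and execute for the owner, read and execute for the group, and nothing for everyone else.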
To set these permissions from SAS, you can add the following X statement to your SAS code:
options noxwait;
x 'chmod 750 /sas/code/project1/program1.sh';
Alternatively, you can use %SYSEXEC macro statement (no quoting for the OS command) or SYSTASK statement, or CALL SYSTEM routine (used within a data step).
When you create a batch script by running the above code in SAS Enterprise Guide (EG), you don’t have to leave the comfort of your SAS environment or even touch Unix vi editor. Moreover, you can even submit your SAS job in batch mode right from your SAS EG Program Editor.
However, all of this works only when the XCMD system option is enabled. It fails when XCMD is disabled (NOXCMD).
Assigning batch file permissions when XCMD System Option is disabled
ERROR: Shell escape is not valid in this SAS session.
Bummer! Have you ever seen this error message in SAS Enterprise Guide while trying to run SAS code with the X statement? It indicates that executing OS commands in the SAS environment is not allowed.
In many organizations, IT department policies do not allow enabling the SAS XCMD system option due to cyber-security concerns. This is usually done system-wide for the whole SAS Enterprise client-server installation via SAS configuration. In this case, no operating system command is allowed to be executed from within SAS.
Of course, this substantially limits SAS’ automation power, but that is the goal and the price to pay for enhanced security.
Still, even without OS command execution at our disposal, we can set Unix script file permissions using FILENAME statement’s PERMISSION= option. Then our above filename statement will look like this:
filename b '/sas/code/project1/program1.sh' permission='A::u::rwx,A::g::r-x,A::o::---';
Permission string 'A::u::rwx,A::g::r-x,A::o::---' here signifies the following.
A stands for Access permissions,
u - user who owns the file (owner),
g - group to which the user belongs,
o - other (not the owner or the owner's group).
To grant access permissions, use the values r (Read), w (Write), and x (Execute), in that order. To deny one of these permissions, enter a hyphen (-) in its place (for example, r-x means that Write permission is denied).
A::u::rwx means user gets Read, Write, and Execute permissions GRANTED,
A::g::r-x means group gets Read and Execute permissions GRANTED, Write permission DENIED,
A::o::--- means other gets none of the access permissions granted (all of them are DENIED).
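To connect this notation with the octal form used by chmod, here is a small shell sketch that converts a permission string of the shape shown above into an octal mode. The function name perm_to_octal is hypothetical (it is not part of SAS or of any standard tool), and the sketch assumes the string always contains entries of the form A::who::rwx separated by commas.

```shell
#!/bin/sh
# Hypothetical helper: translate 'A::u::rwx,A::g::r-x,A::o::---' into octal.
perm_to_octal() {
    u=0 g=0 o=0
    old_ifs=$IFS
    IFS=,                          # split the argument on commas
    for entry in $1; do
        who=${entry#A::}           # "A::u::rwx" -> "u::rwx"
        bits=${who#*::}            # "u::rwx"    -> "rwx"
        who=${who%%::*}            # "u::rwx"    -> "u"
        digit=0
        case $bits in r??) digit=$((digit + 4)) ;; esac   # Read
        case $bits in ?w?) digit=$((digit + 2)) ;; esac   # Write
        case $bits in ??x) digit=$((digit + 1)) ;; esac   # Execute
        case $who in
            u) u=$digit ;;
            g) g=$digit ;;
            o) o=$digit ;;
        esac
    done
    IFS=$old_ifs
    echo "$u$g$o"
}

mode=$(perm_to_octal 'A::u::rwx,A::g::r-x,A::o::---')
echo "$mode"                       # prints 750
```

Each r, w, and x contributes 4, 2, and 1 to its digit, which is why the string in the FILENAME statement corresponds to 750.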
However, it is important to realize that your ability to fully control file permissions via FILENAME statement’s PERMISSION= option is still restricted by the Unix umask value set by your IT system administrator. But usually, it is not overly restrictive, at least for the purpose of creating executable files in the environments I have worked with.
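The effect of umask can be sketched in shell as follows. The umask value 027 is an assumption chosen for illustration; your administrator may have set a different one, and the sketch shows the general Unix mechanism rather than anything specific to SAS.

```shell
#!/bin/sh
# Sketch: a umask removes permission bits from whatever a program requests.
# The value 027 below is assumed for illustration only.

# Arithmetic view: requested mode 777, umask 027.
effective=$(printf '%o' $(( 0777 & ~0027 )))
echo "$effective"                  # prints 750

# The same effect on a real file: shell redirection requests mode 666.
tmpdir=$(mktemp -d)
( umask 027; : > "$tmpdir/demo.txt" )
demo_perms=$(ls -l "$tmpdir/demo.txt" | cut -c1-10)
echo "$demo_perms"                 # prints -rw-r-----
rm -rf "$tmpdir"
```

With a umask of 027, the group can never receive Write permission and others receive nothing, which still leaves room for the 750 mode that an executable script needs.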
The double benefit of the FILENAME statement’s PERMISSION= option is that it can be used for setting up file permissions in any SAS installation whether the XCMD system option is enabled or disabled.
SAS macro to create batch script files
Let’s wrap all the above SAS code pieces into a SAS macro that writes batch scripts. Here is the macro code definition:
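%macro write_shell(code);
  /* directory part of the path, up to and including the last slash */
  %let fdir = %substr(&code,1,%sysfunc(findc(&code,/,b)));

  /* create the directory if it does not exist yet */
  options dlcreatedir;
  libname _flib "&fdir";
  libname _flib;

  /* full path without the 4-character ".sas" extension */
  %let core = %substr(&code,1,%eval(%length(&code)-4));

  /* script file: same name as the program, with the .sh extension */
  filename _fout "&core..sh" permission='A::u::rwx,A::g::r-x,A::o::---';

  data _null_;
    file _fout;
    put '#!/bin/sh' //
        'now=$(date +%Y.%m.%d_%H.%M.%S)' /
        "pgmname=""&code""" /
        "logname=""&core._$now.log""" /
        'sas $pgmname -log $logname'
    ;
  run;

  filename _fout;
%mend write_shell;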
The single macro parameter (code) represents the full path name of your SAS program. And here is a macro invocation example:
%write_shell(/sas/code/project1/program1.sas)
The assumption here is that the script file gets created in the same directory as the relevant SAS code and the SAS logs for each of the batch runs. It will be assigned the same name as your SAS program, only with the .sh name extension. As you can see, we do some string parsing to derive the directory name, script file name and SAS log file name from the single macro parameter representing the full path name of your SAS code. As an added bonus, if the specified directory (/sas/code/project1/) does not yet exist, it will be created by this macro. The DLCREATEDIR system option (along with the two subsequent libname statements) is responsible for the directory creation.
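For readers more at home in shell than in the SAS macro language, the same string parsing can be sketched with shell parameter expansion. This is only an illustration of what the macro derives from its parameter; the variable names mirror the macro's, and, like the macro, it assumes the path ends in a four-character extension such as .sas.

```shell
#!/bin/sh
# Sketch: derive directory, script name and log name from one full path.
code="/sas/code/project1/program1.sas"

fdir="${code%/*}/"        # everything up to and including the last slash
core="${code%????}"       # drop the last four characters (".sas")
script="$core.sh"
logname="${core}_$(date +%Y.%m.%d_%H.%M.%S).log"

echo "$fdir"              # prints /sas/code/project1/
echo "$core"              # prints /sas/code/project1/program1
echo "$script"            # prints /sas/code/project1/program1.sh
echo "$logname"           # program1_<date-time stamp>.log in the same directory
```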
If you want to create many script files for your multiple SAS programs, you just invoke the macro as many times. You can even go totally data-driven for mass script file creation.
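For comparison, here is the data-driven idea sketched in plain shell rather than in SAS. It is not the macro above but a stand-in for it, useful only where you do have shell access. The list file programs.txt, the temporary work directory, and the program names are all hypothetical.

```shell
#!/bin/sh
# Sketch: generate one wrapper script per SAS program named in a list file.
# All file and directory names here are made up for illustration.
workdir=$(mktemp -d)
list="$workdir/programs.txt"

# Hypothetical list of SAS programs, one full path per line.
printf '%s\n' "$workdir/program1.sas" "$workdir/program2.sas" > "$list"

while IFS= read -r code; do
    core=${code%.sas}
    script="$core.sh"
    # $code and $core are expanded now, while the script is being written;
    # \$ keeps the remaining lookups literal so they happen at run time.
    cat > "$script" <<EOF
#!/bin/sh

now=\$(date +%Y.%m.%d_%H.%M.%S)
pgmname="$code"
logname="${core}_\$now.log"
sas \$pgmname -log \$logname
EOF
    chmod 750 "$script"
done < "$list"

generated=$(ls "$workdir"/*.sh | wc -l | tr -d ' ')
echo "$generated"                  # prints 2
```

Each generated file has the same layout as the script written by the macro. In a SAS-only setting, the equivalent would be to loop over a data set of program paths and invoke the macro once per row.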
Do you find this useful?
Please let me know in the comments section below if you find this blog post useful. Thank you for reading! I also invite you to share your ideas and experiences on the topic.
29 Comments
..timestamp_programname.log.ok OR ..timestamp_programname.log.error
For some reason the site is removing anything within the less-than and greater-than arrows, which doesn't help the explanation??
That is because HTML treats < and > as HTML tag boundaries. To circumvent that you can use &lt; and &gt; correspondingly; HTML will display them properly as < and >. That is what I did to correct filenames in your prior comment. Or enclose the word containing < and > within <code> . . . </code> tags.
Hi Leonid,
Similar problems spawn similar solutions around the world. I wasn't aware of the permission option. Thanks.
In 2006 I created a similar SAS macro which would a) write a .sh script b) create the log <timestamp>_<programname>.log. and c) email the submitting user the log on completion. In this way the script contains the date context of the program (could be daily, weekly or monthly) and completion status.
Combining this with a 'master' SAS program running continuously every 5 minutes, a SAS developer with little experience can create a fully functional batch schedule with inter-dependencies written completely in SAS code. Who would think that 14 years later it runs multiple organisations' entire SAS batches. I've expanded it to allow SAS GRID script creation when a Grid queue is specified.
Scripting really is something mechanical which should rarely be done by hand.
The log created is either <timestamp>_<programname>.log.ok or <timestamp>_<programname>.log.error, so the log name indicates running, OK, or ERROR.
Thank you, David, for your feedback. You made a good point by embedding 'ok' or 'error' into the log name, which facilitates log inspection by just looking at the directory contents. And I agree, scripting is quite mundane and in many cases should be automated. That's why I wrote this blog post.
Personally, I would rather use the following naming convention: <programname>_<timestamp>_ok.log and <programname>_<timestamp>_error.log to group logs by programname, and also preserve the '.log' file extension.
Thanks Leonid, also for your feedback below on HTML comment workarounds, I thought that it would interpret comments 'as-is' and not as HTML but you learn something new every day. 🙂
Using the OK or ERROR before the .log is useful as it preserves the 'log' type. Because the timestamp I use is always numeric, e.g. 20200821 (for 21aug2020), it sorts in runtime order, then program order, so it's easier to see progress of large batches, but the reverse, as you point out, allows sorting by program name.
Agree. Naming pattern should serve its purpose. In general, solutions should be driven by purpose.
I am looking forward to trying this.
Hi John, I am looking forward to hearing from you after you have tried it out 🙂
Great post. Thanks for the idea.
You are welcome, Mark! I am glad you liked it, and hope you put it to use.
This is great. Thanks for taking the time to share it. I use a lot of scripts to run SAS jobs via Cron. Now I can incorporate the script generating into the SAS Code so everything is documented in one place. Thanks!
Thank you, John, for your feedback. I am really glad you liked it and put it to work.
Understood it. Thanx for helping out Leonid Batkhan.
This topic is very useful.
Other positive benefits of launching SAS processes in batch are "repeatability" and "reliability". I found these particularly important in testing.
Peter, thank you for your feedback. While I agree that batch's "repeatability" and "reliability" are "particularly important in testing", I would add that they are not less important in production.
Great series of posts. I too was unaware of the permission option!
Since I'm not experienced in writing shell scripts, recently I've been playing with using a parent SAS program which uses SYSTASK to spawn child SAS sessions, like : systask command """%sysget(SASROOT)\sas.exe"" -noterminal -sysin ""...\myprogram.sas""" ; A benefit of this is instead of learning how to create a timestamp or conditional logic in the shell script language, I can use SAS macro language. I was inspired by Troy Hughes' book, and many of his papers like http://support.sas.com/resources/papers/proceedings17/0870-2017.pdf.
One use case for this I really like is test scripts. The parent session can have a macro which does some setup of the test environment, then uses systask to call the program being tested, and then confirms that the results of the program match the expectation, log does not have errors etc. Having all of the logic in the parent program as SAS code rather than shell script code helps with portability etc.
Quentin, thank you for your feedback and for sharing your ways of running SAS jobs.
Leonid,
Another very useful, very well-written, and fun-to-read article! I am squirreling this one away for when I need this someday.
Looking forward to your next article(s)!
----Michael
Michael, thank you, I really appreciate your nice words.
Ditto on Chris's comment!!!
Jim, thank you. Now you are in the know too!
Leonid, the FILENAME's PERMISSION= option is the golden nugget of this blog post, at least for me. I didn't know about that one!
Thank you, Chris, for your feedback. You made my day!
DLCREATEDIR is another one, for me at least.
Thanks, Anton, for dropping by and providing your valuable feedback. Actually, I myself learned about the DLCREATEDIR option from ... Chris, see his blog post SAS trick: get the LIBNAME statement to create folders for you.
"In many organizations, IT department policies do not allow enabling the SAS XCMD system option due to cyber-security concerns."
Sorry to disagree. IT departments believe the default SAS options are the secure way of working. There must be SAS guys having reasons to make this the default.
The IT department's attitude: do not think or bother further, just accept those defaults.
Please explain why using the power of SAS would be a cybersecurity issue when all cybersecurity controls on Linux are in place.
The result is SAS is harming SAS by some hayman reasoning
I have used the approach of creating many scripts on several occasions.
It solves the issue of having too much data or too many files pretty easily.
Executing the process can have excellent performance.
It is a way of problem solving that is quite unique when you understand several worlds.
Hi Jaap, thank you for your feedback. I agree with you in that proper Linux security controls should be enough. However, different organizations have their own IT policies in place and "SAS guys" (administrators, installers) must follow those policies. But I hear what you are saying, sometimes we have to deal with people ("SAS guys" or non-SAS) having unreasonable "reasons" and who are in charge. At least FILENAME PERMISSION= option gives us some advantage without having to deal with them.