If you’ve taken one of my SAS classes you may recall “Mark’s 3 rules of programming”, the first of which is “Lazy programmers are GOOD programmers.” One of the things I love about best about SAS programming is the plethora of functions and shortcuts built into the language, all designed to make it easier and quicker for me to accomplish my work - just great for lazy guys like me! SAS variable lists are a prime example - I use them all the time. Arguably my favorite variable list is the name prefix list which allows easy listing of all variable names beginning with the same text pattern. Recently a SAS student, enthused upon discovering the name prefix list, asked “How about variables with names ending with the same characters? Is there a name suffix variable list for that?” I immediately thought “How useful that would be!” but alas, there is no “name suffix list” built into SAS. Sounds like a great suggestion for the annual SASWare Ballot! Of course, if you write a blog series called “Jedi SAS Tricks” you can’t just leave a challenge like that unanswered… Let me show you what I mean! The examples use SASHELP.Iris data. Here’s a small sample:
To print a report containing the Species name and all the Sepal information, we could write the following program:
proc print data=sashelp.iris noobs;
var Species Sepal:;
run; |
SAS interprets “Sepal:” as “all variables with names that begin with ‘Sepal’.” The resulting report is just what I intended, with minimal typing required on my part:
Being fond of SAS Macro, I set myself a goal to be able to write a program like this:
%let DSN=sashelp.iris;
proc print data=&dsn noobs;
var Species %Suffix(&dsn,Length);
run; |
The program would then produce a report like this:
Now, usually I’d jump right into PROC SQL and use the Dictionary tables to extract the list of variables. Something like this would do the trick:
proc sql noprint;
select Name
into :VarList separated by ' '
from dictionary.columns
where libname='SASHELP' and memname='IRIS'
and name like '%Length'
;
quit;
%PUT The variable list contains: &VarList; |
That produces the list I want, but because I want to be able to use this macro “in line” with other SAS code, the macro can’t generate SAS code to obtain the list values. Luckily, I remembered the metadata view SASHELP.VCOLUMN contains the same information. Now, to write code that can read from that view directly in macro, without generating DATA step code. The trick will be in judicious use of %SYSFUNC and the SAS File I/O functions EXIST, OPEN, FETCH, VARNUM, GETVARC and CLOSE. After some experimenting, here’s what I came up with:
%Macro Suffix(DSN,Suffix);
%local dsid varlist rc whr lib ds;
/* Separate Libref from the data set name for WHERE clause*/
%let ds=%QSCAN(%QUPCASE(%SUPERQ(DSN)),2);
%if &ds= %then %do;
%let lib=WORK;
%let ds=%QUPCASE(%SUPERQ(DSN));
%end;
%else %let lib=%qscan(%QUPCASE(%SUPERQ(DSN)),1);
%if not %sysfunc(exist(&lib..&ds)) %then %do;
/* Specified data set does not exist */
%PUT ERROR: Dataset %SUPERQ(DSN) does not exist.;
%PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
%GoTo EndMacro;
%end;
/* Open the dataset and find out how many variable names */
%let dsid=%sysfunc(open(&dsn,i));
%if &dsid=0 %then %do;
%put %sysfunc(sysmsg());
%GoTo EndMacro;
%end;
%let NumVars=%sysfunc(attrn(&dsid,NVARS));
/* Read all the variable names, keep the ones that match the pattern */
%let VarList=;
%do i=1 %to &NumVars;
%let ThisVar=%sysfunc(varname(&dsid,&i));
%if %sysfunc(prxmatch(/\w*%QUPCASE(&Suffix)$/,%QUPCASE(&ThisVar))) %then
%let VarList=&VarList &ThisVar;
%end;
/* Return the variable names to the input stack by resolving VARLIST */
&varlist
%Endmacro:
/* Close SASHELP.VCOLUMN */
%let rc=%sysfunc(close(&dsid));
%mend Suffix; |
That worked JUST as I wanted! Because I thought this was a “keeper”, I added some parameter error checking / documentation to the the macro and saved it with the rest of my personal SAS macros. And, of course, I'll share it with you here on my blog - full code for my final version of the macro follows below. As usual, after compiling the macro you can get syntax help by submitting %suffix(!HELP).
Until the next time, may the SAS be with you!
Mark
%Macro Suffix(DSN,Suffix);
%local dsid varlist rc whr lib ds;
/* Self documentation */
%if (&DSN= and &Suffix= ) %then %let DSN=!HELP;
%if %qupcase(%qsubstr(&DSN,1,5))=!HELP
%then %do;
%PUT NOTE: EndsWith Macro Help ********************************;
%put;
%PUT NOTE- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
%put;
%PUT NOTE- DSN: Name of dataset containing vairable names of interest;
%PUT NOTE- %NRSTR(Default is &SYSLAST);
%put;
%PUT NOTE- Suffix: Text with which the variable names end (required);
%PUT NOTE- Examples: ;
%PUT NOTE- Get names of variables that end with "ght" from sashelp.class: ;
%PUT NOTE- %NRSTR(%%EndsWith%(sashelp.class,ght%));
%put;
%PUT NOTE: End EndsWith Macro Help ****************************;
%return;
%end;
/* Parameter Validation */
%if &suffix= %then %do;
/* Suffix is required */
%PUT ERROR: You must specify a variable name suffix;
%PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
%GoTo EndMacro;
%end;
%if &DSN= %then %let DSN=&SYSLAST;
/* Separate Libref from the data set name for WHERE clause*/
%let ds=%QSCAN(%QUPCASE(%SUPERQ(DSN)),2);
%if &ds= %then %do;
%let lib=WORK;
%let ds=%QUPCASE(%SUPERQ(DSN));
%end;
%else %let lib=%qscan(%QUPCASE(%SUPERQ(DSN)),1);
%if not %sysfunc(exist(&lib..&ds)) %then %do;
/* Specified data set does not exist */
%PUT ERROR: Dataset %SUPERQ(DSN) does not exist.;
%PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
%GoTo EndMacro;
%end;
/* Open the dataset and find out how many variable names */
%let dsid=%sysfunc(open(&dsn,i));
%if &dsid=0 %then %do;
%put %sysfunc(sysmsg());
%GoTo EndMacro;
%end;
%let NumVars=%sysfunc(attrn(&dsid,NVARS));
/* Read all the variable names, keep the ones that match the pattern */
%let VarList=;
%do i=1 %to &NumVars;
%let ThisVar=%sysfunc(varname(&dsid,&i));
%if %sysfunc(prxmatch(/\w*%QUPCASE(&Suffix)$/,%QUPCASE(&ThisVar))) %then
%let VarList=&VarList &ThisVar;
%end;
/* Return the variable names to the input stack by resolving VARLIST */
&varlist
%Endmacro:
/* Close SASHELP.VCOLUMN */
%let rc=%sysfunc(close(&dsid));
%mend Suffix; |
18 Comments
Also, just to let you know, I think the macro doesn't handle underscores properly. For instance:
This will print the values of both "testa" and "test_a" variables, rather than just the desired "test_a" variable.
Humm... yep, you are correct! Well, I may be a SAS Jedi, but it looks like I'm not Yoda yet :-)
The reason for the issues with underscores is my reliance on the LIKE operator in the WHERE clause. I looked at modifying the existing subsetting code, but decided this would be much easier using a PERL regular expression. I also got to thinking about Quentin's comment on alternate methods of getting the variable names, and settled on using a couple more of the SAS File I/O functions to get the names directly from the data set instead of reading the SASHELP.COLUMNS dictionary view. Here is the revised code:
%Macro Suffix(DSN,Suffix);
%local dsid varlist rc whr lib ds;
/* Self documentation */
%if (&DSN= and &Suffix= ) %then %let DSN=!HELP;
%if %qupcase(%qsubstr(&DSN,1,5))=!HELP
%then %do;
%PUT NOTE: EndsWith Macro Help ********************************;
%put;
%PUT NOTE- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
%put;
%PUT NOTE- DSN: Name of dataset containing variable names of interest;
%PUT NOTE- %NRSTR(Default is &SYSLAST);
%put;
%PUT NOTE- Suffix: Text with which the variable names end (required);
%PUT NOTE- Examples: ;
%PUT NOTE- Get names of variables that end with "ght" from sashelp.class: ;
%PUT NOTE- %NRSTR(%%EndsWith%(sashelp.class,ght%));
%put;
%PUT NOTE: End EndsWith Macro Help ****************************;
%return;
%end;
/* Parameter Validation */
%if &suffix= %then %do;
/* Suffix is required */
%PUT ERROR: You must specify a variable name suffix;
%PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
%GoTo EndMacro;
%end;
%if &DSN= %then %let DSN=&SYSLAST;
/* Separate Libref from the data set name for WHERE clause*/
%let ds=%QSCAN(%QUPCASE(%SUPERQ(DSN)),2);
%if &ds= %then %do;
%let lib=WORK;
%let ds=%QUPCASE(%SUPERQ(DSN));
%end;
%else %let lib=%qscan(%QUPCASE(%SUPERQ(DSN)),1);
%if not %sysfunc(exist(&lib..&ds)) %then %do;
/* Specified data set does not exist */
%PUT ERROR: Dataset %SUPERQ(DSN) does not exist.;
%PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
%GoTo EndMacro;
%end;
/* Open the dataset and find out how many variable names */
%let dsid=%sysfunc(open(&dsn,i));
%if &dsid=0 %then %do;
%put %sysfunc(sysmsg());
%GoTo EndMacro;
%end;
%let NumVars=%sysfunc(attrn(&dsid,NVARS));
/* Read all the variable names, keep the ones that match the pattern */
%let VarList=;
%do i=1 %to &NumVars;
%let ThisVar=%sysfunc(varname(&dsid,&i));
%if %sysfunc(prxmatch(/\w*%QUPCASE(&Suffix)$/,%QUPCASE(&ThisVar))) %then
%let VarList=&VarList &ThisVar;
%end;
/* Return the variable names to the input stack by resolving VARLIST */
&varlist
%Endmacro:
/* Close SASHELP.VCOLUMN */
%let rc=%sysfunc(close(&dsid));
%mend Suffix;
It looks like it's still pulling out variables where the desired suffix string is in the middle of the variable name. However, I figured out a tweak you can make. If you change the line
to the following:
then that seems to work. You might need a %QSUBSTR instead of a %SUBSTR, or a %NRQUOTE or %NRBQUOTE somewhere in there, just to make sure you're addressing all possible cases, but it worked as-is with my test data:
My PERL regular expression was missing a "$" to indicate the pattern was to match only at the end of the string. I updated the code in yesterday's post. The resulting macro is much more functional and robust, thanks to your persistence and feedback. Here's the code I used to test the final macro code:
data test ;
_stests = 1;
te_st = 2;
te_st_s = 3;
test_s = ceil(10*ranuni(1));
run;
title "&SYSLAST";
proc print;run;
%let S=_s;
title "&SYSLAST - suffix &s";
proc print data=test ;
var %suffix(test,&s);
run;
%let S=st;
title "&SYSLAST - suffix &s";
proc print data=test ;
var %suffix(test,&s);
run;
title;
Muchas gracias, Gabe!
I'm not as familiar with PERL syntax... do you have a suggestion on a good resource to learn from? Also, more generally, can you comment on why/when to use PERL over another coding option?
PERL regular expressions provide a very powerful and flexible way of describing text pattern in a SAS program. You can directly leverage PERL regular expressions in SAS using the PRX functions. If you deal with a lot of text data, it is well worth the effort it takes to learn. My friend Ron Cody wrote an excellent paper titled An Introduction to Perl Regular Expressions in SAS 9 that can give you a start. I also wrote a SAS User Group workshop called Untangling the World Wide Web (download the PDF here) and pages 23-50 have a relatively simple and practical introduction to using PERL regular expressions in SAS.
If you are trying to detect a text pattern instead of just a text string, using PERL regular expressions can greatly simplify your code.
if the intention was for it to be able to be run inline within a data step, why include a macro variable for the dataset name (DSN)? Why not just automatically determine what dataset currently is being operated on?
Gabe,
The macro is actually writing part of your data step code for you - the data step itself hasn't even compiled yet. The macro produces the list of variable names and "types them in" at the appropriate spot in the data step code. Only after the entire data step is written and then compiled will the data step actually start to execute. So during the time the macro is executing to produce the list, there is actually no data set currently being operated on. Hence the need to tell the macro which data set to process.
Thanks for the reply. I guess now my question is more in general... if there's a way using any of the %sysfunc functions and arguments to determine what dataset is currently being compiled?
Gabe, there is no way I know of to detect information from a data step in the middle of compilation. You can use macro language techniques to write the data step - to add text as it is being sent to the data step compiler. And the data step can send information back to the macro symbol table during execution using data step functions. But until the data step is fully compiled, there's no communication path open that goes back to to the macro processor or to the input stack.
"Efficiency is intelligent laziness" - David Dunham
Great quote. Im too lazy to make one up myself, so I think I'll steal this one :-)
Nice macro, thanks for posting! Very inspiring. A few thoughts:
1. If you don't mind passing more work onto the user, might be nice to make this a more general tool by having the second parameter be any logical condition. So the where statement becomes something like:
%let whr=(WHERE=(LIBNAME="&lib" AND MEMNAME="&ds" and &condition)) ;
and user can call with whatever they want for the condition, i.e. specify prefix, suffix, length, etc
%put %suffix(sashelp.class,name like '%ght');
%put %suffix(sashelp.class,upcase(name) like '%A%');
%put %suffix(sashelp.class,length(name)<4);
2. It's possible to do similarly using SCL functions (called by %sysfunc) to open a dataset and review its variables. I think often the SCL route might be faster since dictionary.columns can get VERY big.
3. At least in my browser, I see "&" in the code instead of &. Could be the blog software doing that conversion for you...
Thanks again!
Thanks for the ideas for extending the macro. It's always a conundrum balancing further generalization against clarity/ease of use... As for the dictionary.coumns, it CAN get big, but WHERE subsetting is pretty efficient. And I'm not sure the SCL function calls would work for those who don't have SAS/AF licensed.
And even though I inserted "pre" tags, WordPress seems to have replaced my ampersands with "amp" in the code - sigh. I'll figure it out soon, I hope :-)
Mark
Great tip ... I didn't know I could do that with a Proc Print VAR list.
Your macro error messages are much more complete than ones I write. Hmm .. maybe I could expand beyond, "You did something wrong, start over." ;-)
LOL - better that then the typical "you submit your code and nothing happens" ....
A good and lazy programmer are you, young Jedi.
In the (almost) words of Paul Simon - "Still lazy, after all these years..."