I hate hard-coding variable lists in my SAS code. I love the special variable lists SAS provides, like name:, _numeric_, etc. But back in 2011, I wrote a post titled Jedi SAS Tricks - Building a Name Suffix variable list lamenting the lack of a special variable list in base SAS which would produce the variables ending with a particular text string. And, of course, I built a custom macro to do the deed for me.
My fellow SAS Jedi, Bruno Mueller of SAS Switzerland, just sent me what he modestly described as "a new version of the suffix macro". But it really was an awesome new approach that will completely replace the suffix macro in my personal autocall library! Bruno's varListPattern macro allows you to create variable lists for any SAS data set in-line with your code, using pattern search to define the list. I added a bit of code to make the macro self-documenting (use %varListMacro(!HELP) to produce syntax help in the SAS log). Because pattern matching can be a little tricky, I added a parameter that allows running the macro in "debug" mode for testing. In DEBUG mode the macro writes the variable list and some helpful debugging info to the SAS log instead of generating the list in-line with your SAS code.
To use the macro, first download the zip file for this post and unzip the two files it contains. Save the macro definition program varListPattern.sas to your autocall path (or, for one-time use or testing, just submit the program in your SAS session). The other file, TestingvarListPattern.sas, provides some examples of how to use the macro. In one of the examples, we write to the SAS log a list of numeric variables in SASHELP.MON1001 where the name has '9' as the second from the last character:
%varListPattern(sashelp.mon1001,*9?,N,D)
And the result:
NOTE: (VARLISTPATTERN) DSN=sashelp.mon1001 VARTYPE=N PATTERN=*9? NVARS=407 PRXPATTERN=.*9. VARLIST=S0390 S0391 S0393 S0394 S0395 S0396 S0397 S0398 S0490 S0491 S0492 S0493 S0495 S0496 S0497 S0498 S0499 S0590 S0591 S0592 S0593 S0594 S0595 S0596 S0597 S0598 S0599 S0690 S0692 S0694 S0695 S0697 S0698 S0699 S0790 S0791 S0792 S0793 S0794 S0795 S0796 S0798 S0799 S0897 S0898 S0899 S0990 S0991 S0992 S0993 S0994 S0995 S0997 S0998 S0999
In another example, we use the macro in-line within PROC PRINT to print the data set SASHELP.CLASS - but only those variables with 'E' as the second letter of the name!
title 'SASHELP.CLASS variables with E as the second letter'; proc print data=sashelp.class (obs=5); var %varListPattern(sashelp.class,?e*); run; title; |
Now, that is one handy little macro! Thanks, Bruno, for sharing it with me - and now with everyone who reads this post :-)
Until next time, may the SAS be with you!
Mark
15 Comments
Very helpful. %VARLIST() macro functions like this are very useful. It bugs me a little that you can't specify parenthetical options on the dataset name, e.g. %put %varListPattern(sashelp.class(keep=_numeric_),?e*). That attempt showed two surprises for me: 1. %sysfunc(exist(sashelp.class(keep=_numeric_))) returns 0; 2. %sysfunc( varname(&dsid, &i) ) doesn't honor keep/drop option.
Thanks for the feedback, Quentin. Because _NUMERIC_ is already a SAS supported variable list, you don't need the %varListPattern macro to generate that list for you. There is a parameter which allows you to select only numeric or character variables with this macro, though. And if you already have a list of variables names that you want (as you would use in a KEEP+ or DROP= option) you shouldn't need this macro - it was designed to generate that list for you...
All the best,
Mark
Thanks Mark, bad example on my part. Better example would be about a DROP= option, e.g. %put %varListPattern(sashelp.class(drop=Name),*a*);
Generally for this sort of utility macro, I will end up calling it like %varlistPattern(&data), where &data was passed by some user. It's nice for such utilities to allow any valid dataset name, allowing parenthetical options. That is if I write a macro %DoSomething(data=), which happens to call %varlistPattern(&data) inside of it, I want my users to be able to pass any valid data set name when they call %DoSomething. So for example if they want to test, they could do %DoSomething(data=mydata(obs=100)), without it breaking.
I considered allowing dataset options on the DSN parameter, but eventually vetoed the idea. The only data set options germane to list making would be DROP= and KEEP=. I need to use a DO loop to retrieve the variable names. To know how many times to iterate the loop I read the number of variables (NVARS) from the dataset metadata using the ATTRN function. If I subsequently open the dataset with a DROP= or KEEP= dataset option applied, fewer variables are actually available to the process than were reported by ATTRN, and the DO loop continues to try to read additional variables after the last VARNAME has been accessed. This produces a system generated WARNING in the log:
WARNING: Argument 2 to function VARNAME referenced by the %SYSFUNC or %QSYSFUNC macro function is out of range.
To suppress the warning, I'd have to generate actual SAS code (an OPTIONS statement) which would preclude using the macro inline with other code. While I could pose some (very few) use cases for a macro designed to make variable lists which would not want the all of the variables in the data set regarded as input, we already had the ability to exclude numeric or character values, and this seemed like a reasonable, useful tradeoff. And there you have my reasoning for the design of this macro!
All the best,
Mark
Regards,
Mark
Thanks again. Yes, it's all tradeoffs. That VARNAME() function limitation is one I hit when I tried to do this before. It still irks me: ) A few years back Soren Lassen posted an interesting approach on overcoming this to SAS-L, which I implemented a version of in this post. https://listserv.uga.edu/cgi-bin/wa?A2=SAS-L;8ce2cb2d.1012c . The goal of that macro is to expand a list of variables (any kind of valid variable lists. i.e. -- or - or : or whatever), into a space delimited list of the actual variables. It's all tradeoffs. My macro will still break if rename= is passed. Rick Langston has a lovely DOSUBL %ExpandVarList in https://support.sas.com/resources/papers/proceedings13/032-2013.pdf. The regex filtering of %VarListPattern is nice. I suppose Rick's macro could be expanded to support parenthetical options, and could have a WHERE clause in the PROC SQL with a LIKE for the subsetting. Now that DOSUBL is around, I'm going to try to force myself to start using it rather than the traditional SCL functions in my function-style macros, and see how far I can get. Thanks again for all the posts, and for tolerating the discussion. : )
As always, another great post with useful tips and code!
Thank you, Kim :-)
Mark
Leonid beat me to it... Thanks, Mark.
It sure is good that I have the whole SAS Jedi Council watching my back out there!
Thanks,
Mark
Leonid Batkhan noted a problem with the test for the null varlist - a %do group was required, and I left out the %do and %end:
%if %sysevalf( %superq(Varlist) = , boolean ) = 1 %then
%do;
%put WARNING: (&sysmacroname) No variables matching your criteria were found in %superq(dsn).;
%put WARNING- &=dsn &=varType &=pattern &=nvars &=prxPattern;
%put;
%end;
I have fixed the program, repackaged it in a new ZIP file and linked to the new version in the post. If you downloaded before 13:40 Easter Time, you might want to grab the code again...
Sorry for the inconvenience!
Mark
This is awesome - going right into use! Thanks for posting.
Woot! You just made my day, Louise :-)
Mark
Please grab the updated code from http://blogs.sas.com/content/sastraining/files/2017/08/Jedi_SAS_Tricks_Variable_Lists_By_Pattern-1.zip
I am not sure, but I think there is a "%PUT" missing in the second to last line of the code:
%else %PUT %superq(varlist);
Actually, no. See the note above the line to help expalain what's going on:
/* Otherwise, in PRODUCTION mode - produce the list in-line with the SAS code */
%else %superq(varlist);
Because %superq(varlist) executes without being part of a %let or %put statement, the text it produces is pushed on top of the input stack. So in effect, the %superq(varlist) text "disappears" and is replaced with the generated variable list in your code.
Check it out with this test code:
data new1;
set sashelp.class;
run;
proc contents data=new1;
run;
/* Let's drop of all variables with 'e' as the second letter */
data new2 (drop=%VARLISTPATTERN(sashelp.class,?e*));
set sashelp.class;
run;
proc contents data=new2;
run;