Jedi SAS Tricks - Variable Lists by Text Pattern

15

Image of a Jedi overlaid with "Jedi SAS Tricks: Variable Lists by Pattern"I hate hard-coding variable lists in my SAS code. I love the special variable lists SAS provides, like name:, _numeric_, etc. But back in 2011, I wrote a post titled Jedi SAS Tricks - Building a Name Suffix variable list lamenting the lack of a special variable list in base SAS which would produce the variables ending with a particular text string. And, of course, I built a custom macro to do the deed for me.

My fellow SAS Jedi, Bruno Mueller of SAS Switzerland, just sent me what he modestly described as "a new version of the suffix macro". But it really was an awesome new approach that will completely replace the suffix macro in my personal autocall library! Bruno's varListPattern macro allows you to create variable lists for any SAS data set in-line with your code, using pattern search to define the list. I added a bit of code to make the macro self-documenting (use %varListMacro(!HELP) to produce syntax help in the SAS log). Because pattern matching can be a little tricky, I added a parameter that allows running the macro in "debug" mode for testing. In DEBUG mode the macro writes the variable list and some helpful debugging info to the SAS log instead of generating the list in-line with your SAS code.

To use the macro, first download the zip file for this post and unzip the two files it contains. Save the macro definition program varListPattern.sas to your autocall path (or, for one-time use or testing, just submit the program in your SAS session). The other file, TestingvarListPattern.sas, provides some examples of how to use the macro. In one of the examples, we write to the SAS log a list of numeric variables in SASHELP.MON1001 where the name has '9' as the second from the last character:

%varListPattern(sashelp.mon1001,*9?,N,D)

And the result:
NOTE: (VARLISTPATTERN) DSN=sashelp.mon1001 VARTYPE=N PATTERN=*9? NVARS=407 PRXPATTERN=.*9.
      VARLIST=S0390 S0391 S0393 S0394 S0395 S0396 S0397 S0398 S0490 S0491 S0492 S0493
S0495 S0496 S0497 S0498 S0499 S0590 S0591 S0592 S0593 S0594 S0595 S0596 S0597 S0598 S0599
S0690 S0692 S0694 S0695 S0697 S0698 S0699 S0790 S0791 S0792 S0793 S0794 S0795 S0796 S0798
S0799 S0897 S0898 S0899 S0990 S0991 S0992 S0993 S0994 S0995 S0997 S0998 S0999

In another example, we use the macro in-line within PROC PRINT to print the data set SASHELP.CLASS - but only those variables with 'E' as the second letter of the name!

title 'SASHELP.CLASS variables with E as the second letter';
proc print data=sashelp.class (obs=5);
   var %varListPattern(sashelp.class,?e*);
run;
title;

And the result:
Image of PROC PRINT output showing the variables Sex, Height and Weight were selected

Now, that is one handy little macro! Thanks, Bruno, for sharing it with me - and now with everyone who reads this post :-)

Until next time, may the SAS be with you!
Mark

Tags
Share

About Author

SAS Jedi

Principal Technical Training Consultant

Mark Jordan (a.k.a. SAS Jedi) grew up in northeast Brazil as the son of Baptist missionaries. After 20 years as a US Navy submariner pursuing his passion for programming as a hobby, in 1994 he retired, turned his hobby into a dream job, and has been a SAS programmer ever since. Mark writes and teaches a broad spectrum of SAS programming classes, and his book, "Mastering the SAS® DS2 Procedure: Advanced Data Wrangling Techniques" is in its second edition. When he isn’t writing, teaching, or posting “Jedi SAS Tricks”, Mark enjoys playing with his grand and great-grandchildren, hanging out at the beach, and reading science fiction novels. His secret obsession is flying toys – kites, rockets, drones – and though he usually tries to convince Lori that they are for the grandkids, she isn't buying it. Mark lives in historic Williamsburg, VA with his wife, Lori, and Stella, their cat. To connect with Mark, check out his SAS Press Author page, follow him on Twitter @SASJedi or connect on Facebook or LinkedIn.

Related Posts

15 Comments

  1. Very helpful. %VARLIST() macro functions like this are very useful. It bugs me a little that you can't specify parenthetical options on the dataset name, e.g. %put %varListPattern(sashelp.class(keep=_numeric_),?e*). That attempt showed two surprises for me: 1. %sysfunc(exist(sashelp.class(keep=_numeric_))) returns 0; 2. %sysfunc( varname(&dsid, &i) ) doesn't honor keep/drop option.

    • SAS Jedi

      Thanks for the feedback, Quentin. Because _NUMERIC_ is already a SAS supported variable list, you don't need the %varListPattern macro to generate that list for you. There is a parameter which allows you to select only numeric or character variables with this macro, though. And if you already have a list of variables names that you want (as you would use in a KEEP+ or DROP= option) you shouldn't need this macro - it was designed to generate that list for you...

      All the best,
      Mark

      • Thanks Mark, bad example on my part. Better example would be about a DROP= option, e.g. %put %varListPattern(sashelp.class(drop=Name),*a*);

        Generally for this sort of utility macro, I will end up calling it like %varlistPattern(&data), where &data was passed by some user. It's nice for such utilities to allow any valid dataset name, allowing parenthetical options. That is if I write a macro %DoSomething(data=), which happens to call %varlistPattern(&data) inside of it, I want my users to be able to pass any valid data set name when they call %DoSomething. So for example if they want to test, they could do %DoSomething(data=mydata(obs=100)), without it breaking.

        • SAS Jedi

          I considered allowing dataset options on the DSN parameter, but eventually vetoed the idea. The only data set options germane to list making would be DROP= and KEEP=. I need to use a DO loop to retrieve the variable names. To know how many times to iterate the loop I read the number of variables (NVARS) from the dataset metadata using the ATTRN function. If I subsequently open the dataset with a DROP= or KEEP= dataset option applied, fewer variables are actually available to the process than were reported by ATTRN, and the DO loop continues to try to read additional variables after the last VARNAME has been accessed. This produces a system generated WARNING in the log:

          WARNING: Argument 2 to function VARNAME referenced by the %SYSFUNC or %QSYSFUNC macro function is out of range.

          To suppress the warning, I'd have to generate actual SAS code (an OPTIONS statement) which would preclude using the macro inline with other code. While I could pose some (very few) use cases for a macro designed to make variable lists which would not want the all of the variables in the data set regarded as input, we already had the ability to exclude numeric or character values, and this seemed like a reasonable, useful tradeoff. And there you have my reasoning for the design of this macro!
          All the best,
          Mark

          Regards,
          Mark

          • Thanks again. Yes, it's all tradeoffs. That VARNAME() function limitation is one I hit when I tried to do this before. It still irks me: ) A few years back Soren Lassen posted an interesting approach on overcoming this to SAS-L, which I implemented a version of in this post. https://listserv.uga.edu/cgi-bin/wa?A2=SAS-L;8ce2cb2d.1012c . The goal of that macro is to expand a list of variables (any kind of valid variable lists. i.e. -- or - or : or whatever), into a space delimited list of the actual variables. It's all tradeoffs. My macro will still break if rename= is passed. Rick Langston has a lovely DOSUBL %ExpandVarList in https://support.sas.com/resources/papers/proceedings13/032-2013.pdf. The regex filtering of %VarListPattern is nice. I suppose Rick's macro could be expanded to support parenthetical options, and could have a WHERE clause in the PROC SQL with a LIKE for the subsetting. Now that DOSUBL is around, I'm going to try to force myself to start using it rather than the traditional SCL functions in my function-style macros, and see how far I can get. Thanks again for all the posts, and for tolerating the discussion. : )

  2. SAS Jedi

    Leonid Batkhan noted a problem with the test for the null varlist - a %do group was required, and I left out the %do and %end:
    %if %sysevalf( %superq(Varlist) = , boolean ) = 1 %then
    %do;
    %put WARNING: (&sysmacroname) No variables matching your criteria were found in %superq(dsn).;
    %put WARNING- &=dsn &=varType &=pattern &=nvars &=prxPattern;
    %put;
    %end;

    I have fixed the program, repackaged it in a new ZIP file and linked to the new version in the post. If you downloaded before 13:40 Easter Time, you might want to grab the code again...
    Sorry for the inconvenience!
    Mark

      • I am not sure, but I think there is a "%PUT" missing in the second to last line of the code:

        %else %PUT %superq(varlist);

        • SAS Jedi

          Actually, no. See the note above the line to help expalain what's going on:
          /* Otherwise, in PRODUCTION mode - produce the list in-line with the SAS code */
          %else %superq(varlist);

          Because %superq(varlist) executes without being part of a %let or %put statement, the text it produces is pushed on top of the input stack. So in effect, the %superq(varlist) text "disappears" and is replaced with the generated variable list in your code.

          Check it out with this test code:

          data new1;
          set sashelp.class;
          run;
          proc contents data=new1;
          run;

          /* Let's drop of all variables with 'e' as the second letter */
          data new2 (drop=%VARLISTPATTERN(sashelp.class,?e*));
          set sashelp.class;
          run;
          proc contents data=new2;
          run;

Back to Top