Jedi SAS Tricks: Building a Name Suffix variable list

18

If you’ve taken one of my SAS classes you may recall “Mark’s 3 rules of programming”, the first of which is “Lazy programmers are GOOD programmers.”  One of the things I love about best about SAS programming is the plethora of functions and shortcuts built into the language, all designed to make it easier and quicker for me to accomplish my work - just great for lazy guys like me! SAS variable lists are a prime example - I use them all the time.  Arguably my favorite variable list is the name prefix list which allows easy listing of all variable names beginning with the same text pattern.  Recently a SAS student, enthused upon discovering the name prefix list, asked “How about variables with names ending with the same characters?  Is there a name suffix variable list for that?”  I immediately thought  “How useful that would be!” but alas, there is no “name suffix list” built into SAS.  Sounds like a great suggestion for the annual SASWare Ballot!  Of course, if you write a blog series called “Jedi SAS Tricks” you can’t just leave a challenge like that unanswered… Let me show you what I mean!  The examples use SASHELP.Iris data.  Here’s a small sample:
SASHELP.Iris Data Set

To print a report containing the Species name and all the Sepal information, we could write the following program:

proc print data=sashelp.iris noobs;
   var Species Sepal:;
run;

SAS interprets “Sepal:” as “all variables with names that begin with ‘Sepal’.” The resulting report is just what I intended, with minimal typing required on my part:report using Name Prefix varialbe list

Being fond of SAS Macro, I set myself a goal to be able to write a program like this:
%let DSN=sashelp.iris;
proc print data=&dsn noobs;
   var Species %Suffix(&dsn,Length);
run;
The program would then produce a report like this:
report using %Suffix macro to produce the variable list.
Now, usually I’d jump right into PROC SQL and use the Dictionary tables to extract the list of variables.  Something like this would do the trick:
proc sql noprint;
select Name
   into :VarList separated by ' '
   from dictionary.columns
   where libname='SASHELP' and memname='IRIS'
     and name like '%Length'
  ;
quit;
%PUT The variable list contains: &VarList;
That produces the list I want, but because I want to be able to use this macro “in line” with other SAS code, the macro can’t generate SAS code to obtain the list values.   Luckily, I remembered the metadata view SASHELP.VCOLUMN contains the same information.  Now, to write code that can read from that view directly in macro, without generating DATA step code.  The trick will be in judicious use of %SYSFUNC and the SAS File I/O functions EXIST, OPEN, FETCH, VARNUM, GETVARC and CLOSE.  After some experimenting, here’s what I came up with:
%Macro Suffix(DSN,Suffix);
   %local dsid varlist rc whr lib ds;
   /* Separate Libref from the data set name for WHERE clause*/
   %let ds=%QSCAN(%QUPCASE(%SUPERQ(DSN)),2);
   %if &ds= %then %do;
      %let lib=WORK;
      %let ds=%QUPCASE(%SUPERQ(DSN));
   %end;
   %else %let lib=%qscan(%QUPCASE(%SUPERQ(DSN)),1);
   %if not %sysfunc(exist(&lib..&ds)) %then %do;
      /* Specified data set does not exist */
      %PUT ERROR: Dataset %SUPERQ(DSN) does not exist.;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
   %end;
   /* Open the dataset and find out how many variable names */
   %let dsid=%sysfunc(open(&dsn,i)); 
   %if &dsid=0 %then %do;
      %put %sysfunc(sysmsg()); 
      %GoTo EndMacro;
   %end;
   %let NumVars=%sysfunc(attrn(&dsid,NVARS));
   /* Read all the variable names, keep the ones that match the pattern */
   %let VarList=;
   %do i=1 %to &NumVars;
      %let ThisVar=%sysfunc(varname(&dsid,&i));
      %if %sysfunc(prxmatch(/\w*%QUPCASE(&Suffix)$/,%QUPCASE(&ThisVar))) %then 
         %let VarList=&VarList &ThisVar;
   %end;
   /* Return the variable names to the input stack by resolving VARLIST */
   &varlist
%Endmacro:
   /* Close SASHELP.VCOLUMN */
   %let rc=%sysfunc(close(&dsid)); 
%mend Suffix;
That worked JUST as I wanted!  Because I thought this was a “keeper”, I added some parameter error checking / documentation to the the macro and saved it with the rest of my personal SAS macros.  And, of course, I'll share it with you here on my blog - full code for my final version of the macro follows below.  As usual, after compiling the macro you can get syntax help by submitting %suffix(!HELP).
Until the next time, may the SAS be with you!
Mark

%Macro Suffix(DSN,Suffix);
   %local dsid varlist rc whr lib ds;
   /* Self documentation */
   %if (&DSN= and &Suffix= ) %then %let DSN=!HELP;
   %if %qupcase(%qsubstr(&DSN,1,5))=!HELP 
   %then %do;
      %PUT NOTE: EndsWith Macro Help ********************************;
   %put;
   %PUT NOTE- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
   %put;
   %PUT NOTE- DSN: Name of dataset containing vairable names of interest;
   %PUT NOTE-      %NRSTR(Default is &SYSLAST);
   %put;
   %PUT NOTE- Suffix: Text with which the variable names end (required);
   %PUT NOTE- Examples: ;
   %PUT NOTE- Get names of variables that end with "ght" from sashelp.class: ;
   %PUT NOTE-    %NRSTR(%%EndsWith%(sashelp.class,ght%));
   %put;
   %PUT NOTE: End EndsWith Macro Help ****************************;
   %return;
   %end;
   /* Parameter Validation */
   %if &suffix= %then %do;
      /* Suffix is required */
      %PUT ERROR: You must specify a variable name suffix;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
   %end;
   %if &DSN= %then %let DSN=&SYSLAST;
   /* Separate Libref from the data set name for WHERE clause*/
   %let ds=%QSCAN(%QUPCASE(%SUPERQ(DSN)),2);
   %if &ds= %then %do;
      %let lib=WORK;
      %let ds=%QUPCASE(%SUPERQ(DSN));
   %end;
   %else %let lib=%qscan(%QUPCASE(%SUPERQ(DSN)),1);
   %if not %sysfunc(exist(&lib..&ds)) %then %do;
      /* Specified data set does not exist */
      %PUT ERROR: Dataset %SUPERQ(DSN) does not exist.;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
   %end;
   /* Open the dataset and find out how many variable names */
   %let dsid=%sysfunc(open(&dsn,i)); 
   %if &dsid=0 %then %do;
      %put %sysfunc(sysmsg()); 
      %GoTo EndMacro;
   %end;
   %let NumVars=%sysfunc(attrn(&dsid,NVARS));
   /* Read all the variable names, keep the ones that match the pattern */
   %let VarList=;
   %do i=1 %to &NumVars;
      %let ThisVar=%sysfunc(varname(&dsid,&i));
      %if %sysfunc(prxmatch(/\w*%QUPCASE(&Suffix)$/,%QUPCASE(&ThisVar))) %then 
         %let VarList=&VarList &ThisVar;
   %end;
   /* Return the variable names to the input stack by resolving VARLIST */
   &varlist
%Endmacro:
   /* Close SASHELP.VCOLUMN */
   %let rc=%sysfunc(close(&dsid)); 
%mend Suffix;
Share

About Author

SAS Jedi

Principal Technical Training Consultant

Mark Jordan (a.k.a. SAS Jedi) grew up in northeast Brazil as the son of Baptist missionaries. After 20 years as a US Navy submariner pursuing his passion for programming as a hobby, in 1994 he retired, turned his hobby into a dream job, and has been a SAS programmer ever since. Mark writes and teaches a broad spectrum of SAS programming classes, and his book, "Mastering the SAS® DS2 Procedure: Advanced Data Wrangling Techniques" is in its second edition. When he isn’t writing, teaching, or posting “Jedi SAS Tricks”, Mark enjoys playing with his grand and great-grandchildren, hanging out at the beach, and reading science fiction novels. His secret obsession is flying toys – kites, rockets, drones – and though he usually tries to convince Lori that they are for the grandkids, she isn't buying it. Mark lives in historic Williamsburg, VA with his wife, Lori, and Stella, their cat. To connect with Mark, check out his SAS Press Author page, follow him on Twitter @SASJedi or connect on Facebook or LinkedIn.

Related Posts

18 Comments

  1. Also, just to let you know, I think the macro doesn't handle underscores properly. For instance:

    data test ;
    do testa = 1 to 5;
    do testb = 1 to testa;
    test_a = ceil(10*ranuni(1));
    output;
    end;
    end;
    run;

    proc print data=test ;
    var %suffix(test,_a);
    run;

    This will print the values of both "testa" and "test_a" variables, rather than just the desired "test_a" variable.

    • SAS Jedi

      Humm... yep, you are correct! Well, I may be a SAS Jedi, but it looks like I'm not Yoda yet :-)

      The reason for the issues with underscores is my reliance on the LIKE operator in the WHERE clause. I looked at modifying the existing subsetting code, but decided this would be much easier using a PERL regular expression. I also got to thinking about Quentin's comment on alternate methods of getting the variable names, and settled on using a couple more of the SAS File I/O functions to get the names directly from the data set instead of reading the SASHELP.COLUMNS dictionary view. Here is the revised code:

      %Macro Suffix(DSN,Suffix);
      %local dsid varlist rc whr lib ds;
      /* Self documentation */
      %if (&DSN= and &Suffix= ) %then %let DSN=!HELP;
      %if %qupcase(%qsubstr(&DSN,1,5))=!HELP
      %then %do;
      %PUT NOTE: EndsWith Macro Help ********************************;
      %put;
      %PUT NOTE- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %put;
      %PUT NOTE- DSN: Name of dataset containing variable names of interest;
      %PUT NOTE- %NRSTR(Default is &SYSLAST);
      %put;
      %PUT NOTE- Suffix: Text with which the variable names end (required);
      %PUT NOTE- Examples: ;
      %PUT NOTE- Get names of variables that end with "ght" from sashelp.class: ;
      %PUT NOTE- %NRSTR(%%EndsWith%(sashelp.class,ght%));
      %put;
      %PUT NOTE: End EndsWith Macro Help ****************************;
      %return;
      %end;
      /* Parameter Validation */
      %if &suffix= %then %do;
      /* Suffix is required */
      %PUT ERROR: You must specify a variable name suffix;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
      %end;
      %if &DSN= %then %let DSN=&SYSLAST;
      /* Separate Libref from the data set name for WHERE clause*/
      %let ds=%QSCAN(%QUPCASE(%SUPERQ(DSN)),2);
      %if &ds= %then %do;
      %let lib=WORK;
      %let ds=%QUPCASE(%SUPERQ(DSN));
      %end;
      %else %let lib=%qscan(%QUPCASE(%SUPERQ(DSN)),1);
      %if not %sysfunc(exist(&lib..&ds)) %then %do;
      /* Specified data set does not exist */
      %PUT ERROR: Dataset %SUPERQ(DSN) does not exist.;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
      %end;
      /* Open the dataset and find out how many variable names */
      %let dsid=%sysfunc(open(&dsn,i));
      %if &dsid=0 %then %do;
      %put %sysfunc(sysmsg());
      %GoTo EndMacro;
      %end;
      %let NumVars=%sysfunc(attrn(&dsid,NVARS));
      /* Read all the variable names, keep the ones that match the pattern */
      %let VarList=;
      %do i=1 %to &NumVars;
      %let ThisVar=%sysfunc(varname(&dsid,&i));
      %if %sysfunc(prxmatch(/\w*%QUPCASE(&Suffix)$/,%QUPCASE(&ThisVar))) %then
      %let VarList=&VarList &ThisVar;
      %end;
      /* Return the variable names to the input stack by resolving VARLIST */
      &varlist
      %Endmacro:
      /* Close SASHELP.VCOLUMN */
      %let rc=%sysfunc(close(&dsid));
      %mend Suffix;

      • It looks like it's still pulling out variables where the desired suffix string is in the middle of the variable name. However, I figured out a tweak you can make. If you change the line

        %if %sysfunc(prxmatch(/\w*%QUPCASE(&Suffix)/,%QUPCASE(&ThisVar))) %then

        to the following:

        %if "%SUBSTR(%QUPCASE(&ThisVar),(%LENGTH(&ThisVar)-%LENGTH(&Suffix)+1))" = "%QUPCASE(&Suffix)" %then

        then that seems to work. You might need a %QSUBSTR instead of a %SUBSTR, or a %NRQUOTE or %NRBQUOTE somewhere in there, just to make sure you're addressing all possible cases, but it worked as-is with my test data:

        data test ;
        do _stests = 1 to 10;
        do te_st = 1 to _stests;
        do te_st_s = 1 to te_st;
        test_s = ceil(10*ranuni(1));
        output;
        end;
        end;
        end;
        run;

        proc print data=test ;
        var %suffix(test,_s);
        run;

        • SAS Jedi

          My PERL regular expression was missing a "$" to indicate the pattern was to match only at the end of the string. I updated the code in yesterday's post. The resulting macro is much more functional and robust, thanks to your persistence and feedback. Here's the code I used to test the final macro code:
          data test ;
          _stests = 1;
          te_st = 2;
          te_st_s = 3;
          test_s = ceil(10*ranuni(1));
          run;
          title "&SYSLAST";
          proc print;run;

          %let S=_s;
          title "&SYSLAST - suffix &s";
          proc print data=test ;
          var %suffix(test,&s);
          run;

          %let S=st;
          title "&SYSLAST - suffix &s";
          proc print data=test ;
          var %suffix(test,&s);
          run;
          title;

          Muchas gracias, Gabe!

          • I'm not as familiar with PERL syntax... do you have a suggestion on a good resource to learn from? Also, more generally, can you comment on why/when to use PERL over another coding option?

          • SAS Jedi

            PERL regular expressions provide a very powerful and flexible way of describing text pattern in a SAS program. You can directly leverage PERL regular expressions in SAS using the PRX functions. If you deal with a lot of text data, it is well worth the effort it takes to learn. My friend Ron Cody wrote an excellent paper titled An Introduction to Perl Regular Expressions in SAS 9 that can give you a start. I also wrote a SAS User Group workshop called Untangling the World Wide Web (download the PDF here) and pages 23-50 have a relatively simple and practical introduction to using PERL regular expressions in SAS.

            If you are trying to detect a text pattern instead of just a text string, using PERL regular expressions can greatly simplify your code.

  2. if the intention was for it to be able to be run inline within a data step, why include a macro variable for the dataset name (DSN)? Why not just automatically determine what dataset currently is being operated on?

    • SAS Jedi

      Gabe,
      The macro is actually writing part of your data step code for you - the data step itself hasn't even compiled yet. The macro produces the list of variable names and "types them in" at the appropriate spot in the data step code. Only after the entire data step is written and then compiled will the data step actually start to execute. So during the time the macro is executing to produce the list, there is actually no data set currently being operated on. Hence the need to tell the macro which data set to process.

      • Thanks for the reply. I guess now my question is more in general... if there's a way using any of the %sysfunc functions and arguments to determine what dataset is currently being compiled?

        • SAS Jedi

          Gabe, there is no way I know of to detect information from a data step in the middle of compilation. You can use macro language techniques to write the data step - to add text as it is being sent to the data step compiler. And the data step can send information back to the macro symbol table during execution using data step functions. But until the data step is fully compiled, there's no communication path open that goes back to to the macro processor or to the input stack.

  3. Quentin McMullen on

    Nice macro, thanks for posting! Very inspiring. A few thoughts:
    1. If you don't mind passing more work onto the user, might be nice to make this a more general tool by having the second parameter be any logical condition. So the where statement becomes something like:
    %let whr=(WHERE=(LIBNAME="&lib" AND MEMNAME="&ds" and &condition)) ;
    and user can call with whatever they want for the condition, i.e. specify prefix, suffix, length, etc
    %put %suffix(sashelp.class,name like '%ght');
    %put %suffix(sashelp.class,upcase(name) like '%A%');
    %put %suffix(sashelp.class,length(name)<4);

    2. It's possible to do similarly using SCL functions (called by %sysfunc) to open a dataset and review its variables. I think often the SCL route might be faster since dictionary.columns can get VERY big.

    3. At least in my browser, I see "&" in the code instead of &. Could be the blog software doing that conversion for you...

    Thanks again!

    • SAS Jedi

      Thanks for the ideas for extending the macro. It's always a conundrum balancing further generalization against clarity/ease of use... As for the dictionary.coumns, it CAN get big, but WHERE subsetting is pretty efficient. And I'm not sure the SCL function calls would work for those who don't have SAS/AF licensed.

      And even though I inserted "pre" tags, WordPress seems to have replaced my ampersands with "amp" in the code - sigh. I'll figure it out soon, I hope :-)
      Mark

  4. Great tip ... I didn't know I could do that with a Proc Print VAR list.

    Your macro error messages are much more complete than ones I write. Hmm .. maybe I could expand beyond, "You did something wrong, start over." ;-)

Back to Top