Jedi SAS Tricks: Building a Name Suffix variable list

If you’ve taken one of my SAS classes you may recall “Mark’s 3 rules of programming”, the first of which is “Lazy programmers are GOOD programmers.”  One of the things I love about best about SAS programming is the plethora of functions and shortcuts built into the language, all designed to make it easier and quicker for me to accomplish my work - just great for lazy guys like me! SAS variable lists are a prime example - I use them all the time.  Arguably my favorite variable list is the name prefix list which allows easy listing of all variable names beginning with the same text pattern.  Recently a SAS student, enthused upon discovering the name prefix list, asked “How about variables with names ending with the same characters?  Is there a name suffix variable list for that?”  I immediately thought  “How useful that would be!” but alas, there is no “name suffix list” built into SAS.  Sounds like a great suggestion for the annual SASWare Ballot!  Of course, if you write a blog series called “Jedi SAS Tricks” you can’t just leave a challenge like that unanswered… Let me show you what I mean!  The examples use SASHELP.Iris data.  Here’s a small sample:
SASHELP.Iris Data Set

To print a report containing the Species name and all the Sepal information, we could write the following program:

proc print data=sashelp.iris noobs;
   var Species Sepal:;
run;

SAS interprets “Sepal:” as “all variables with names that begin with ‘Sepal’.” The resulting report is just what I intended, with minimal typing required on my part:report using Name Prefix varialbe list

Being fond of SAS Macro, I set myself a goal to be able to write a program like this:

%let DSN=sashelp.iris;
proc print data=&dsn noobs;
   var Species %Suffix(&dsn,Length);
run;

The program would then produce a report like this:
report using %Suffix macro to produce the variable list.
Now, usually I’d jump right into PROC SQL and use the Dictionary tables to extract the list of variables.  Something like this would do the trick:

proc sql noprint;
select Name
   into :VarList separated by ' '
   from dictionary.columns
   where libname='SASHELP' and memname='IRIS'
     and name like '%Length'
  ;
quit;
%PUT The variable list contains: &VarList;

That produces the list I want, but because I want to be able to use this macro “in line” with other SAS code, the macro can’t generate SAS code to obtain the list values.   Luckily, I remembered the metadata view SASHELP.VCOLUMN contains the same information.  Now, to write code that can read from that view directly in macro, without generating DATA step code.  The trick will be in judicious use of %SYSFUNC and the SAS File I/O functions EXIST, OPEN, FETCH, VARNUM, GETVARC and CLOSE.  After some experimenting, here’s what I came up with:

%Macro Suffix(DSN,Suffix);
   %local dsid varlist rc whr lib ds;
   /* Separate Libref from the data set name for WHERE clause*/
   %let ds=%QSCAN(%QUPCASE(%SUPERQ(DSN)),2);
   %if &ds= %then %do;
      %let lib=WORK;
      %let ds=%QUPCASE(%SUPERQ(DSN));
   %end;
   %else %let lib=%qscan(%QUPCASE(%SUPERQ(DSN)),1);
   %if not %sysfunc(exist(&lib..&ds)) %then %do;
      /* Specified data set does not exist */
      %PUT ERROR: Dataset %SUPERQ(DSN) does not exist.;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
   %end;
   /* Open SASHELP.VCOLUMN subset for the desired dataset */
   %let whr=(WHERE=(LIBNAME="&lib" AND MEMNAME="&ds" AND UPCASE(NAME)
        like %STR(%')%nrstr(%%)%str(%QUPCASE(%SUPERQ(suffix))%')));
   %let dsid=%sysfunc(open(sashelp.vcolumn&whr));
   %if &dsid=0 %then %do;
      %PUT ERROR: Could not retrieve variable names for &DSN from;
      %put ERROR- sashelp.vcolumn&whr;
      %PUT ERROR- &sysmsg;
      %GoTo EndMacro;
   %end;
   %let rc=0;
   /* Retrieve each variable name observation from SASHELP.VCOLUMN */
   /* and add it to the list of variables                          */
   %do %while (&rc=0);
      %let rc=%sysfunc(fetch(&dsid));
      %if &rc=0 %then %do;
         %let varlist=&varlist %sysfunc(getvarc(
              &dsid,%sysfunc(varnum(&dsid,NAME))));
      %end;
   %end;
   /* Return the variable names to the input stack */
   &varlist
%Endmacro:
   /* Close SASHELP.VCOLUMN */
   %let rc=%sysfunc(close(&dsid));
%mend Suffix;

That worked JUST as I wanted!  Because I thought this was a “keeper”, I added some parameter error checking / documentation to the the macro and saved it with the rest of my personal SAS macros.  And, of course, I'll share it with you here on my blog - full code for my final version of the macro follows below.  As usual, after compiling the macro you can get syntax help by submitting %suffix(!HELP).
Until the next time, may the SAS be with you!
Mark

%Macro Suffix(DSN,Suffix);
   %local dsid varlist rc whr lib ds;
   /* Self documentation */
   %if (&DSN= and &Suffix= ) %then %let DSN=!HELP;
   %if %qupcase(%qsubstr(&DSN,1,5))=!HELP
   %then %do;
      %PUT NOTE: EndsWith Macro Help ********************************;
   %put;
   %PUT NOTE- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
   %put;
   %PUT NOTE- DSN: Name of dataset containing vairable names of interest;
   %PUT NOTE-      %NRSTR(Default is &SYSLAST);
   %put;
   %PUT NOTE- Suffix: Text with which the variable names end (required);
   %PUT NOTE- Examples: ;
   %PUT NOTE- Get names of variables that end with "ght" from sashelp.class: ;
   %PUT NOTE-    %NRSTR(%%EndsWith%(sashelp.class,ght%));
   %put;
   %PUT NOTE: End EndsWith Macro Help ****************************;
   %return;
   %end;
   /* Parameter Validation */
   %if &suffix= %then %do;
      /* Suffix is required */
      %PUT ERROR: You must specify a variable name suffix;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
   %end;
   %if &DSN= %then %let DSN=&SYSLAST;
   /* Separate Libref from the data set name for WHERE clause*/
   %let ds=%QSCAN(%QUPCASE(%SUPERQ(DSN)),2);
   %if &ds= %then %do;
      %let lib=WORK;
      %let ds=%QUPCASE(%SUPERQ(DSN));
   %end;
   %else %let lib=%qscan(%QUPCASE(%SUPERQ(DSN)),1);
   %if not %sysfunc(exist(&lib..&ds)) %then %do;
      /* Specified data set does not exist */
      %PUT ERROR: Dataset %SUPERQ(DSN) does not exist.;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
   %end;
   /* Open SASHELP.VCOLUMN subset for the desired dataset */
   %let whr=(WHERE=(LIBNAME="&lib" AND MEMNAME="&ds" AND UPCASE(NAME) like
                  %STR(%')%nrstr(%%)%str(%QUPCASE(%SUPERQ(suffix))%')));
   %let dsid=%sysfunc(open(sashelp.vcolumn&whr));
   %if &dsid=0 %then %do;
      %PUT ERROR: Could not retrieve variable names for &DSN from;
      %put ERROR- sashelp.vcolumn&whr;
      %PUT ERROR- &sysmsg;
      %GoTo EndMacro;
   %end;
   %let rc=0;
   /* Retrieve each variable name observation from SASHELP.VCOLUMN */
   /* and add it to the list of variables                          */
   %do %while (&rc=0);
      %let rc=%sysfunc(fetch(&dsid));
      %if &rc=0 %then %do;
         %let varlist=&varlist %sysfunc(getvarc(
              &dsid,%sysfunc(varnum(&dsid,NAME))));
      %end;
   %end;
   /* Return the variable names to the input stack by resolving VARLIST */
   &varlist
%Endmacro:
   /* Close SASHELP.VCOLUMN */
   %let rc=%sysfunc(close(&dsid));
%mend Suffix;
tags: custom, customizing, macro, macros, programming, tips and tricks, utility, variable list

18 Comments

  1. Keith Schleicher
    Posted November 18, 2011 at 4:44 pm | Permalink

    A good and lazy programmer are you, young Jedi.

    • Mark Jordan Mark Jordan
      Posted November 22, 2011 at 11:21 am | Permalink

      In the (almost) words of Paul Simon - "Still lazy, after all these years..."

  2. Posted November 19, 2011 at 4:58 pm | Permalink

    Great tip ... I didn't know I could do that with a Proc Print VAR list.

    Your macro error messages are much more complete than ones I write. Hmm .. maybe I could expand beyond, "You did something wrong, start over." ;-)

    • Mark Jordan Mark Jordan
      Posted November 22, 2011 at 11:22 am | Permalink

      LOL - better that then the typical "you submit your code and nothing happens" ....

  3. Quentin McMullen
    Posted November 21, 2011 at 10:13 am | Permalink

    Nice macro, thanks for posting! Very inspiring. A few thoughts:
    1. If you don't mind passing more work onto the user, might be nice to make this a more general tool by having the second parameter be any logical condition. So the where statement becomes something like:
    %let whr=(WHERE=(LIBNAME="&lib" AND MEMNAME="&ds" and &condition)) ;
    and user can call with whatever they want for the condition, i.e. specify prefix, suffix, length, etc
    %put %suffix(sashelp.class,name like '%ght');
    %put %suffix(sashelp.class,upcase(name) like '%A%');
    %put %suffix(sashelp.class,length(name)<4);

    2. It's possible to do similarly using SCL functions (called by %sysfunc) to open a dataset and review its variables. I think often the SCL route might be faster since dictionary.columns can get VERY big.

    3. At least in my browser, I see "&" in the code instead of &. Could be the blog software doing that conversion for you...

    Thanks again!

    • Mark Jordan Mark Jordan
      Posted November 22, 2011 at 11:20 am | Permalink

      Thanks for the ideas for extending the macro. It's always a conundrum balancing further generalization against clarity/ease of use... As for the dictionary.coumns, it CAN get big, but WHERE subsetting is pretty efficient. And I'm not sure the SCL function calls would work for those who don't have SAS/AF licensed.

      And even though I inserted "pre" tags, WordPress seems to have replaced my ampersands with "amp" in the code - sigh. I'll figure it out soon, I hope :-)
      Mark

  4. Posted November 21, 2011 at 12:16 pm | Permalink

    "Efficiency is intelligent laziness" - David Dunham

    • Mark Jordan Mark Jordan
      Posted November 22, 2011 at 11:15 am | Permalink

      Great quote. Im too lazy to make one up myself, so I think I'll steal this one :-)

  5. Gabe
    Posted November 27, 2011 at 12:02 am | Permalink

    if the intention was for it to be able to be run inline within a data step, why include a macro variable for the dataset name (DSN)? Why not just automatically determine what dataset currently is being operated on?

    • Mark Jordan Mark Jordan
      Posted November 27, 2011 at 12:19 pm | Permalink

      Gabe,
      The macro is actually writing part of your data step code for you - the data step itself hasn't even compiled yet. The macro produces the list of variable names and "types them in" at the appropriate spot in the data step code. Only after the entire data step is written and then compiled will the data step actually start to execute. So during the time the macro is executing to produce the list, there is actually no data set currently being operated on. Hence the need to tell the macro which data set to process.

      • Gabe
        Posted November 28, 2011 at 11:33 am | Permalink

        Thanks for the reply. I guess now my question is more in general... if there's a way using any of the %sysfunc functions and arguments to determine what dataset is currently being compiled?

        • Mark Jordan Mark Jordan
          Posted November 30, 2011 at 12:50 pm | Permalink

          Gabe, there is no way I know of to detect information from a data step in the middle of compilation. You can use macro language techniques to write the data step - to add text as it is being sent to the data step compiler. And the data step can send information back to the macro symbol table during execution using data step functions. But until the data step is fully compiled, there's no communication path open that goes back to to the macro processor or to the input stack.

  6. Gabe
    Posted November 28, 2011 at 12:00 pm | Permalink

    Also, just to let you know, I think the macro doesn't handle underscores properly. For instance:

    data test ;
    do testa = 1 to 5;
    do testb = 1 to testa;
    test_a = ceil(10*ranuni(1));
    output;
    end;
    end;
    run;

    proc print data=test ;
    var %suffix(test,_a);
    run;

    This will print the values of both "testa" and "test_a" variables, rather than just the desired "test_a" variable.

    • Mark Jordan Mark Jordan
      Posted November 28, 2011 at 2:02 pm | Permalink

      Humm... yep, you are correct! Well, I may be a SAS Jedi, but it looks like I'm not Yoda yet :-)

      The reason for the issues with underscores is my reliance on the LIKE operator in the WHERE clause. I looked at modifying the existing subsetting code, but decided this would be much easier using a PERL regular expression. I also got to thinking about Quentin's comment on alternate methods of getting the variable names, and settled on using a couple more of the SAS File I/O functions to get the names directly from the data set instead of reading the SASHELP.COLUMNS dictionary view. Here is the revised code:

      %Macro Suffix(DSN,Suffix);
      %local dsid varlist rc whr lib ds;
      /* Self documentation */
      %if (&DSN= and &Suffix= ) %then %let DSN=!HELP;
      %if %qupcase(%qsubstr(&DSN,1,5))=!HELP
      %then %do;
      %PUT NOTE: EndsWith Macro Help ********************************;
      %put;
      %PUT NOTE- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %put;
      %PUT NOTE- DSN: Name of dataset containing variable names of interest;
      %PUT NOTE- %NRSTR(Default is &SYSLAST);
      %put;
      %PUT NOTE- Suffix: Text with which the variable names end (required);
      %PUT NOTE- Examples: ;
      %PUT NOTE- Get names of variables that end with "ght" from sashelp.class: ;
      %PUT NOTE- %NRSTR(%%EndsWith%(sashelp.class,ght%));
      %put;
      %PUT NOTE: End EndsWith Macro Help ****************************;
      %return;
      %end;
      /* Parameter Validation */
      %if &suffix= %then %do;
      /* Suffix is required */
      %PUT ERROR: You must specify a variable name suffix;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
      %end;
      %if &DSN= %then %let DSN=&SYSLAST;
      /* Separate Libref from the data set name for WHERE clause*/
      %let ds=%QSCAN(%QUPCASE(%SUPERQ(DSN)),2);
      %if &ds= %then %do;
      %let lib=WORK;
      %let ds=%QUPCASE(%SUPERQ(DSN));
      %end;
      %else %let lib=%qscan(%QUPCASE(%SUPERQ(DSN)),1);
      %if not %sysfunc(exist(&lib..&ds)) %then %do;
      /* Specified data set does not exist */
      %PUT ERROR: Dataset %SUPERQ(DSN) does not exist.;
      %PUT ERROR- SYNTAX: %NRSTR(%endswith%(DSN,Suffix%));
      %GoTo EndMacro;
      %end;
      /* Open the dataset and find out how many variable names */
      %let dsid=%sysfunc(open(&dsn,i));
      %if &dsid=0 %then %do;
      %put %sysfunc(sysmsg());
      %GoTo EndMacro;
      %end;
      %let NumVars=%sysfunc(attrn(&dsid,NVARS));
      /* Read all the variable names, keep the ones that match the pattern */
      %let VarList=;
      %do i=1 %to &NumVars;
      %let ThisVar=%sysfunc(varname(&dsid,&i));
      %if %sysfunc(prxmatch(/\w*%QUPCASE(&Suffix)$/,%QUPCASE(&ThisVar))) %then
      %let VarList=&VarList &ThisVar;
      %end;
      /* Return the variable names to the input stack by resolving VARLIST */
      &varlist
      %Endmacro:
      /* Close SASHELP.VCOLUMN */
      %let rc=%sysfunc(close(&dsid));
      %mend Suffix;

      • Gabe
        Posted November 28, 2011 at 10:15 pm | Permalink

        It looks like it's still pulling out variables where the desired suffix string is in the middle of the variable name. However, I figured out a tweak you can make. If you change the line

        %if %sysfunc(prxmatch(/\w*%QUPCASE(&Suffix)/,%QUPCASE(&ThisVar))) %then

        to the following:

        %if "%SUBSTR(%QUPCASE(&ThisVar),(%LENGTH(&ThisVar)-%LENGTH(&Suffix)+1))" = "%QUPCASE(&Suffix)" %then

        then that seems to work. You might need a %QSUBSTR instead of a %SUBSTR, or a %NRQUOTE or %NRBQUOTE somewhere in there, just to make sure you're addressing all possible cases, but it worked as-is with my test data:

        data test ;
        do _stests = 1 to 10;
        do te_st = 1 to _stests;
        do te_st_s = 1 to te_st;
        test_s = ceil(10*ranuni(1));
        output;
        end;
        end;
        end;
        run;

        proc print data=test ;
        var %suffix(test,_s);
        run;

        • Mark Jordan Mark Jordan
          Posted November 29, 2011 at 10:08 am | Permalink

          My PERL regular expression was missing a "$" to indicate the pattern was to match only at the end of the string. I updated the code in yesterday's post. The resulting macro is much more functional and robust, thanks to your persistence and feedback. Here's the code I used to test the final macro code:
          data test ;
          _stests = 1;
          te_st = 2;
          te_st_s = 3;
          test_s = ceil(10*ranuni(1));
          run;
          title "&SYSLAST";
          proc print;run;

          %let S=_s;
          title "&SYSLAST - suffix &s";
          proc print data=test ;
          var %suffix(test,&s);
          run;

          %let S=st;
          title "&SYSLAST - suffix &s";
          proc print data=test ;
          var %suffix(test,&s);
          run;
          title;

          Muchas gracias, Gabe!

          • Gabe
            Posted November 29, 2011 at 12:37 pm | Permalink

            I'm not as familiar with PERL syntax... do you have a suggestion on a good resource to learn from? Also, more generally, can you comment on why/when to use PERL over another coding option?

          • Mark Jordan Mark Jordan
            Posted November 29, 2011 at 5:54 pm | Permalink

            PERL regular expressions provide a very powerful and flexible way of describing text pattern in a SAS program. You can directly leverage PERL regular expressions in SAS using the PRX functions. If you deal with a lot of text data, it is well worth the effort it takes to learn. My friend Ron Cody wrote an excellent paper titled An Introduction to Perl Regular Expressions in SAS 9 that can give you a start. I also wrote a SAS User Group workshop called Untangling the World Wide Web (download the PDF here) and pages 23-50 have a relatively simple and practical introduction to using PERL regular expressions in SAS.

            If you are trying to detect a text pattern instead of just a text string, using PERL regular expressions can greatly simplify your code.

Post a Comment

Your email is never published nor shared. Required fields are marked *

*
*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <p> <pre lang="" line="" escaped=""> <q cite=""> <strike> <strong>