Suppose you want a list of car manufacturers from the CARS dataset. Easy! Call the %CHARLIST macro from a %PUT statement, like this:
The CHARLIST macro generates a list of unique values of a selected variable from a selected dataset. So does PROC FREQ. But, if you don't need statistics, the CHARLIST macro is quick and easy to call, and can be called from anywhere.
Suppose you want a list of all available vehicle types to appear in a footnote. Easy! Call the %CHARLIST macro from a FOOTNOTE statement, like this:
proc print data=sashelp.cars(obs=10) noobs; var make model type msrp; title "First 10 rows from CARS dataset"; footnote "Vehicle types include: %charlist(sashelp.cars,type)"; run; |
You can't call PROC FREQ (or any proc) from a FOOTNOTE statement!
The CHARLIST macro:
%macro charlist(dsn,var) / minoperator mindelimiter=','; %local dsid rc list i; %let dsid=%sysfunc(open(&dsn(keep=&var))); %syscall set(dsid); %let rc=%sysfunc(fetch(&dsid)); %let list=&&&var; %do i=2 %to %sysfunc(attrn(&dsid,nlobs)); %let rc=%sysfunc(fetch(&dsid)); %if not (%bquote(&&&var) in &list) %then %let list=&list, %bquote(%trim(&&&var)); %end; %let dsid=%sysfunc(close(&dsid)); &list %mend charlist; |
- The OPEN function opens SASHELP.CARS, keeping only the variable MAKE.
- The SET routine creates a macro variable MAKE from the dataset variable MAKE.
- The NLOBS argument of the ATTRN function returns the number of observations in a dataset, which serves as the stop value of the %DO loop.
- The FETCH function reads observations from SASHELP.CARS, populating the macro variable MAKE.
- If the current MAKE is not already in the macro variable LIST, it is appended to LIST.
Be sure to save the CHARLIST macro in your stored compiled macro or autocall library for convenient re-use!
The CARFINDER macro calls the CHARLIST macro. The CARFINDER macro accepts a CAR parameter (vehicle make) which is referenced in the WHERE and TITLE statements. More importantly, the CAR parameter is validated against a list of all car makes in the SASHELP.CARS dataset, to insure a valid car make is selected. If so, the CARFINDER macro issues a PROC PRINT. If a valid car make is not selected, the CARFINDER macro issues an error message, followed by a list of valid car makes to choose from. This is called data-driven parameter validation.
%macro carfinder(car) / minoperator mindelimiter=','; %if &car in %charlist(sashelp.cars,make) %then %do; proc print data=sashelp.cars noobs; var model type msrp; where make="&car"; title "&car"; run; %end; %else %do; %put ERROR: No &car.s.; %put NOTE: Cars include %charlist(sashelp.cars,make).; %end; %mend carfinder; |
Did you spot the inefficiency? The CARFINDER macro is inefficient because it potentially calls the CHARLIST macro twice, which then reads the CARS dataset twice. Avoiding this inefficiency is easy. Call the CARFINDER macro only once, from a %LET statement, to create the macro variable CARMAKES, which stores the list of valid car makes. Then reference the macro variable CARMAKES as needed.
%macro carfinder(car) / minoperator mindelimiter=','; %local carmakes; %let carmakes=%charlist(sashelp.cars,make); %if &car in &carmakes %then %do; proc print data=sashelp.cars noobs; var model type msrp; where make="&car"; title "&car"; run; %end; %else %do; %put ERROR: No &car.s.; %put NOTE: Cars include &carmakes.; %end; %mend carfinder; |
Happy car finding!
25 Comments
The first variable I tried this on (FETAL_OUTCOME) failed because one value contained an ampersand (the abbreviation for labor and delivery):
LIVEBORN
STILLBORN - ALIVE ON L&D DELIVERY
STILLBORN - DEMISE PRIOR TO ADMISSION
Could anyone share the fix for this situation?
Thanks,
Sara
Don't forget to add &var to the %local statement.
I prefer the SQL version for the reasons stated. It will be faster for indexed columns, as I assume the SQL engine just reads the index, not the table.
.
For me readability and therefore maintainability wins every time. The option to embed is a minor advantage. Now if SAS could be made to work like a proper programming language without a macro processor... Only dreaming!
Thanks for sharing, good job.
Wow, really helpful. I applied the car finder macro to apply to any variable within any SAS dataset:
%macro varfinder (dataset=,checkvar=,listvar=) / minoperator mindelimiter=',';
%local ourlist;
%let ourlist=%charlist(&dataset,&listvar);
%if &checkvar in &ourlist %then %do;
proc print data=&dataset noobs;
where &listvar="&checkvar";
title "&checkvar listings";
run;
%end;
%else %do;
%put ERROR: &checkvar not in list;
%put NOTE: ourlist include &ourlist;
%end;
%mend varfinder;
Nice generalization! Thanks, Claire, for sharing.
Wow, it's pretty special how you can call this macro from anywhere.
FYI it can break if the data includes commas, such as:
%put %charlist(sashelp.baseball, name);
To answer the conversation yesterday, SQL is faster for large data. This took about a minute:
%put %charlist(maps.world, segment);
This took a fraction of a second:
proc sql noprint;
select distinct segment into :seglist separated by ','
from maps.world;
quit;
%put &seglist.;
Thank you, Brian, for comparing efficiency, which turns out to be a major consideration. If you have time, you might try FETCHOBS instead of FETCH, and let us know how that compares.
The NAME column in sashelp.baseball presents a special challenge due to embedded commas and apostrophes in names.
I handled that by changing the delimiter from a comma to an asterisk and using %SUPERQ, like this:
%macro charlist(dsn,var) / minoperator mindelimiter='*';
%local dsid rc list i;
%let dsid=%sysfunc(open(&dsn(keep=&var)));
%syscall set(dsid);
%let rc=%sysfunc(fetch(&dsid));
%let list=%qtrim(%superq(&var));
%do i=2 %to %sysfunc(attrn(&dsid,nlobs));
%let rc=%sysfunc(fetch(&dsid));
%if %superq(&var) ne %str( ) %then %if not (%superq(&var) in &list) %then %let list=&list*%qtrim(%superq(&var));
%end;
%let dsid=%sysfunc(close(&dsid));
&list
%mend charlist;
Nice. I tried FETCHOBS and it wasn't any faster. It's probably 10% slower, even.
Good to know. Thanks, Brian!
I would argue that Leonid's version is more readable and maintainable, and of course could be made into a macro. I do wonder if the %charlist macro as presented sees better performance than Leonid's PROC SQL example. But unless there is a noticeable performance gap, I think the more readable proc sql version is better for clean code that is easy to understand.
Either way, it's a neat macro function that I could see coming in handy. Nice work!
Thank you, Paul. I agree that code should be readable and maintainable. Leonid's code is concise, straightforward, and intuitive. And, for convenience, it can easily be made into a macro, as you noted. It would be interesting if somebody compared the two against a large dataset to note any efficiency difference.
My "pure" macro code offers one advantage: It can be called "in-line" from anywhere, for example, from any macro language statement or expression (%PUT, %LET, %IF). It can also be called from SAS code (TITLE or FOOTNOTE statements), while a macro that generates SAS code does not offer that flexibility. The macro variable created by PROC SQL can, of course, be referenced anywhere. But there is the overhead of having to call the macro first to create the macro variable, and the overhead of storing the macro variable in the symbol table. Regardless, both are fine choices with their own advantages.
Good point! I hadn't thought of that as a disadvantage of a sql query.
Yes, exactly. Thank you, Leonid.
Hi Jim, nice macro implementation!
Here is a non-macro version of it:
proc sql noprint;
select distinct make into : list separated by ', '
from sashelp.cars;
quit;
%put &list;
Thank you, nice macro
Please note that this macro only works SAS-datasets, not with database-tables
Thank you, Jan. Did you try pointing to a database table with a LIBNAME statement before calling the macro?
Hi Jim,
Very cool! I can get this to work with a simple test dataset. But when I try it with a permanent dataset that we have, I get this error:
SYMBOLGEN: Macro variable DSN resolves to sasnet.profile_plus_h
SYMBOLGEN: Macro variable VAR resolves to perinatal_c
671 %put %charlist(sasnet.profile_plus_h, perinatal_c);
SYMBOLGEN: Macro variable DSID resolves to 15
SYMBOLGEN: && resolves to &.
SYMBOLGEN: Macro variable VAR resolves to perinatal_c
SYMBOLGEN: Macro variable PERINATAL_C resolves to
SYMBOLGEN: Macro variable DSID resolves to 15
SYMBOLGEN: Macro variable DSID resolves to 15
SYMBOLGEN: && resolves to &.
SYMBOLGEN: Macro variable VAR resolves to perinatal_c
SYMBOLGEN: Macro variable PERINATAL_C resolves to Level II
SYMBOLGEN: Macro variable LIST resolves to
ERROR: Operand missing for IN operator in argument to %EVAL function.
ERROR: The macro CHARLIST will stop executing.
The variable PERINATAL_C is a character variable with a length of 9. The values of PERINATAL_C have imbedded blanks. Could that be the problem?
Hey David. Thanks for posting. Imbedded blanks should not be a problem. My guess right now ... PERINATAL_C has a missing value on the first row. I'll check further and let you know what I find!
Yup. A missing value on any row is a problem. To fix, modify the %IF like this:
%if &&&var ne %str( ) %then %if not (%bquote(&&&var) in &list) %then %let list=&list, %bquote(%trim(&&&var));
Thanks, David, for reporting!
Need %BQUOTE!
%if %bquote(&&&var) ne %str( ) %then %if not (%bquote(&&&var) in &list) %then %let list=&list, %bquote(%trim(&&&var));
Hi, really nice and really useful. Thanks.
Thank you, Mike.
Why does this line need 3"&" - %let list=&&&var;?
Adraine, thank you for asking. I should have added this to my bullet list of explanations.
A single ampersand is a direct reference to a macro variable.
Multiple ampersands represent an indirect reference.
Three ampersands cause the reference to be scanned twice, like this:
original reference: &&&var
after 1st scan: &make (&& resolves to &, &var resolves to make)
after 2nd scan: Acura