SAS author’s tip: A “most common” macro error

Robert Virgile

Robert Virgile

This week’s author tip is from Robert Virgile and his book SAS Macro Language Magic: Discovering Advanced Techniques.  Virgile chose this tip because even good programmer’s make errors.

We hope you find this tip useful. You can also read an excerpt from Virgile’s book.

Even good programmers make errors.  In fact, when it comes to macro language, good programmers often make the same error:  failing to define macro variables as %local.

Here is an all-too-common example:

%macro doses;
   %do i=1 %to 3;
      dose&i date&i
%mend doses;

This straightforward macro generates a list of variable names.  For example, PROC PRINT could call it:

proc print data=patients;
   var patient %doses;

The intention is to complete the VAR statement:

proc print data=patients;
   var patient dose1 date1 dose2 date2 dose3 date3;

The macro works, and the result is correct.  So where is the error?  Why should this macro add just before the %DO loop:

%local i;

Danger abounds!  But to understand why, you need to understand when and where the software creates &i.  To begin, consider this test:

%let i=5;
%macro test;
   %local i;
   %let i=2;
%mend test;
%put i is now &i..;

When %LET appears outside of a macro definition, it creates &i in the global symbol table.  When %TEST executes, the %LOCAL statement creates &i in the local symbol table for %TEST.  So there are two macro variables, both named &i.

When the interior %LET statement executes, which &i receives a value of 2?  The interior %LET statement triggers a search for a macro variable named &i, and uses the first one it finds.  The search process begins in the local symbol table, which does in fact contain &i.  So the local &i receives a value of 2.

When the macro finishes executing, its local symbol table vanishes.  The %PUT statement writes:

i is now 5.

The only remaining macro variable named &i is the one in the global symbol table, which still has a value of 5.

Next, consider a similar example that removes the %LOCAL statement:

%let i=5;
%macro test;
   %let i=2;
%mend test;
%put i is now &i..;

The process doesn’t change, but the result does.  The %PUT statement writes:

i is now 2.

The key step that reverses the outcome is the interior %LET statement.  As before, it triggers a search for an existing macro variable named &i.  It searches unsuccessfully in the local symbol table, so it continues the search in the global symbol table.  Finding &i there, that &i receives a value of 2.

While the scenarios can become more complex (when %A calls %B, and %B calls %C, and %C calls %D), the search process remains the same.  When a program references a macro variable, the software searches for an existing variable with that name.  Search from the most interior symbol table outward, ending up with the global symbol table if necessary.  The software uses the first macro variable found that has the proper name.  If none of the symbol tables contains the macro variable, create it.  Where?  That is another story for another day.  The search process tells us all we need to know to understand why the %LOCAL statement is important.

The missing %LOCAL statement causes trouble when you write a macro that calls %DOSES.  Here is a simple example:

%macro your_macro;
   %local i;
   %do i=1 %to 3;
      proc print data=patient_pool&i;
         var patient %doses;
%mend your_macro;

The intent is to print three data sets.  The result is quite different, even though you were quite careful to add your own %LOCAL statement.

When %YOUR_MACRO executes, the %LOCAL statement creates &i in its local symbol table.  When %DOSES looks for an existing macro variable named &i, it follows the standard procedure:

  • Search the local %DOSES symbol table (unsuccessfully, because %DOSES is missing the %LOCAL statement).
  • Search the local %YOUR_MACRO symbol table (successfully).
  • Use &i in the local %YOUR_MACRO symbol table.

Since %DOSES increments &i in %YOUR_MACRO’s symbol table, %YOUR_MACRO prints only one data set.  After that, &i exceeds the upper %DO loop limit of 3, and %YOUR_MACRO finishes.  No error messages or warning messages appear … but only one data set prints.

With a slight change, the problem gets worse:

%macro your_macro;
   %local i;
   %do i=1 %to 8;
      proc print data=patient_pool&i;
         var patient %doses;
%mend your_macro;

The same search process applies, with %DOSES using &i from the %YOUR_MACRO symbol table.  This means that the program is an infinite loop!  Each time %DOSES executes, it resets &i so that it never reaches a value of 8.

The bottom line:  define your local variables using a %LOCAL statement.  If you don’t, you can suffer wrong results, infinite loops, and a reputation for writing macros that are dangerous.

For more information about the macro language and the magic you can create with it, check out Robert Virgile’s book SAS Macro Language Magic: Discovering Advanced Techniques.

Post a Comment

Using ‘options’ in the SAS Enterprise Guide query builder

Tucked in the SAS Enterprise Guide Query Builder there is a text box unhelpfully labelled 'Options'. To find it select Options -> Options for this query -> General, and it is about halfway down the screen.

I am going to show you how to use that text box to make your tables smaller, and how to create tables ready for faster access.

First of all, what are these options? Well, in SAS programming when you refer to a table (Data Set) you can follow the table name with a list of options enclosed in brackets. It works consistently everywhere, including PROC SQL.

Entering Data Set Options in the Options box adds them to the PROC SQL generated by the query builder.

Making Tables Smaller

If a table is smaller it will take up less valuable storage space, and it will take less valuable time to read.

This will work best on tables with character columns containing lots of repeated characters, typically spaces. In the Options box type in the option COMPRESS=YES. After you have run the Query look in the log and you will see how much you have saved. If nothing has been saved, or it has increased the size, which can happen, then remove the option and re-run.

There may be a small increase in CPU, but the disc IO savings should outweigh that.

With other types of data try COMPRESS=BINARY. This will work better with numeric columns, large rows, or data with repeated patterns. Again a quick look at the log will tell you if the compression has worked.

It is possible to set compression as a system or library option. If you look at the log and you see it is compressing without you asking for it you can turn it off with COMPRESS=NO.

Getting Ready for Faster Access

When you are reading a non-fiction book you will often find an index at the back which enables you to go straight to a page or pages to find out about a particular subject. Really useful for finding small bits of information from a large volume.

When you are creating tables in SAS you can get SAS to create an index to help find your data quickly. As a rule of thumb the smaller the amount of data you want to get, and the larger the table, the more likely it is that an index will be used. Typically an index might be used if you are filtering your data or if you are joining tables together.

In the options box simply type INDEX=(columnname) where columnname is name of the column you want an index built on. For example INDEX=(CUST_ID) would build and index on cust_id column. If in a later task you filter on cust_id, or join to the table via cust_id, it should be faster.

You can create indices on more than one column, either individually or combined into a composite index.

You can combine compression and indexing in the same table just type both options in the Options box.

To find out about more advanced indexing and to see a list of other options go to They are not all still useful, some refer to tape, but there are some interesting things to try.

I hope that you find this tip useful. You will find more details on using 'options' in SAS Enterprise Guide in my book, SAS® Programming for Enterprise Guide® Users, Second Edition.

Post a Comment

The lowdown on the downloads would bet that the most popular four-letter-word in the English language; the one that gets the most attention; the one that everybody loves to see; the one that makes people the happiest is: free.

Everybody likes to get something for free. We love it when we purchase two garments at the regular price and get the third one for free. We smile when after purchasing nine coffees with our customer loyalty card we get the tenth one for free. We enjoy racking up those frequent flyer miles so that we can get a free flight or a free upgrade. We happily grab that toothpaste bundle that has a free smaller tube or a free toothbrush attached. In the store, we reach for that big bag of chips that has 10% more included for a limited time for free. And, so on.

The good news for SAS programming professionals is that SAS offers a number of value-added products that you can download from the web site for--you guessed it--free.

The main SAS downloads page can be found at this link: From that web page, you can explore downloads for the various SAS products that you have licensed to see what freebies you may be able to bring to bear on your own programming work.

Here are some of my favorite free downloads for Base SAS Software:

You can do a lot with just that handful of free downloads for Base SAS software.

Check out the free SAS downloads and see which ones may end up saving you time at work. No need to hurry though, because unlike those deals at your supermarket, drugstore, local clothing store, or airline, the SAS downloads are not "free for a limited time only." All that you need to exploit them is the knowledge that they are there, a couple of minutes to download them, and the smarts to use them to improve the utility of your SAS programs.

Best of luck in all your SAS endeavors!

Post a Comment

SAS author’s tip: Getting the macro language to perform a %DO loop over a list of values

This week’s author tip is from Robert Virgile and his book “SAS Macro Language Magic: Discovering Advanced Techniques”.  Virgile chose this tip because discovering and developing this technique will help you make the most of MACROS.

We hope you find this tip useful. You can also read an excerpt from Virgile’s book.

Macros Tip #1: A “Most Commonly Asked” Question

One macro language question arises again and again.  How can I get macro language to perform a %DO loop over a list of values?  For example, this syntax is illegal in macro language:

%do name=Amy, Bob, Fred;

But what if that is exactly what you need:  execute a %DO loop for each name on the list?  Macro language lets you work around this limitation, as long as you program the logic yourself.  Here are the steps you’ll need to take.

First, create a macro variable holding the list of names.  A %LET statement might look like this:

%let name_list = Amy Bob Fred;

Of course, there are many ways to generate a list of values.  You could pass it as a parameter:

%call_my_macro (name_list=Amy Bob Fred)

Or, if a data set stores the names, SQL can retrieve them into a macro variable:

proc sql noprint;
  select distinct name into : name_list separated by ‘ ‘ from my.dataset;

Once the list of names is ready, execute the %DO loop for each name.  A few considerations come into play here:

  • While a variety of approaches exist, none of them take very long to execute. It might be advisable to select a simple method, rather than a faster-running method.
  • All of these statements must appear within a macro definition. %DO loops and %LOCAL statements are not permitted elsewhere.

Begin with a more complex approach:

%local i next_name;
%let i=1;
%do %while (%scan(&name_list, &i) ne );
   %let next_name = %scan(&name_list, &i);
   %** DO whatever needs to be done for &NEXT_NAME;
   %let i = %eval(&i + 1);

When &i is 1, %SCAN assigns the first word in &NAME_LIST as the value of &NEXT_NAME.  When it is 2, %SCAN assigns the second word.  After %SCAN reads all the words in &NAME_LIST, it returns a null value and the %DO loop ends.  Is the code difficult to read?  As always, beauty is in the eye of the beholder.

An alternative approach has macro language count the number of names in the list.  While entire macros have been written for this purpose, advances in macro language make counting words a straightforward task:

%local i next_name;
%do i=1 %to %sysfunc(countw(&name_list));
   %let next_name = %scan(&name_list, &i);
   %** DO whatever needs to be done for &NEXT_NAME;

The DATA step function COUNTW counts the number of words in a list, and %SYSFUNC permits macro language to use most DATA step functions.

Of course, complications arise.  Will your logic still work when:

  • The list is empty?
  • An item on the list is a significant word in macro language? For example, what if &NEXT_NAME is NE or OR?

Those issues can arise, but are beyond the scope of this tip.  This tip is to understand the steps needed to instruct macro language to process each item on a list of values.

For more information about the macro language and the magic you can create with it, check out Robert Virgile’s book “SAS Macro Language Magic: Discovering Advanced Techniques”.

Post a Comment

Why statisticians need strong communication skills

PeopleWorkingThis year the American Statistical Association Conference on Statistical Practice (CSP) has some weighty themes including Big Data Prediction and Analysis and all of its exciting applications. But just as important is the theme Communication and Impact. Everyone knows that if you have a great idea or discovery but you can’t communicate it adequately, it is not going to have much impact!

Think of the doctor, Ignaz Philipp Semmelweis, who back in 1847 showed the connection between hand washing and patient survival rates. His observations conflicted with the established scientific and medical opinions of the time and his ideas were rejected by the medical community. He failed to communicate his discovery well and ended up in a mental institution for his troubles. It wasn’t until the 20th century that hygiene became standard practice in medicine.

SAS Press authors Roger Hoerl and Ronald Snee have put together an excellent course to enhance leadership skills and to help statisticians transition from being viewed as passive consultants to being viewed as proactive leaders within their organizations. SC2 From Statistical Consultant to Effective Leader explores how to communicate discovery and make an impact as a leader.

Come and visit the SAS Press booth to browse their recent title, Statistical Thinking: Improving Business Performance, and others in our collection of titles covering diverse topics from big data analytics, basic statistics, data reduction and management and text mining. Perhaps there is a topic you would like to see covered but we don’t have? Perhaps you have a topic you would love to write about?

While we do have some great titles, I know we haven’t covered everything. Please stop by and have a quick chat with me. I’d be happy to discuss what we do have available, what we would like to see and what you would like to see. After all, writing a book is a great way to start making an impact as a leader!

Post a Comment

Are you doing things SAS backwards?

SAS Programming Professionals,

Are you doing things SAS backwards?

Yea; we have all been there.  A manager, or a user, or a client, asks you to perform an analysis or create a data set and gives you a sketchy description of what they want. Maybe they say "It's like that report you did last November", or they send a few descriptive lines in an email, or they verbally describe it in a meeting, or they sketch out some ideas on a notepad.  So, you dutifully write a program that processes what you think is the relevant data and creates the result set you believe they want. You QC the results, present it to the requester and... you are told that it is not right.  So, you modify your program according to their feedback and create the next result set.  And, it is still not right.  You find yourself repeating the process again and again until your manager or user or client is finally satisfied.

What is wrong with this process is that you are starting to write a SAS program when you do not have solid specifications.  Not understanding exactly what the requester wants invariably leads to wasted programming time.  So, you need to get your user to be specific about what, exactly, s/he wants before you start programming.  At a minimum, the requester should define:


  • The data sources you are to extract the data from
  • The selection criteria you are to use to subset the data
  • The variables they want in the final data set or report


  • What types of analysis they want performed on the data
  • The formulas for any new variables that are to be created
  • Any transformations that are to be made to existing variables
  • What to do with missing values for important variables
  • Whether any variables should be suppressed from the final result set


  • For data sets, the type of data set and the variable order
  • For reports, the layout of the report, including variable order, headings, titles, footnotes, and summary columns
  • For graphs, the type of graph, variables to be graphed, titles, legends, footnotes, and reference lines
  • The type of additional documentation needed, such as a PROC CONTENTS or frequencies of the delivery data set

The best thing you can do is to have your user send you the specifications in writing.  That performs two functions:  It makes the person stop and really think out what s/he wants, and it gives you a solid starting point.  From there, you can ask questions to fill in any of the details that were not covered in the original, written request.  After a few iterations, you should have a solid spec to work from.

If you only receive verbal directions or sketchy notes via email and are not likely to get anything more solid, you may have to write the specs yourself.  In that case, you would write down the details of the Input, Process, and Output steps that you believe your client wants and send them to the client.  Let your client review the specs and verify that they are what s/he wants, or suggest changes to your proposal that better define what is needed.  Once the email exchange culminates in fully ratified specs, you are good-to-go with starting the SAS programming.

Almost invariably, any programming effort--no matter how well spec-ed out--will require an iteration or two to satisfy a user's requirements.  However, you can drastically reduce the time you spend programming by nailing-down your user's needs as much as possible before you begin to write your SAS program.  Few things in life are more satisfying than creating a tight, elegant, efficient SAS program that conforms to specifications and that produces the requested result set.

Remember; without specifications to refer to, you cannot judge whether or not the program you wrote and the result sets you created conform to the requirements.  And, as Lewis Carol wrote:  "If you do not know where you are going, any road will lead you there."

Be sure to check out Michael Raithel's latest book How to Become a Top SAS Programmer.

Post a Comment

Five tips from Simulating Data with SAS

simulatingdataData simulation is a fundamental technique in statistical programming and research. My book Simulating Data with SAS is an accessible how-to book that describes the most useful algorithms and the best programming techniques for efficient data simulation in SAS. Here are five lessons you can learn by reading it:

  • Learn strategies for designing a simulation study. What is the best what to vary parameters in a simulation study? A poorly designed study can result in many wasted hours of computation. This book provides general advice and illustrates it with specific programs.
  • Learn to use elementary probability distributions to simulate from more complex distributions. SAS software provides an easy way to simulate data from about 30 standard probability distributions, but what if you need a more complicated model? The book shows how to use the standard distributions to simulate data from complex distributions.
  • Learn how to simulate data that follow a regression model. The book shows you how to simulate response variables for general linear regression models, generalized linear models, mixed models, longitudinal models, time series models, and more.
  • Learn efficient techniques that save time. Yes, simulation is a brute-force statistical method, but efficiency matters. The book describes tips and techniques that can decrease the computational time a hundredfold. If your simulation currently requires hours, you might be able to rewrite it to run in seconds.
  • Learn to use SAS/IML software for simulations that involve multivariate data. Using the right tool makes completing a task faster and simpler. Although it is possible to turn screws with a pair of pliers, that tool is not optimal for that task. Similarly, although the DATA step is a powerful tool for some simulation tasks, the SAS/IML matrix language is a better tool for complex simulations and for those that involve multivariate data. I introduce the SAS/IML language gradually and show you how to complete many tasks by using either tool.

Simulation is a powerful statistical technique. Whether you are a novice or an expert, I hope that you will find many tips in Simulating Data with SAS that enable you to write simulations that are both effective and efficient.

Post a Comment

SAS author's tip: Identifying FIRST.row and LAST.row in by-groups

A popular DATA step programming technique frequently used by SAS users is the ability to identify the beginning and ending observation in by-groups. The way it works is whenever a BY statement is specified with a SET statement, two temporary variables are created in the Program Data Vector (PDV), FIRST.variable and LAST.variable, for each variable specified in a BY statement. These temporary variables contain a value of either 0 or 1, as shown below:

  • variable = 0 indicates an observation is not the first observation in a BY group
  • variable = 1 indicates an observation is the first observation in a BY group
  • variable = 0 indicates an observation is not the last observation in a BY group
  • variable = 1 indicates an observation is the last observation in a BY group

Although FIRST.variable and LAST.variable temporary variables are not automatically written to a SAS data set, once identified they can be used for special processing, reporting purposes, the creation of new variables, modification of existing variables, the structural change of a SAS data set, and more.

To illustrate this stalwart DATA step technique using the SQL procedure, a single table will be used, as shown below. The Movies table consists of six columns: Title, Category, Studio, and Rating are defined as character while Length and Year are defined as numeric.



As a longtime SAS user, I wanted a programming technique that could be used by PROC SQL users to emulate the DATA step’s FIRST., LAST., and By-group capabilities. Another important consideration was to have the technique operate similarly across operating systems and, if possible, with other vendor’s RDBMS SQL. The resulting technique uses an SQL subquery to identify the FIRST.row and LAST.row in each By-group.

Identifying FIRST.row in by-groups

The FIRST.row technique, shown below, is constructed using an SQL subquery to identify the first, or beginning, row in each by-group. The process begins by producing a new table that contains the desired by-group order, physically sorting the rows in the MOVIES table in ascending order by the primary variable, RATING, and then in ascending order by the secondary variable, TITLE. Note: The outer query serves as a control group to ensure that processing takes place in the desired by-group order. Next, the MOVIES table is processed to evaluate the results of the MIN function specified in the inner most query (or subquery). The MIN function’s purpose is to capture the smallest (or minimum) value (for numeric variables), or minimum letter (for character variables) and use this value in the WHERE clause comparison. The previous operations are then repeated for each unique by-group in the MOVIES table.

/** ROUTINE.....: FIRST-BY-GROUP-ROW                                    **/
/** PURPOSE.....: Identify the first (min) row in each by-group using a **/
/**               subquery.                                             **/
proc sql;
  create table first_bygroup_row as
    select rating,
           'FirstRow' as ByGroup
      from movies M1
        where title =
               (select MIN(title)
                  from movies M2
                    where M1.rating = M2.rating)
          order by rating, title;

FIRST.row Results


Identifying LAST.row in by-groups

The LAST.row technique, shown below, is constructed using an SQL subquery, with its purpose to identify the last, or ending, row in each by-group. The process begins by producing a new table that contains the desired by-groups, physically sorting the rows in the MOVIES table in ascending order by the primary variable, RATING, and then in ascending order by the secondary variable, TITLE. Note: The outer query serves as a control group to ensure that processing takes place in the desired by-group order. Next, the MOVIES table is processed to evaluate the results of the MAX function specified in the inner most query (or subquery). The MAX function’s purpose is to capture the maximum value (for numeric variables), or maximum letter (for character variables) and use this value in the WHERE clause comparison. The previous operations are then repeated for each unique by-group in the MOVIES table.

/** ROUTINE.....: LAST-BY-GROUP-ROW                                    **/
/** PURPOSE.....: Identify the last (max) row in each by-group using a **/
/**               subquery.                                            **/
  create table last_bygroup_row as
    select rating,
           'LastRow' as ByGroup
      from movies M1
        where title =
               (select MAX(title)
                  from movies M2
                    where M1.rating = M2.rating)
          order by rating, title;

LAST.row Results


Further Considerations

The collating sequence for character variables under ASCII (UNIX, Windows, and OpenVMS) and EBCDIC (z/OS) are different. The default ASCII collating sequence sorts special characters and digits first, followed by uppercase letters, and finally lowercase letters. In contrast, the default EBCDIC collating sequence sorts special characters first, followed by lowercase letters, then uppercase letters, and digits last. Due to the encoding differences, users are encouraged to consult additional references on the topic of sort order and collating sequences.

Users may also want to consider using multi-threaded sorting for reducing the real time to completion for sort operations. New with SAS System 9, the SAS System option, THREADS, takes advantage of multi-threaded sorting which may be able to reduce CPU resources and costs.

For more information about the SQL procedure, check out Kirk Paul Lafler’s book, PROC SQL: Beyond the Basics Using SAS, Second Edition.

Post a Comment

Resolve to improve your SAS programming skills

Happy New Year to you, your families, and your colleagues!

This is the time of year when people make all types of New Year's resolutions. According to the USA.Gov web site, the most popular resolutions are:

  1. Lose Weight
  2. Volunteer to Help Others
  3. Quit Smoking
  4. Get a Better Education
  5. Get a Better Job
  6. Save Money
  7. Get Fit
  8. Eat Healthy Food
  9. Manage Stress
  10. Manage Debt
  11. Take a Trip
  12. Reduce, Reuse, and Recycle
  13. Drink Less Alcohol

And, these resolutions all make sense because they promote personal growth. It is a new year, so why not start making the changes that you want to carry forward throughout the year?

Along with personal changes, perhaps it is also time to consider what changes you could make to improve your SAS programming skills. There is no "most popular list of SAS resolutions" for us to reference, so I will suggest some:

  1. Lose unnecessary variables and observations when inputting data sets
  2. Volunteer to help junior SAS programmers
  3. Quit writing programs without comments in them
  4. Get a better understanding of the intricacies of PROC SQL
  5. Get more sophisticated in your SAS Macro programming
  6. Save time by re-purposing your old programs
  7. Purchase some interesting SAS Press books... and read them
  8. Join a local SAS users group and attend meetings
  9. Get better at using the Output Delivery System
  10. Take a SAS class
  11. Write a technical paper for SAS Global Forum
  12. Reduce processing time by writing more efficient programs
  13. Subscribe to the SAS communities discussion groups on

Hardly a comprehensive list, but definitely a good start, no? And, I would bet that one or more of these resolutions resonate with you; that they are things you have been wanting to do, but have not quite found the time to work on in 2014. Well, there are 360 more days of 2015 just waiting for you to dig into your own SAS resolutions.

Of course, you could simply sit back and do the same old, same old. But, how could you continue to achieve personal growth as a SAS programming professional if you did that?

Meet the new year; same as the old year--Oh, we really think not!

Post a Comment

5 best-selling SAS Press books of 2014


SAS Press warehouse

When it comes to exchanging gifts at Christmas, my brother and I have a tradition of buying each other books. Even when he lived away in Chicago, and everyone else in the world was buying eBooks, I could always count on a big, heavy box delivered to my house each year filled with new authors and titles he thought I’d enjoy.

If you’re buying for a book lover, who also uses SAS, our list of the best-selling books this year might be a helpful shopping guide.

Top 5 Books

  1. The Little SAS Book: A Primer, Fifth Edition
  2. Big Data Analytics: Turning Big Data into Big Money
  3. SAS Certification Prep Guide: Base Programming for SAS 9, Third Edition
  4. SAS For Dummies, Second Edition
  5. Learning SAS by Example: A Programmer’s Guide

We also have a holiday spectacular sale going on right now with 75 percent off select titles.

The best part about giving someone a book is hunting for that special one that you think they’ll enjoy. I stumbled upon the perfect book for my brother months ago and it’s something he’ll never guess. I can’t wait until he opens it!

Do you have a similar book tradition with family or friends? Please share in the comments section.

Post a Comment