SAS takes your word for it

2

Rotman_2Last year, 50 companies came knocking on the doors of Rotman School of Management in Canada to recruit Masters of Business Administration (MBA) grads with SAS skills. This year, Rotman partnered with SAS to offer SAS Programming to interested candidates.

Last Sunday, I had the amazing opportunity to teach SAS Programming to 60 students at the university. The country’s future leaders, some of the brightest minds, gathered together on this cold Sunday morning. Even skipping out on the Santa Claus parade to learn SAS. You’re probably wondering, “Why on Earth did you pick Sunday to teach?” It was to work with their timing, given their super busy schedules.  I experienced great commitment from all of the 60 students that showed up. They had many questions and were completely engaged.

A big question for them was SAS behavior when they tried to group sorted data. Take a look below:

1. We sorted the sales dataset BY country (in default ascending order) and within that BY Salary in descending order (using the descending keyword guarantees the order)

19162  libname orion 'c:\workshop';
NOTE: Libref ORION was successfully assigned as follows:
      Engine:        V9
      Physical Name: c:\workshop
 
19163  proc sort data=orion.sales
19164             out=work.sales;
19165     by Country descending Salary;
19166  run;
 
NOTE: There were 165 observations read from the data set ORION.SALES.
NOTE: The data set WORK.SALES has 165 observations and 9 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.06 seconds
      cpu time            0.01 seconds

2. Quick peek at the sorted dataset

SASdata

That looks about right. Data is in order BY country (primary sort key) and within that BY descending salary.

3. Students then submitted this code to group data BY salary

19167  proc print data=work.sales noobs;
19168      by salary;
19169  run;
 
ERROR: Data set WORK.SALES is not sorted in ascending sequence. The current BY
       group has Salary = 108255 and the next BY group has Salary = 87975.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 2 observations read from the data set WORK.SALES.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.33 seconds
      cpu time            0.01 seconds

But why did SAS complain?

Because SAS assumes that you are telling the truth. SAS looks at the BY statement in your PROC PRINT. Any time it sees a BY statement in either a procedure (PROC) or a DATA step, the assumption SAS makes is that the data is sorted by that variable.

“Haven’t I sorted the data by salary,” asked one of the students.

Yes you have, but it’s not the primary sort order. You have to respect that. See the data again. When SAS sees your PROC PRINT step with a BY salary, it expects your data to be sorted BY Salary in ascending order. But the data was not in that order, the previous PROC SORT you specified sorted the data by Country and within that BY descending salary.

That was the mismatch for SAS. On one hand it wants to believe you. That’s why it tries to group the data by salary. But when SAS looks at the data, just like you did in the data grid, it sees that the first row Salary value is 108255. And the next row salary has a value of 87975.

This is when it says, “Wait a minute.” The sales data is not in ascending sorted order BY salary. And so it sighs, throws up its hands and comes to a complete stop.

This is why it’s really critical that you know your data before you work with it. Try a PROC CONTENTS before to check the sort order. And then at the very least at least try to respect the primary sort order key. What that means is that if you sorted your data BY country, gender, and descending salary, try to repeat this order in your BY statement. Or at the very least try to respect the primary sort order. You can safely get away with a BY statement order of just country. Or country and gender.

Hope this helps you understand SAS a little bit better and the integrity with which it operates.

The students also asked for some yoga in between to wake up their minds. We moved in unison to wake up mind and body. I’ll leave you with a super brain yoga tip.

Try it anytime you feel an energy slump and want to wake up a tired brain. This is an amazing technique, and if practiced regularly can get you in the flow experienced by top performing athletes. Take my word for it!!

Rotman_1

Share

About Author

Charu Shankar

Technical Training Specialist

Charu Shankar has been a Technical Training Specialist with SAS since 2007. She started as a programmer, and has taught computer languages, business and English Language skills. At SAS, Charu teaches the SAS language, SQL, SAS Enterprise guide and Business Intelligence. She interviews clients to recommend the right SAS training to help them meet their needs. She is helping build a center for special needs kids in this project. http://www.handicareintl.org/pankaja/pankaja.swf

Related Posts

2 Comments

  1. thanks Chris. for your valuable hint. I learned something new from your post.. that the database automatically passes the sort instruction. thanks so much.

  2. Chris Hemedinger
    Chris Hemedinger on

    Charu, this explanation is perfect for understanding how SAS deals with SAS data sets, which are stored in a specific sequence that affects the data processing. For those who access database tables (such as Oracle or even Microsoft Access), the SAS/ACCESS engines help you by delegating the sort work to the database, so you don't need a separate sort step. See an explanation in this post about how SAS works with databases to access data in the proper sort sequence.

Back to Top