Last year, 50 companies came knocking on the doors of the Rotman School of Management in Canada to recruit Master of Business Administration (MBA) grads with SAS skills. This year, Rotman partnered with SAS to offer SAS Programming to interested candidates.
Last Sunday, I had the amazing opportunity to teach SAS Programming to 60 students at the university. Some of the country’s future leaders and brightest minds gathered on a cold Sunday morning, even skipping the Santa Claus parade to learn SAS. You’re probably wondering, “Why on Earth did you pick Sunday to teach?” It was to fit around their super busy schedules. All 60 students who showed up were fully committed: they had many questions and were completely engaged.
A big question for them was SAS behavior when they tried to group sorted data. Take a look below:
1. We sorted the sales dataset BY Country (in the default ascending order) and, within each country, BY Salary in descending order (the DESCENDING keyword before Salary guarantees that order).
19162  libname orion 'c:\workshop';
NOTE: Libref ORION was successfully assigned as follows:
      Engine:        V9
      Physical Name: c:\workshop
19163  proc sort data=orion.sales
19164            out=work.sales;
19165     by Country descending Salary;
19166  run;
NOTE: There were 165 observations read from the data set ORION.SALES.
NOTE: The data set WORK.SALES has 165 observations and 9 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.06 seconds
      cpu time            0.01 seconds
2. Quick peek at the sorted dataset
That looks about right. Data is in order BY country (primary sort key) and within that BY descending salary.
3. Students then submitted this code to group data BY salary
19167  proc print data=work.sales noobs;
19168     by salary;
19169  run;
ERROR: Data set WORK.SALES is not sorted in ascending sequence. The current BY group has Salary = 108255 and the next BY group has Salary = 87975.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 2 observations read from the data set WORK.SALES.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.33 seconds
      cpu time            0.01 seconds
But why did SAS complain?
Because SAS assumes that you are telling the truth. SAS looks at the BY statement in your PROC PRINT. Any time it sees a BY statement in either a procedure (PROC) or a DATA step, the assumption SAS makes is that the data is sorted by that variable.
“Haven’t I sorted the data by salary?” asked one of the students.
Yes, you have, but Salary is not the primary sort key, and you have to respect that. Look at the data again. When SAS sees your PROC PRINT step with BY salary, it expects your data to be sorted BY Salary in ascending order. But the data was not in that order: the earlier PROC SORT sorted it BY Country and, within each country, BY descending Salary.
That was the mismatch for SAS. On one hand, it wants to believe you, which is why it tries to group the data by salary. But when SAS looks at the data, just like you did in the data grid, it sees that the Salary value in the first row is 108255 and the value in the next row is 87975.
This is when it says, “Wait a minute.” The sales data is not in ascending sorted order BY salary. And so it sighs, throws up its hands and comes to a complete stop.
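If grouping by salary really is what you want, one way out is to give SAS data that matches the BY statement. Here is a minimal sketch, assuming the WORK.SALES dataset from the steps above; the output dataset name WORK.SALES_BY_SALARY is just a hypothetical name picked for illustration.

```sas
/* Re-sort so that Salary becomes the primary (ascending) sort key. */
/* WORK.SALES_BY_SALARY is a hypothetical name used for this sketch. */
proc sort data=work.sales
          out=work.sales_by_salary;
   by Salary;
run;

/* The BY statement now agrees with the actual order of the data. */
proc print data=work.sales_by_salary noobs;
   by Salary;
run;
```

Writing to a separate output dataset keeps the original Country and descending Salary order in WORK.SALES intact in case you still need it.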
This is why it’s really critical that you know your data before you work with it. Run a PROC CONTENTS first to check the sort order. Then, at the very least, respect the primary sort key. What that means is that if you sorted your data BY country, gender, and descending salary, try to repeat this order in your BY statement. You can also safely get away with a BY statement of just country, or of country and gender.
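As a rough sketch of that check, assuming WORK.SALES was created by the PROC SORT step shown earlier: PROC CONTENTS reports how the dataset was sorted, and a BY statement that uses only the leading sort key groups the data without complaint.

```sas
/* Look for the "Sort Information" section in the output;   */
/* its Sortedby entry lists the sort keys of the dataset.   */
proc contents data=work.sales;
run;

/* Country is the primary sort key, so grouping by it works. */
proc print data=work.sales noobs;
   by Country;
run;
```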
Hope this helps you understand SAS a little bit better and the integrity with which it operates.
The students also asked for some yoga in between to wake up their minds. We moved in unison to wake up mind and body. I’ll leave you with a super brain yoga tip.
Try it anytime you feel an energy slump and want to wake up a tired brain. This is an amazing technique and, if practiced regularly, can get you into the flow experienced by top-performing athletes. Take my word for it!!