The SELECT-WHEN statement in the SAS DATA step is an alternative to using a long sequence of IF-THEN/ELSE statements. Although logically equivalent to IF-THEN/ELSE statements, the SELECT-WHEN statement can be easier to read. This article discusses the two distinct ways to specify the SELECT-WHEN statement. You can use the first syntax when you are making logical decisions based on the possible values of a discrete variable. The second syntax enables you to use more complex logical expressions. In brief:
- Use the first syntax when the possible values of a single discrete variable are known in advance.
- Use the second syntax when the logic involves multiple variables or complex logic.
The first syntax for the SELECT-WHEN statement
A previous article shows the first syntax for the SELECT-WHEN statement and includes several examples. However, I will provide only a simple example here. The first syntax requires that you specify an expression on the SELECT statement. On the WHEN statement, you specify one of more values that the expression can take. Following the WHEN statement, you specify a statement that you want to perform when the WHEN expression is true. If you want to perform more than one statement, you can use a DO-END block to perform multiple operations.
It sounds more difficult than it is, so let's see an example. The following SAS data set contains dates in 2020. Some of them are well-known religious holidays, others are US holidays, and others are "personal holidays" that I like to celebrate with my friends, such as Leap Day, Pi Day, Cinco de Mayo, and Halloween. The data set contains the date of the holiday and a name. You can use a SELECT-WHEN statement to classify each holiday as Religious, USA, or Personal, as follows:
data Dates; length Holiday $11; input Holiday 1-12 Date date9.; format Date date9.; datalines; MLK 20JAN2020 Leap 29FEB2020 Pi 14MAR2020 StPatrick 17MAR2020 Easter 12APR2020 CincoDeMayo 05MAY2020 Memorial 25MAY2020 Labor 07SEP2020 Halloween 31OCT2020 Christmas 25DEC2020 ; data Holidays; length Type $12; set Dates; select(upcase(Holiday)); /* specify expression on the SELECT statement */ when('MLK','MEMORIAL','LABOR') Type="USA"; /* handle possible values of expression */ when('EASTER','CHRISTMAS') Type="Religious"; otherwise Type="Fun"; /* handle any value not previously handled */ end; run; proc print data=Holidays noobs; var Date Holiday Type; run; |
Notice that the SELECT statement branches on the expression UPCASE(Holiday). The new data set has a Type variable that classifies each holiday according to the value of UPCASE(Holiday).
In the SELECT-WHEN syntax, the OTHERWISE statement is optional but is recommended. If you do not have an OTHERWISE statement, and all the WHEN statements are false, the DATA step will stop with an error. You can place use a null statement after the OTHERWISE keyword to specify that the program should do nothing for unhandled values, as follows:
OTHERWISE ; /* semicolon ==> null statement ==> do nothing */
By the way, if you have only the dates, you can still form the Type variable by using the HOLIDAYNAME function in SAS. This is left as an exercise.
As stated previously, the SELECT-WHEN statement provides the same functionality as an IF-THEN/ELSE statement, but it is a simpler way to handle multiple values. For example, the equivalent IF-THEN/ELSE logic is as follows:
if upcase(Holiday)='MLK' | upcase(Holiday)='MEMORIAL' | upcase(Holiday)='LABOR' then Type="USA"; else if ... |
The second syntax for the SELECT-WHEN statement
The first SELECT-WHEN syntax enables you to perform an action for any value of an expression. Usually, you will branch on values of a discrete variable.
You can use the second SELECT-WHEN syntax when you need more complex logical processing. In the second syntax, you do not specify an expression on the SELECT statement. Instead, you specify a logical expression on each WHEN statement. The WHEN statements are executed in order until one is true. The corresponding action is then executed. If no WHEN statements are true, the OTHERWISE statement is processed.
The previous example uses the name of a holiday to assign values to the Type variable. Many holidays do not have fixed dates but rather dates that vary from year to year. For example, US Thanksgiving is the fourth Thursday of November, and MLK Day is the third Monday of January. Easter is even more complicated. Luckily, SAS provides the HOLIDAY function, which enables you to obtain the dates of common holidays for each year. For each date in the data set, you can use the HOLIDAY function to determine whether the date corresponds to a religious or USA holiday. The following SELECT-WHEN statement shows how to implement the second syntax:
data Want; length Type $12; set Dates; Year = year(Date); /* get the year for the date */ select; /* no expression on SELECT statement */ when(holiday('MLK', Year)=Date | /* logic to determine type of holiday */ holiday('Memorial', Year)=Date | holiday('USIndependence',Year)=Date | holiday('Labor', Year)=Date | mdy(9, 11, Year) =Date | /* Patriot Day 9/11 */ holiday('Veterans', Year)=Date | holiday('Thanksgiving', Year)=Date) Type="USA"; when(holiday('Easter', Year)=Date | holiday('Christmas', Year)=Date ) Type="Religious"; otherwise Type="Fun"; end; run; proc print data=Want noobs; var Date Holiday Type; run; |
This syntax for the SELECT-WHEN statement looks even more like an IF-THEN/ELSE statement.
Consider PROC FORMAT for recoding variables
On discussion forums, I see a lot of questions like this example in which programmers are creating a new variable from the values of an existing variable. In simple situations, you can use PROC FORMAT to perform this task. For more complicated situations, you can even define your own function to use as a format.
Summary
This article shows two ways to specify the SELECT-WHEN statement in SAS. The SELECT-WHEN statement is an alternative to using multiple IF-THEN/ELSE statements. For simple logical processing, you can specify an expression on the SELECT statement and specify values for the expression on the WHEN statements. For more complicated logic, you can put all the logic into the WHEN statements.
2 Comments
Hi Rick
I am using similar concept to one on you first code to create two variables with select - when statement but I am receiving an error which I am unable to decipher. Here is the log page
166 /* Employment Status derived from variable EMPLOY1*/
167 length emp1l emp1l1 $50;
168 select (EMPLOY1);
169 when (1, 2) emp1l = 1; emp1l1 = "1. Employed";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
170 when (3, 4, 8) emp1l = 2; emp1l1 = "2. Unemployed";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
171 when (5) emp1l = 3; emp1l1 = "3. Homemaker";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
172 when (6) emp1l = 4; emp1l1 = "4. Student";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
173 when (7) emp1l = 5; emp1l1 = "5. Retired";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
174 otherwise emp1l = .;
175 end;
185 /* Race/Ethnicity derived from variable _RacePRV (Updated 2022 to _RacePR1*/
186 length _RacePRV1 _RacePRV2 $50;
187 select (_RacePRV);
188 when (1) _RacePRV1 = 1; _RacePRV2 = "1. White, NH";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
189 when (8) _RacePRV1 = 2; _RacePRV2 = "2. Hispanic";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
190 when (2) _RacePRV1 = 3; _RacePRV2 = "3. Black, NH";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
191 when (4) _RacePRV1 = 4; _RacePRV2 = "4. Asian, NH";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
192 when (7) _RacePRV1 = 5; _RacePRV2 = "5. Multi NH";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
193 when (3) _RacePRV1 = 6; _RacePRV2 = "6. AIAN NH";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
194 when (5) _RacePRV1 = 7; _RacePRV2 = "7. NHOPI";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
195 when (6) _RacePRV1 = 8; _RacePRV2 = "8. Other Race, NH";
ERROR: Expecting "WHEN", "OTHERWISE", or "END".
196 end;
197
You can ask programming questions at the SAS Support Communities. An expert there will probably suggest that you click on the link to the documentation in the first line of the article. The section "Example 2: Using DO Groups" shows how to process more than one statement for each WHEN condition.