A lookup table is a programming technique where one or more values can be used to retrieve another value. For example, many years ago, I had benzene exposure estimates for 10 years (1940 to 1949) for each of five locations in a factory. Given a year and a job location, I needed to know the benzene concentration.
I would be terribly embarrassed today if anyone saw the first program I wrote to solve the problem! This blog shows a better way that uses temporary arrays to create an n-way lookup table. To keep the example simple, let's use five years of data (1944 to 1948) and four locations (1 to 4).
Before we get into the program, let's discuss temporary arrays, one of my favorite SAS tools. Here is an example of a one-dimensional temporary array:
data Pass_Fail;
   input ID $ Grade1 - Grade5;
   /* Passing score for each of the five tests */
   array PF[5] _temporary_ (65 70 55 65 55);
   /* If you leave off the variable list, SAS uses the array name with
      the numbers 1-5 added. In this example, the variables are
      Grade1, Grade2, etc. */
   array Grade[5];
   array Pass_or_Fail[5] $ 4;
   do i = 1 to 5;
      if Grade[i] ge PF[i] then Pass_or_Fail[i] = 'Pass';
      else if not missing(Grade[i]) then Pass_or_Fail[i] = 'Fail';
   end;
   drop i;
datalines;
001 90 68 52 70 72
002 56 69 72 75 88
;
title "Listing of Data Set Pass_Fail";
proc print data=Pass_Fail noobs;
run;
In this example, the temporary array is called PF (pass/fail values), and it has five elements. There are no actual variables PF1, PF2, and so on, only array elements PF[1], PF[2], and so on. The initial values of the five passing grades are placed in parentheses following the keyword _temporary_. In many situations, you load the values of the temporary array from a data file.
To keep this first example easy to understand, we will put the initial values in the array statement. You can now compare each student's grade for every test and assign a value of "Pass" or "Fail."
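The output listing shows that student 001 passes tests 1, 4, and 5 and fails tests 2 and 3, while student 002 passes tests 3, 4, and 5 and fails tests 1 and 2.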
Note: You can read a blog that I wrote years ago on temporary arrays for another example.
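As mentioned earlier, the values of a temporary array are often loaded from data rather than typed into the ARRAY statement. The following is a minimal sketch of that idea, not code from the original program. It assumes a hypothetical data set named Passing with one observation per test, holding the hypothetical variables Test and Score.

```sas
/* Hypothetical data set: one passing score per test */
data Passing;
   input Test Score;
datalines;
1 65
2 70
3 55
4 65
5 55
;

data Pass_Fail;
   array PF[5] _temporary_;
   /* Load the five passing scores once, on the first iteration */
   if _n_ = 1 then do i = 1 to 5;
      set Passing;
      PF[Test] = Score;
   end;
   input ID $ Grade1 - Grade5;
   array Grade[5];
   array Pass_or_Fail[5] $ 4;
   do i = 1 to 5;
      if Grade[i] ge PF[i] then Pass_or_Fail[i] = 'Pass';
      else if not missing(Grade[i]) then Pass_or_Fail[i] = 'Fail';
   end;
   drop i Test Score;
datalines;
001 90 68 52 70 72
002 56 69 72 75 88
;
```

Because the SET statement executes only inside the `_n_ = 1` block, the DATA step still ends when the INPUT statement runs out of data lines, and the passing scores stay in PF for every student.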
Now for the two-way table lookup example.
*Two-dimensional table lookup using a temporary array;
data Lookup;
   array Benzene[1944:1948,4] _temporary_;   /* ① */
   /* Populate the array */
   if _n_ = 1 then do Year = 1944 to 1948;   /* ② */
      do Location = 1 to 4;
         input Benzene[Year,Location] @;     /* ③ */
      end;
   end;
   input Subj $ Year Location;
   Benzene_Level = Benzene[Year, Location];  /* ④ */
datalines;
250 200 150 130
90 180 155 90
95 35 170 140
80 50 45 100
40 50 25 15
001 1944 3
002 1948 1
003 1945 4
;
title "Listing of Data Set Lookup";
proc print data=Lookup noobs;
run;
① This ARRAY statement creates an array with two dimensions (you use a comma to create multiple dimensions). To make programming easier to understand, the first dimension of the array uses subscripts 1944 to 1948, rather than 1 to 5 (the colon enables you to specify the lower and upper bounds of an array). Also notice that there are no initial values in this statement—they will be read from data.
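② This section of code populates the values in the Benzene temporary array. You use the statement if _n_ = 1 to ensure that this section of code executes only once. Because the elements of a temporary array are automatically retained, the values remain available on every later iteration of the DATA step.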
③ The INPUT statement reads a benzene value for each combination of Year and Location. The single trailing @ sign prevents SAS from going to a new line each time the DO loop iterates.
④ Notice how easy it is to retrieve an exposure value, given a value of Year and Location. The first five lines of data are the values used to populate the temporary array.
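A lookup like this stops with an "array subscript out of range" error if a record contains a year or location that is not in the table. The sketch below is one possible guard, not part of the original program: it uses the LBOUND and HBOUND functions to check both subscripts before the lookup. The data set name Lookup_Checked and subject 004 (with the out-of-range year 1950) are hypothetical additions for illustration.

```sas
data Lookup_Checked;
   array Benzene[1944:1948,4] _temporary_;
   /* Populate the array once */
   if _n_ = 1 then do Year = 1944 to 1948;
      do Location = 1 to 4;
         input Benzene[Year,Location] @;
      end;
   end;
   input Subj $ Year Location;
   /* Look up a value only when both subscripts are within bounds */
   if lbound(Benzene,1) le Year le hbound(Benzene,1) and
      lbound(Benzene,2) le Location le hbound(Benzene,2) then
      Benzene_Level = Benzene[Year,Location];
   else Benzene_Level = .;
datalines;
250 200 150 130
90 180 155 90
95 35 170 140
80 50 45 100
40 50 25 15
001 1944 3
002 1948 1
003 1945 4
004 1950 2
;
```

With this check, subject 004 gets a missing value for Benzene_Level instead of halting the DATA step. A missing Year or Location also fails the comparison, because SAS treats a missing numeric value as lower than any number.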
You can read more about temporary arrays in my book, Learning SAS by Example: A Programmer's Guide, Second Edition.
Comments on this blog are welcome.