In SAS, the INPUT and PUT functions are powerful functions that enable you to convert data from character type to numeric type and vice versa. They work by applying SAS formats or informats to data. You cannot fully understand the INPUT and PUT functions without understanding formats and informats in SAS. This article describes when and how to use the INPUT and PUT functions.
INPUT and PUT functions: What are the input and output types?
Many SAS conference papers (and blogs) have been written about the INPUT and PUT functions. Unfortunately, most of them describe the INPUT and PUT functions as "functions that convert character values to numeric values and vice versa." This is true: the functions can convert a string such as "1.23" into the number 1.23, and vice versa. But this description is incomplete.
Furthermore, these discussions often focus on the types of the source and target variables. For example, they might present simple rules such as the following:
- For the INPUT function, the argument to the function must be a character value; the function can return either a numeric value or a character value.
- For the PUT function, the argument can be either a numeric value or a character value; the function always returns a character value.
These rules are correct, but not insightful. I think these rules miss the point. The purpose of these functions is to apply informats and formats. To understand these functions, remember that the INPUT function applies an informat, whereas the PUT function applies a format.
A review of SAS formats and informats
Formats and informats are the key to understanding the INPUT and PUT functions, so let's quickly review those SAS concepts. Briefly:
- An informat reads a text string and converts it to a data value that is easier to work with or analyze.
- A format converts a data value to a textual representation that is (hopefully!) easier to read and interpret.
A canonical example is a date. We humans like to see a date written as a text string, such as "21MAR2020" or "Mar 21, 2020." But SAS stores a date as the number of days since 01JAN1960, so it stores the date for "21MAR2020" as the number 21995. We humans would have a hard time entering dates by typing integers, so SAS helps us out. An informat (such as DATE9.) enables the SAS DATA step to read the string "21MAR2020" from a text file or DATALINES block. The informat tells SAS to store that datum as the number 21995. A format (such as WORDDATE12.) tells SAS to display the date as a text string such as "Mar 21, 2020". Thus, informats are for reading data into SAS, whereas formats control how data are displayed in tables and logs.
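The date example can be sketched in a few lines. This is only an illustration: the data set name Dates and the variable name d are my own choices. The INPUT statement applies the DATE9. informat to read the text, and the FORMAT statement attaches the WORDDATE12. format for display.

```sas
data Dates;
   input d date9.;          /* informat: read the text "21MAR2020", store a number */
   format d worddate12.;    /* format: display the stored number as text */
   datalines;
21MAR2020
;
run;

/* Display the value by using the attached format: Mar 21, 2020 */
proc print data=Dates noobs; run;

/* Override the format to see the stored number: 21995 */
proc print data=Dates noobs;
   format d best12.;
run;
```

The stored value never changes between the two PROC PRINT calls. Only the format, and therefore the displayed text, is different.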
The INPUT and PUT statements are familiar to many DATA step programmers, but the following list summarizes how they work:
- The INPUT statement reads data into SAS. When you use the DATA step to read data, each datum starts as a textual representation in a file or in the DATALINES block. You use the INPUT statement to read the data into either a character or numeric variable. By default, the INPUT statement uses the BEST. informat to read a number and uses the $w. informats to read character data. Of course, you can override these defaults by specifying an informat on the INPUT statement.
- The PUT statement is used to output data. When you use the PUT statement to write to a file or to the SAS log, the raw data can be any type, but the file or log always contains text. By default, the PUT statement applies the BEST12. format to convert numbers to text. For outputting strings, the PUT statement doesn't need to convert the data, but for uniformity you can think of the PUT statement as applying the $w. format to strings.
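To make the two statements concrete, here is a small sketch that reads one line of text and writes it to the log. The variable names (name, x, amount) and the data line are invented for the example. The first two variables are read with default informats; the third overrides the default by specifying the COMMA9. informat.

```sas
data _null_;
   /* name: default character informat; x: default numeric informat;  */
   /* amount: the COMMA9. informat overrides the default              */
   input name $ x amount :comma9.;

   /* Default formats are used to convert the numbers to text */
   put name= x= amount=;

   /* Override the default by specifying a format */
   put amount dollar10.2;
   datalines;
Widget 9.87E-4 1,234
;
run;
```

In both statements, the raw material on one side is always text: the INPUT statement starts with text, and the PUT statement ends with text.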
The INPUT and PUT functions have the same purpose and behaviors as the statements of the same name.
Examples of the INPUT function
Because the INPUT function applies an informat, it is obvious that the argument to the INPUT function must be a character value. (It is also obvious that the result of the INPUT function will be either numeric or character, depending on the informat that you apply.) For example, the text string "1,234" can be converted to the number 1234 by applying the COMMAw. informat. Or the string "01JAN1960" can be converted to a SAS date (which is a number) by applying the DATEw. informat. The following DATA step shows examples of using the INPUT function to convert text to numbers:
data Input_examples;
   /* The CHAR array contains input values; the NUM array contains output values */
   array char[3] $9 ("1,234", "9.87E-4", "21MAR2020");
   array num[3];
   num[1] = input(char[1], COMMA9.);
   num[2] = input(char[2], BEST9.);
   num[3] = input(char[3], DATE9.);
run;

proc print noobs; run;
It is possible, but less common, to apply a character informat. One useful application is to standardize text values by applying the $UPCASEw. informat. For example, you might wish to prevent coding errors by ensuring that political affiliation is stored in uppercase values such as "DEM", "GREEN", "LIB", and "REPUB".
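Here is one way the political-affiliation example might look. The data set name, the variable names (raw, party), and the mixed-case data values are made up for the illustration.

```sas
data Party;
   length raw party $5;
   input raw $;
   /* A character informat returns a character value */
   party = input(raw, $upcase5.);
   datalines;
Dem
green
LIB
Repub
;
run;

proc print data=Party noobs; run;
```

After the conversion, every value of the party variable is uppercase, regardless of how the value was typed.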
Examples of the PUT function
Because the PUT function applies a format, it is obvious that the argument to the PUT function can be any data type, but the result will be a character value. Applying a format always results in text! For example, the number 67.8 can be converted to the text string "$67.80" by applying the DOLLAR8.2 format. Or the SAS date value 21995 can be converted to the text string "Mar 21, 2020" by applying the WORDDATE12. format. The following DATA step shows examples of using the PUT function to convert data to character strings:
data Put_examples;
   array num[2] (67.8, 21995);    /* Note: 21995 = '21MAR2020'd */
   array char[2] $12;
   char[1] = put(num[1], DOLLAR8.2);
   char[2] = put(num[2], WORDDATE12.);
run;

proc print noobs; run;
You can also use the PUT function to apply a character format, such as $UPCASE., to a character value.
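A minimal sketch of applying a character format follows. The variable names and the value "Green" are chosen only for the illustration.

```sas
data _null_;
   party = "Green";
   /* A character format applied to a character value returns text */
   upper = put(party, $upcase5.);
   put party= upper=;    /* writes party=Green upper=GREEN to the log */
run;
```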
Summary
In summary, the INPUT and PUT functions are similar to the INPUT and PUT statements, respectively. To understand the functions, first understand the corresponding statements:
- When you read data into SAS, you use the INPUT statement. The INPUT statement reads a string and converts it into a number or a string by applying an informat (possibly the default informat).
- When you output text to a file or to the SAS log, you use the PUT statement. For some data, especially dates and times, it is useful to apply a format.
The INPUT and PUT functions are the functional equivalents of the INPUT and PUT statements. You can use the functions in conjunction with IF-THEN/ELSE statements or other logical statements. A common application of the INPUT function is to convert a text string to a number (including dates and times). You can use the PUT function to create a new variable that recodes an existing variable, although I strongly suggest using a FORMAT statement to recode values.
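The following sketch combines these ideas. Everything in it is an assumption made for illustration: the AgeGrp. format, its cutoff at 18, the variable names, and the data values are all hypothetical. It also uses the ?? modifier of the INPUT function, which is not discussed in this article; the modifier suppresses the log messages that SAS otherwise writes when the text is not a valid number.

```sas
/* A hypothetical user-defined format that recodes age into two groups */
proc format;
   value AgeGrp  low -< 18 = "Minor"
                 18 - high = "Adult";
run;

data Recode;
   length ageText $3 group $5;
   input ageText $;
   /* Convert text to a number; invalid text results in a missing value */
   age = input(ageText, ?? 3.);
   /* Use the functions together with IF-THEN/ELSE logic */
   if missing(age) then group = "N/A";
   else group = put(age, AgeGrp.);
   datalines;
12
45
abc
;
run;

proc print data=Recode noobs; run;
```

The PUT function creates a new character variable (group). The alternative is to leave the age variable alone and apply the AgeGrp. format by using a FORMAT statement in whatever procedure analyzes the data.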
Further reading
For more examples and a discussion of how to handle invalid data, see "How to convert a character value to numeric in SAS" in the SAS Communities Library.
2 Comments
G'Day Rick,
Reading this blog post reminded me of a mnemonic SAS blog post I wrote a while ago (11 years ago!) on INPUT and PUT... https://blogs.sas.com/content/sastraining/2013/02/26/rhymes-mnemonics-and-tips-in-learning-sas/
Hope you are keeping well!
Cheers,
Michelle