Run-time variations of the INPUT and PUT functions in SAS


The INPUT function and PUT function in SAS are used to apply informats and formats (respectively) to data. For both functions, you must know in advance which informat or format you want to apply.

For brevity, let's consider only applying a format. To use the PUT function, you must know the exact name of the format at the time that you write the program. Programmers call this situation "knowing the details at compile time," which can be restrictive because sometimes the details are unknown until run time. Delaying a decision until run time makes it possible to write dynamic, data-driven code, in which the format depends on the data.

But what if you do not know the name of the format until run time? Fortunately, SAS has you covered! The next sections describe the PUTN and PUTC functions, which apply formats, and the INPUTN and INPUTC functions, which apply informats.

Run-time variations of the PUT and INPUT functions

SAS provides two variations of the PUT function that enable you to specify the name of a format at run time:

  • The PUTC function enables you to apply a format to character data. This is not often done, although the $UPCASEw. format can be used to standardize data by displaying categories in uppercase.
  • The PUTN function enables you to apply a format to numeric data. This is useful when the format depends on the data.

These functions take a string for the name of the format, which means that the name can be assigned at run time. That is, whereas the PUT function uses a hard-coded format name, such as
    t = put(1234, COMMA6.);
the PUTN function requires a string argument, as follows:
    t = putn(1234, "COMMA6."); /* note quotes! */
This means that you can assign the name of the format to a variable, as follows:
    fmtname = "COMMA6."; /* assign name to a variable */
    t = putn(1234, fmtname);
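
The PUTC function works the same way for character data. The following is a minimal sketch rather than a definitive recipe: the data set name (putc_example), the variable names, and the sample values are made up for illustration. It reads the name of the format from the data and applies it to a character variable:

```sas
/* Sketch: apply a character format whose name is read at run time.
   The data set, variables, and values are hypothetical. */
data putc_example;
length category $8 fmtname $12 std_category $8;
input category $ fmtname $;
std_category = putc(category, fmtname);  /* fmtname is a string, not a literal format */
datalines;
small   $UPCASE8.
Medium  $UPCASE8.
LARGE   $UPCASE8.
;

proc print;
var category fmtname std_category;
run;
```

If the sketch behaves as intended, the std_category variable contains the uppercase version of each category.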

Similarly, SAS provides two run-time variations of the INPUT function:

  • The INPUTC function enables you to apply a character informat to a text string. A character informat begins with a dollar sign ($). This function is not used often.
  • The INPUTN function enables you to apply a numeric informat to a text string. This is very useful. For example, you can convert the string "21MAR2020" into a SAS date, which is numeric.

These functions take strings for the name of the informat, in contrast to the INPUT function, which requires a hard-coded name for the informat.
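
As a small illustration of the INPUTN function, the following sketch reads a text string and the name of an informat from each observation, then converts the string to a numeric SAS date. The data set name (inputn_example), the variable names, and the sample strings are assumptions that I chose for the example:

```sas
/* Sketch: apply a numeric informat whose name is read at run time.
   The data set, variables, and values are hypothetical. */
data inputn_example;
length str $12 infmtname $12;
input str $ infmtname $;
d = inputn(str, infmtname);   /* d is numeric: a SAS date value */
format d DATE9.;              /* display the numeric value as a date */
datalines;
21MAR2020   DATE9.
2020-03-21  YYMMDD10.
03/21/2020  MMDDYY10.
;

proc print;
var str infmtname d;
run;
```

The three strings are different spellings of the same day, so each observation should produce the same date value.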

Applying a format in a data-driven way

The power of the PUTC and PUTN functions is that you can apply a format at run time. A simple example is when the data set itself contains the name of the format that you should apply to the data. Here is an example that formats a date value in several different ways. The name of the SAS format is not known until run time:

data putn_examples;
length fmtname $12 date_str $15; 
input d fmtname;
date_str = putn(d, fmtname);
datalines;
21995  DATE9.
21995  YYMMDD.
21995  DDMMYY8.
21995  WORDDATE12.
;
 
proc print; 
var d fmtname date_str;
run;

Each observation contains a different format name. The output shows that the date_str variable is assigned based on the name of the format.

Another application could be to look at the value of the data before choosing a format. For example, if a customer lives in Europe, you could format their birthday by using the DDMMYY8. format (day first), whereas you could choose the MMDDYY8. format (month first) for US customers.
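
Here is one way that such a choice might be coded. This is only a sketch: the data set name (birthdays), the region variable, and the rule itself (day-month-year for Europe, month-day-year otherwise) are assumptions for illustration, not something dictated by SAS:

```sas
/* Sketch: choose the format name from another variable in the data.
   The data set, variables, and the region rule are hypothetical. */
data birthdays;
length region $6 fmtname $12 bday_str $10;
input region $ bday :date9.;
if region = "Europe" then fmtname = "DDMMYY8.";  /* day-month-year */
else fmtname = "MMDDYY8.";                       /* month-day-year */
bday_str = putn(bday, fmtname);
datalines;
Europe 21MAR2020
US     21MAR2020
;

proc print;
var region bday fmtname bday_str;
run;
```

The IF-THEN/ELSE logic could just as easily depend on the numeric value itself, such as its magnitude or sign.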

A data-driven construction of a format

Let's construct an example in which the value of the data determines the format, which is then applied by using the PUTN function. The DATA step below performs these steps to format integer data:

  1. Count the number of digits in the integer.
  2. The COMMAw. format puts a comma after every three digits, so count how many commas are needed.
  3. Add those numbers to obtain the width of the string after applying the format.
  4. Use the CATS function to build the name of the format.
  5. Use the PUTN function to assign a string by applying a format to the integer.
/* to calculate the number of digits in an integer, see
   https://blogs.sas.com/content/iml/2015/08/31/digits-in-integer.html
*/
data putn_function(drop=w);
input n;
num_digits = ceil(log10(n+1));          /* n has this many digits */
num_commas = floor((num_digits-1) / 3); /* put a comma after every 3 digits */
w = num_digits + num_commas;
length fmtname $8;                      /* length of "COMMA32." */
fmtname = cats("COMMA", put(w,2.), ".");
str_n = putn(n, fmtname);
datalines;
99
1000
500000
1000000
50000000
100000000
1000000000
;
 
proc print; run;

The table shows the name of the format that is used for each integer in the data. It also shows the result of applying the format to the integer.

This example is somewhat silly and would not be used in practice. However, it shows how you can look at the value of data and make choices about the format that you want to apply. I say it is silly because the example uses the COMMAw. format where the width field (w) is chosen dynamically. In practice, you could simply use the COMMA32. format, which is the largest field width that is supported.
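
For comparison, the fixed-width alternative might look like the following sketch. The data set name (comma32_example) is made up, and the call to the STRIP function is my own addition: the PUT function right-aligns numeric values in the 32-character field, so STRIP removes the leading blanks.

```sas
/* Sketch: use the widest COMMAw. format instead of computing the width.
   The data set name is hypothetical. */
data comma32_example;
input n;
str_n = strip(put(n, COMMA32.));   /* fixed format name; remove leading blanks */
datalines;
99
1000000
1000000000
;

proc print; run;
```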

The PUT and INPUT functions in PROC IML

Here is a bit of trivia: The SAS IML language does not support the PUT or INPUT functions. However, it does support the PUTN, PUTC, INPUTN, and INPUTC functions. The reason has to do with syntax. In the SAS IML language, functions expect numeric or character vectors for their arguments. When IML encounters the expression
    t = put(1234, COMMA6.);
it tries to find an IML symbol named "COMMA6." to see what value it has. Because that symbol does not exist, you get an error such as ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string,...

However, you can use the PUTN function instead. The IML statement
    t = putn(1234, "COMMA6.");
is clear, unambiguous, and performs the same computation.
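
A slightly larger sketch shows both families of functions in an IML session. It assumes that SAS/IML software is available, and the vector x and the other symbol names are arbitrary choices for the example. As I understand it, IML applies these Base SAS functions to each element of a matrix argument:

```sas
/* Sketch: run-time formats and informats in PROC IML.
   The symbol names x, t, and d are hypothetical. */
proc iml;
x = {99, 1234, 5678901};             /* numeric column vector */
t = putn(x, "COMMA9.");              /* format name passed as a quoted string */
d = inputn("21MAR2020", "DATE9.");   /* informat name passed as a quoted string */
print t, d;
quit;
```

Because the format name is an ordinary character value, it could also be stored in an IML variable and passed to PUTN in place of the literal.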

Summary

The PUT and INPUT functions are functional equivalents of the PUT and INPUT statements. They require that you specify a literal name for a SAS format (or informat) at the time that you write the program. In contrast, the PUTN, PUTC, INPUTN, and INPUTC functions enable you to specify a format (or informat) at run time. This means that you can use values of the data to choose the best format (or informat) to apply. These run-time functions are also the ones to use in the SAS IML language, which does not support the syntax for the PUT and INPUT functions.

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
