Concatenation with speed and style

I am part of a 'virtual team from SAS Technical Support' who provide the suggestions for the Tips & Techiques section for the SAS Tech Report each month. A couple of days ago, Bill Gibson, Chief Technology Officer from SAS Australia, sent us an email saying that like him, many seasoned SAS programmers concatenate values with the combination of || and TRIM and LEFT functions. He thought it would be a good idea to create a SAS Usage Note that spotlights the faster and more elegant concatenation approaches available in SAS 9.0 and beyond.

Prior to SAS 9.0, in order to concatenate text strings in SAS, a combination of the TRIM and LEFT functions and the concatenation operator (||) had to be used. A separator, if desired, would have to be added using a quoted string.

Beginning in SAS 9.0, the CAT family of functions was introduced. These functions simplify concatenating text strings by automatically using certain combinations of these functions and the concatenation operator. The new functions are listed below.

CAT: concatenates character strings without removing leading or trailing blanks

CATS: concatenates character strings and removes leading and trailing blanks

CATT: concatenates characters strings and removes trailing blanks

CATX: concatenates character strings, removes leading and trailing blanks, and inserts separators

The table below shows a comparison of each CAT function to using the concatenation operator.

Function Equivalent Code
CAT(OF X1-X4) X1||X2||X3||X4
CATS(OF X1-X4) TRIM(LEFT(X1))||TRIM(LEFT(X2))||TRIM(LEFT(X3))||TRIM(LEFT(X4))
CATT(OF X1-X4) TRIM(X1)||TRIM(X2)||TRIM(X3)||TRIM(X4)
CATX(SP, OF X1-X4) TRIM(LEFT(X1))||SP||TRIM(LEFT(X2))||SP||TRIM(LEFT(X3))||SP||TRIM(LEFT(X4))

To see an example of the new functions, run the following code:

data test;
  input (x1-x4) ($);
  x5=' 5';
  length new1 $40 new2-new4 $10 ;
  new1=cat(of x1-x5);
  new2=cats(of x1-x5);
  new3=catt(x1,x2,x3,x4,x5);
  new4=catx(',', of x1-x5);
  keep new:;
datalines;
1 2 3 4
5 6 . 8
;
run;

proc print;
run;

There are several differences in the CAT family of functions compared to the concatenation operator for concatenating character strings. If a variable has not been previously assigned a length, the variable created with the CAT function will be assigned a length of 200. Using the concatenation operator, the length assigned is the sum of the lengths of the values being concatenated. Another difference is in how missing values are treated. When using the CATX function with a separator, missing values are ignored. When using the concatenation operator and defining a separator, if any variable has a missing value, multiple separators will appear consecutively. The sample code below illustrates this.
data test;
x1='1';
x2=' ';
x3='6';
newx=catx(',', of x1-x3);
newx1=trim(left(x1))||','||trim(left(x2))||','||trim(left(x3));
put newx=;
put newx1=;
run;


These code examples are provided as is, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.

tags: tips & techniques

5 Comments

  1. Posted May 30, 2012 at 10:10 am | Permalink

    Don't forget CATQ. Plus all the functions have call routines if you're really interested in performance. This is my favorite paper on the topic:
    http://support.sas.com/resources/papers/proceedings10/205-2010.pdf

    • Kathryn McLawhorn Kathryn McLawhorn
      Posted May 30, 2012 at 11:18 am | Permalink

      Ajay,
      Thanks for your comment and the link to the paper. I did not include the CATQ function or the equivalent CALL routines because my intent was to summarize what is in the documentation under "CAT functions". This part of the documentation addresses the more commonly used ones which are CAT, CATS, CATT, CATX.

      Kathryn

    • Art Tabachnek Arthur Tabachneck
      Posted June 1, 2012 at 9:47 am | Permalink

      I agree with Ajay .. at least regarding the catq function. I was surprised that the most recent documentation wasn't revised to compare it with the other members of the litter.

      • Kathryn McLawhorn Kathryn McLawhorn
        Posted June 4, 2012 at 7:45 am | Permalink

        I will enter a suggestion for Publications to add CATQ to future documentation. Thanks for your feedback.

  2. Posted May 30, 2012 at 10:24 am | Permalink

    Thanks for the tips. Since I have been so used to using "||" operator, it is refreshing to get a
    different perspective.

One Trackback

  1. [...] a similar function in the SAS DATA step. In the DATA step you can use the '||' operator or the "CAT" series of functions to perform string concatenation. To test the GetEqnOfLine function, I wrote loops that iterate [...]

Post a Comment

Your email is never published nor shared. Required fields are marked *

*
*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <p> <pre lang="" line="" escaped=""> <q cite=""> <strike> <strong>