I am part of a 'virtual team from SAS Technical Support' who provide the suggestions for the Tips & Techiques section for the SAS Tech Report each month. A couple of days ago, Bill Gibson, Chief Technology Officer from SAS Australia, sent us an email saying that like him, many seasoned SAS programmers concatenate values with the combination of || and TRIM and LEFT functions. He thought it would be a good idea to create a SAS Usage Note that spotlights the faster and more elegant concatenation approaches available in SAS 9.0 and beyond.
Prior to SAS 9.0, in order to concatenate text strings in SAS, a combination of the TRIM and LEFT functions and the concatenation operator (||) had to be used. A separator, if desired, would have to be added using a quoted string.
Beginning in SAS 9.0, the CAT family of functions was introduced. These functions simplify concatenating text strings by automatically using certain combinations of these functions and the concatenation operator. The new functions are listed below.
CAT: concatenates character strings without removing leading or trailing blanks
CATS: concatenates character strings and removes leading and trailing blanks
CATT: concatenates characters strings and removes trailing blanks
CATX: concatenates character strings, removes leading and trailing blanks, and inserts separators
The table below shows a comparison of each CAT function to using the concatenation operator.
Function Equivalent Code | |
---|---|
CAT(OF X1-X4) | X1||X2||X3||X4 |
CATS(OF X1-X4) | TRIM(LEFT(X1))||TRIM(LEFT(X2))||TRIM(LEFT(X3))||TRIM(LEFT(X4)) |
CATT(OF X1-X4) | TRIM(X1)||TRIM(X2)||TRIM(X3)||TRIM(X4) |
CATX(SP, OF X1-X4) | TRIM(LEFT(X1))||SP||TRIM(LEFT(X2))||SP||TRIM(LEFT(X3))||SP||TRIM(LEFT(X4)) |
To see an example of the new functions, run the following code:
data test; input (x1-x4) ($); x5=' 5'; length new1 $40 new2-new4 $10 ; new1=cat(of x1-x5); new2=cats(of x1-x5); new3=catt(x1,x2,x3,x4,x5); new4=catx(',', of x1-x5); keep new:; datalines; 1 2 3 4 5 6 . 8 ; run; proc print; run;
There are several differences in the CAT family of functions compared to the concatenation operator for concatenating character strings. If a variable has not been previously assigned a length, the variable created with the CAT function will be assigned a length of 200. Using the concatenation operator, the length assigned is the sum of the lengths of the values being concatenated. Another difference is in how missing values are treated. When using the CATX function with a separator, missing values are ignored. When using the concatenation operator and defining a separator, if any variable has a missing value, multiple separators will appear consecutively. The sample code below illustrates this.
data test; x1='1'; x2=' '; x3='6'; newx=catx(',', of x1-x3); newx1=trim(left(x1))||','||trim(left(x2))||','||trim(left(x3)); put newx=; put newx1=; run;
These code examples are provided as is, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
6 Comments
Pingback: Construct the equation of a line: An exercise in string concatenation - The DO Loop
Thanks for the tips. Since I have been so used to using "||" operator, it is refreshing to get a
different perspective.
Don't forget CATQ. Plus all the functions have call routines if you're really interested in performance. This is my favorite paper on the topic:
http://support.sas.com/resources/papers/proceedings10/205-2010.pdf
Ajay,
Thanks for your comment and the link to the paper. I did not include the CATQ function or the equivalent CALL routines because my intent was to summarize what is in the documentation under "CAT functions". This part of the documentation addresses the more commonly used ones which are CAT, CATS, CATT, CATX.
Kathryn
I agree with Ajay .. at least regarding the catq function. I was surprised that the most recent documentation wasn't revised to compare it with the other members of the litter.
I will enter a suggestion for Publications to add CATQ to future documentation. Thanks for your feedback.