Using variable names with special characters

3

When we wrote the sixth edition of The Little SAS Book: A Primer, one of our goals was to write it for all SAS programmers regardless of which interface they use for coding: SAS Studio, SAS Enterprise Guide, the SAS windowing environment, or batch. This is harder than it sounds. There are differences in how the SAS interfaces work. One of those differences is the default setting for the system option VALIDVARNAME=.

VALIDVARNAME= System Option 

The system option VALIDVARNAME= controls which set of rules are used for variable names. If VALIDVARNAME= is set to V7, then SAS variable names must start with a letter or underscore, and cannot contain any special characters including spaces. If VALIDVARNAME= is set to ANY, then variable names may contain special characters including spaces, and may start with any character. Either way, variable names must still be 32 or fewer characters long. To find the default value for your SAS session, submit the following and read the SAS log:

PROC OPTIONS OPTION = VALIDVARNAME;
RUN;

To set the rules for naming variables for your current SAS session, use the OPTIONS statement

OPTIONS VALIDVARNAME = value;

where value is V7 for traditional SAS naming rules, or ANY for the more liberal rules.

Name Literals 

If you are using ANY rules, then you must use the name literal form for variable names that contain spaces or special characters in your programs. Simply enclose the name in quotation marks followed by the letter N:

'variable-name'N

Example 

The following tab-delimited file contains information about camping equipment: the item name, country of origin, the online price, and the store price. Notice that some of the column headings contain spaces or special characters.

The following program sets VALIDVARNAME= equal to ANY and reads the file using PROC IMPORT. Then in a DATA step, it uses the name literal form of the variable names to subset the data using an IF statement, and it creates a new variable that is the difference between the store and online prices.

*Read data using ANY rules for variable names;
OPTIONS VALIDVARNAME = ANY;
PROC IMPORT DATAFILE = 'c:\MyRawData\CampEquip.txt'  
     OUT = campequipment_any REPLACE;
RUN;
DATA campequipment_any;
   SET campequipment_any;
   IF 'Country of Origin'N = 'USA';
   PriceDiff = 'Store$'N - 'Online$'N;
RUN;

Here is the data set CAMPEQUIPMENT_ANY. Notice the special characters and spaces in the variable names.

Item Country of Origin Online$ Store$ PriceDiff
1 8 Person Cabin Tent USA 399 399 0
2 Camp Bag USA 119 129 10
3 Ultra-light Pad USA 69 74 5

If you decide that you don’t want to use name literals, then you could choose to rename the variables so that the names conform to V7 rules. You can do this using a RENAME data set option.

Another option is to use V7 naming rules when you create the data set. If V7 rules are in place, then PROC IMPORT will convert spaces and special characters in headings to underscores when creating variable names. The following program is like the first one except with VALIDVARNAME= set to V7. Notice how now, instead of spaces and special characters, the variable names contain underscores and the name literal form of the variable name is not needed.

*Read data using V7 rules for variable names;
OPTIONS VALIDVARNAME = V7;
PROC IMPORT DATAFILE = 'c:\LSB6\Data\CampEquip.txt'  
     OUT = CampEquipment_V7 REPLACE;
RUN;
DATA CampEquipment_V7;
   SET CampEquipment_V7;
   IF Country_of_Origin = 'USA';
   PriceDiff = Store_ - Online_;
RUN;

Here is the data set CAMPEQUIPMENT_V7.

Item Country_of_Origin Online_ Store_ PriceDiff
1 8 Person Cabin Tent USA 399 399 0
2 Camp Bag USA 119 129 10
3 Ultra-light Pad USA 69 74 5

If you are reading data files (either through PROC IMPORT or the XLSX LIBNAME engine) that contain headings that include spaces or special characters, we recommend that you always specify the VALIDVARNAME= rules that you want to use in an OPTIONS statement. That way your programs will always run no matter what the default value is for VALIDVARNAME= on your system.

To find more helpful programming tips or to get started with SAS, check out our latest edition, The Little SAS Book: A Primer, Sixth Edition.

 

Share

About Author

Lora D. Delwiche

Author

Lora D. Delwiche (right) enjoys teaching people about SAS software and likes solving challenging problems using SAS. She has spent most of her career at the University of California, Davis, using SAS in support of teaching and research. She is also a one of the authors of Exercises and Projects for The Little SAS Book Fifth Edition.

3 Comments

  1. Got it

    %macro special_char(unicode_greek=, name=);
    %global &name;
    data _null_;
    A=input("&unicode_greek."x,$UCS2B4.);
    call symput("&name.",trim(left(A)));
    stop;
    run;
    %special_char(unicode=03BC,name=mu);
    %mend;

    %macro special_char(unicode_greek=, name=);
    %global &name;
    data _null_;
    A = input("&unicode_greek."x,$UCS2B4.);
    B = 'x'||trim(left(A)) ;
    call symput("&name.",B);
    stop;
    run;
    %mend
    %special_char(unicode_greek=0305,name=xbar);

Leave A Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Back to Top