Manfred Kiefer is a Globalization Specialist for SAS and the author of SAS Encoding: Understanding the Details. This week's tip is from his new book. In a review, Edwin Hart said "This book provides a very readable description of a topic that has long needed exposure: Why do my characters get garbled on the computer, and how do I fix the problem?" Kiefer's featured tip provides a good start.
The following excerpt is from SAS Press author Manfred Kiefer and his book "SAS Encoding: Understanding the Details." Copyright © 2012, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. (Please note that results may vary depending on your version of SAS software.)
Encoding of External Files
The FILE, FILENAME, and INFILE statements support the ENCODING= option, which enables users to dynamically change the encoding for processing external data.
By default, SAS reads and writes external files using the current session encoding. In other words, the system assumes that the external file is in the same encoding as the session. For example, if your session encoding is UTF-8 and you are creating a new SAS data set by reading an external file, SAS assumes that the file's encoding is also UTF-8. If it is not, the data could be written to the new SAS data set incorrectly unless you specify an appropriate ENCODING= option. Here is an example:
filename in 'external-file' encoding='Shift-JIS';

data mylib.contacts;
   infile in;
   length name $ 30 first $ 30 street $ 60 zip $ 10 city $ 30;
   input name first street zip city;
run;
Because the FILENAME statement declares the file as Shift-JIS, SAS transcodes the data from Shift-JIS to the session encoding (UTF-8 in this example) as it reads the file.
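Because the default behavior depends on the session encoding, it can help to confirm that encoding before reading a file. The following is a minimal sketch, not part of the original excerpt. It uses the SYSENCODING automatic macro variable and PROC OPTIONS, and the values reported depend on how your SAS session was started.

```sas
/* Write the session encoding to the log (for example, utf-8 or wlatin1) */
%put NOTE: Session encoding is &sysencoding;

/* Show the ENCODING= system option as it is set for this session */
proc options option=encoding;
run;
```

If the value reported matches the encoding of the external file, no ENCODING= option is needed on the FILENAME statement.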
Likewise, when you write data to an external file, the data is written out in the session encoding unless you specify an appropriate ENCODING= option. In the following example, we first create a subset of our contacts data, and then write the output to an external file with an encoding of Shift-JIS:
/* Create a subset with Japanese data from the main table */
proc sql;
   create table japan as
   select * from mylib.contacts
   where country_e = 'Japan';
quit;
/* Write the output to an external file with an appropriate encoding */
filename out 'external-file' encoding='Shift-JIS';

data _null_;
   set WORK.japan;
   file out;
   put @1 name @31 first @62 street @133 zip @144 city;
run;
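The examples above set the encoding on the FILENAME statement. Because the INFILE and FILE statements accept the same option, a single DATA step can also transcode a file from one encoding to another. The sketch below is an illustration rather than part of the original excerpt: 'input-file' and 'output-file' are placeholder paths, and the LRECL= value is an assumption chosen to allow long records.

```sas
/* Copy a Shift-JIS text file to a UTF-8 text file, record by record */
data _null_;
   /* Declare the encoding of the file being read */
   infile 'input-file' encoding='shift-jis' lrecl=32767;
   /* Declare the encoding of the file being written */
   file 'output-file' encoding='utf-8' lrecl=32767;
   /* Read the whole record into the input buffer without parsing fields */
   input;
   /* Write the buffer back out; SAS transcodes it on the way */
   put _infile_;
run;
```

This approach suits files that contain only character data. Records are transcoded as a whole, so it is not appropriate when some fields hold binary values.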
When an external file contains a mix of character and binary data, you must use the KCVT function to convert individual fields from the file encoding to the session encoding. See the SAS National Language Support (NLS): Reference Guide for details on using the KCVT function.
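As a rough illustration of field-level conversion, the sketch below applies KCVT to a single character value. It is a simplified assumption-laden example, not the book's code: the variable names raw and converted are hypothetical, the hex literal is assumed to hold the Shift-JIS bytes for two kanji, and the encoding names passed to KCVT should be checked against the NLS reference for your release.

```sas
data _null_;
   /* raw holds bytes in the file encoding; converted holds the       */
   /* transcoded value, sized larger because UTF-8 can need more bytes */
   length raw $ 4 converted $ 12;

   /* Assumed Shift-JIS byte sequence for a two-character kanji string */
   raw = '93FA967B'x;

   /* KCVT(text, intype, outtype): convert from Shift-JIS to UTF-8 */
   converted = kcvt(raw, 'shift-jis', 'utf-8');

   /* Print both values as hex so the byte-level change is visible in the log */
   put raw= $hex8. converted= $hex24.;
run;
```

In a real mixed file, only the character fields would be passed through KCVT, and binary fields would be read with the appropriate binary informats and left untouched.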