Using Chinese characters as labels on SAS Maps

When doing business in a global economy, have you ever wanted to show Chinese (or Korean, or Japanese) labels on a map? If so, then this blog post is for you!

Before we get started, here is a photo of some Chinese characters to get you into the mood. This is a photo my dragon boating friend David took, while he was visiting China.

And how do you get Chinese characters like that in your SAS graphs and maps? There's probably more than one way, but I'm going to focus on the technique described in tech support sample 56546.

With the new GfK geographical maps that we ship as part of SAS/GRAPH, each map dataset also has a similarly named attribute dataset containing the text names of all the map areas. For example, China's map dataset is mapsgfk.china, and the attribute dataset is mapsgfk.china_attr. In the attr dataset, we have the province code (id1), the province name in English (id1name), and the province name in Unicode escape representation (id1nameu), which represents the Chinese characters. They don't look like Chinese characters yet, but we'll get there shortly!

Those funky-looking characters and numbers in the id1nameu variable are the Unicode escape (UESC) representation of the Chinese characters. We can use the SAS $UESCw. informat to read this character string and convert it to the Chinese characters, using code like: text=input(id1nameu, $uesc500.);
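Here is a minimal sketch of that step, not the exact code from the tech support sample. It assumes a UTF-8 SAS session and that the escaped values use the \uXXXX form; the output dataset name work.china_names and the variable name text are illustrative choices of mine.

```sas
/* Quick check on a literal: \u4E2D\u56FD is the escaped form of 中国 */
data _null_;
   length text $500;
   text = input('\u4E2D\u56FD', $uesc500.);
   put text=;
run;

/* Decode the province names from the attribute dataset.      */
/* work.china_names and text are illustrative names only.     */
data work.china_names;
   set mapsgfk.china_attr (keep=id1 id1name id1nameu);
   length text $500;
   text = input(id1nameu, $uesc500.);   /* UESC -> session encoding */
run;
```

The generous $500 length leaves room for multi-byte characters, since each Chinese character takes more than one byte in UTF-8.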

One important detail is that you must be running a SAS session with UTF-8 encoding. I didn't want to change this permanently in my SAS installation (by modifying my sasv9.cfg file), so I used the command line -config option to temporarily run SAS using the 'nls\u8\sasv9.cfg' file. I'm an old-school SAS user, so I run my SAS jobs in batch mode from the DOS command line (betcha didn't know that about me!), and instead of typing in the long command line every time, I wrote a little DOS batch job to run it.

That might have seemed like a lot of technical details, but there are really only 2 important things to remember:

  • Run a UTF-8 SAS session.
  • Use the $UESCw. informat to process the Unicode escaped characters.
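If you're not sure whether your session picked up the UTF-8 configuration, a quick check along these lines should tell you. This is just a sketch using the SYSENCODING automatic macro variable and PROC OPTIONS; it is not part of the original sample code.

```sas
/* Write the session encoding to the log; a UTF-8 session reports utf-8 */
%put NOTE: Session encoding is &sysencoding;

/* Show the current value of the ENCODING system option in the log */
proc options option=encoding;
run;
```

If the log shows something other than UTF-8 (wlatin1, for example), the $UESCw. informat may not be able to represent the Chinese characters in the session encoding.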

You've now got the Chinese characters in a text string, via something like text=input(id1nameu, $uesc500.), and you can use them just about any way you would use an English text string. Here's my map of China, with annotated province labels in both English and Chinese characters:

I invite you to click the above map to view the interactive version, with HTML mouse-over text and drill-downs. Note that the Chinese characters even show up in the HTML mouse-over text in the web browser (I've tried the mouse-over text successfully using both Google Chrome and Internet Explorer on my Windows PC).

Note that you can also see the Chinese characters in the text string when you Proc Print the table, or view it using the table viewer in DMS SAS. Here's a screen-capture showing both of those:

 

And what good is an example if you can't re-use it with other data, eh? To demonstrate that this code and technique are flexible, I recycled my China code and created a South Korea map (using mapsgfk.south_korea and south_korea_attr). Click the image below to see the interactive version with mouse-over text and drill-downs.
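One way to make the name-decoding step reusable across countries is to wrap it in a small macro. The sketch below is my own illustration rather than the linked code: the macro name decode_names, its parameters, and the output dataset names are hypothetical, and it assumes each attribute dataset has the id1, id1name, and id1nameu variables described earlier.

```sas
/* Hypothetical helper: one row per id1, with the decoded name in text */
%macro decode_names(attr=, out=);
   /* Keep one row per first-level area code */
   proc sort data=&attr (keep=id1 id1name id1nameu)
             out=&out nodupkey;
      by id1;
   run;

   /* Convert the Unicode escapes to characters in the session encoding */
   data &out;
      set &out;
      length text $500;
      text = input(id1nameu, $uesc500.);
   run;
%mend decode_names;

%decode_names(attr=mapsgfk.china_attr,       out=work.china_names);
%decode_names(attr=mapsgfk.south_korea_attr, out=work.korea_names);
```

As with everything else here, this needs to run in a UTF-8 SAS session.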

And here's one last "pro tip" ... If you're creating graphs & maps with labels in a language that you're not an expert/native speaker in, try to find someone who is an expert/native speaker to give it a sanity-check. While the characters might look correct to your untrained eye, a native speaker might notice subtle problems in the characters that could make a not-so-subtle difference in the meaning. Here's another picture from my friend David's China trip - are these really 'Friend Tomatoes' or are they possibly 'Fried Tomatoes'? ... I guess they'll be yummy either way!

 

I hope you enjoyed this blog post, and learned a few new tricks - especially if you're an old dog! Note that I'm not an expert in this area (actually, this is the first time I've tried it), so I might not be able to answer many questions on this topic ... but I'm happy to share my code with you, which should be a pretty good starting place for you to experiment with these techniques. Here are links to my code: China and Korea.


About Author

Robert Allison

The Graph Guy!

Robert has worked at SAS for over 25 years, and is perhaps the foremost expert in creating custom graphs using SAS/GRAPH. His educational background is in Computer Science, and he holds a BS, MS, and PhD from NC State University. He is the author of several conference papers, has won a few graphic competitions, and has written a book (SAS/GRAPH: Beyond the Basics).

8 Comments

  1. Great tip! I always wondered what the UTF8 thing was in the right click batch submit menu. Old dogs CAN learn new things. Now the question is, does this extend to Klingon or Elvish characters? ;)

  2. Thanks Robert. Awesome post. With this Unicode column, one can present in any language!!!
    ...and thank you to the NLS group for making this possible.
