I'm a big fan of the Import Data task in SAS Enterprise Guide, especially for its support of text-based files (CSV, tab-delimited, fixed-width, and more). There's no faster method for generating SAS code that reads your data exactly the way you need it. I use the tool so often that I take some of its neatest features for granted, and I forget that many new users (and even veteran users) might not know about them. In this article, I'll review a few of the cool things that this task can do for you.
(Update 23Mar2020) I've recorded a video that shows how to use the Import Data task as a start, and then adapt the code that it generates for reuse in other environments, with other files, and to read multiple text files in a single step. Watch the video to learn more, or scroll down to read my step-by-step tips.
Read fixed-width text files into SAS
We think of CSV files (and...alas...Excel files) as the main standard for data exchange among systems, but many legacy systems still produce and consume fixed-width text data formats. The SAS DATA step is a perfect tool for reading these files, but defining the columns and their properties can be tedious. The "Fixed columns" option on the Import Data task can make this job simple.
Suppose that you're beginning with a spec like this:
And a raw data file like this:
You can use the Import Data wizard to define the boundaries of your columns by adding boundary lines with just click-and-drag operations. Beginning with the File->Import Data task, select your source text file and advance to the second page of the wizard. When you select "Fixed columns" as the input text format, you'll see a layout ruler that looks like this:
Click at the column boundaries (referring to your original spec!) and drag the rule lines as needed to define those column boundaries. Then click Next, and fill out details for the column names and types:
These details tell the Import Data task how to generate the proper INPUT statements:
When you click Finish, you end up with a data set that's ready for business:
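If you'd like to see the idea in code form, here is a minimal hand-written sketch of a fixed-width read. It is not the code that the Import Data task generates, and the layout is an assumption made up for illustration (id in columns 1-5, name in 6-25, hire date in 26-33, salary in 34-41), not the spec described above. The first step writes two sample records to a temporary file so that the example runs on its own.

```sas
/* Hypothetical layout (NOT the spec from this article):
   cols 1-5 id, 6-25 name, 26-33 hire date (yyyymmdd), 34-41 salary */
filename rawfix temp;

/* Write two sample fixed-width records to a temporary file */
data _null_;
  file rawfix;
  put @1 '00101' @6 'Alice Johnson' @26 '20150304' @34 '55000.50';
  put @1 '00102' @6 'Bob Lee'       @26 '20180917' @34 '48250.00';
run;

/* Read each field from its starting column with a fixed-width informat */
data work.employees;
  infile rawfix truncover;
  input
    @1  id        5.
    @6  name      $char20.
    @26 hire_date yymmdd8.
    @34 salary    8.;
  format hire_date date9. salary comma10.2;
run;

proc print data=work.employees noobs;
run;
```

Each @n pointer plays the role of a boundary line on the layout ruler, and each informat width is the distance to the next boundary.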
Modify the properties for multiple columns -- with one step
Here's a click-saving trick. Sometimes you have an input data file that contains many columns that share the same properties: type, length, and SAS format. It can be tedious to click and modify the properties of each column that you want to import. There's a shortcut on the Define Field Attributes page of the wizard that you can use to change the attributes for several columns at the same time. Simply SHIFT+Click to select multiple column definitions on the page, then click Modify.... The "Field Attributes for Multiple Selections" window appears, and you can change the necessary attributes just once and apply them to all of the items you picked.
This trick works as you import any text file or Excel file.
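If you write the import code by hand, the closest counterpart to this trick is a variable list: one INFORMAT or FORMAT statement can cover a whole range of columns. The sketch below is only an illustration. The data set name, the column names (region, q1-q4), and the inline sample data are all assumptions, not something the task produces.

```sas
/* Hypothetical example: four numeric columns that share one informat and format */
data work.sales;
  infile datalines dsd truncover;
  length region $12;
  informat q1-q4 comma12.;    /* one statement covers q1, q2, q3, q4 */
  format   q1-q4 dollar12.2;  /* same idea for the display format    */
  input region $ q1-q4;
  datalines;
East,"1,200","1,350","1,500","1,425"
West,"980","1,010","1,100","1,250"
;
run;

proc print data=work.sales noobs;
run;
```

The q1-q4 shorthand does in code what SHIFT+Click and Modify do in the wizard: it sets the attributes once for every column in the range.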
Create SAS program code that you can reuse anywhere
In a previous article I described how the Import Data task works "behind the scenes." Some of the magic that the task performs is not captured in SAS code, and that can present a challenge when you want to reuse this work in other settings -- for example, in a batch process or in a larger SAS program. However, with a couple of tweaks you can coerce the Import Data task into creating SAS code that you can almost just "lift and shift," as is.
The first option is hidden under the Performance window, labeled as "Bypass the data cleansing process." By default, the Import Data task reformats your input text file to normalize it for a cleaner import step. This step does no harm, but most of the time it isn't needed -- especially if your original data file is well formed. And since this step changes the input file, it isn't repeatable outside of this task. My first tip for the best reusable code: click Performance... on the first page of the wizard, then select the "Bypass..." checkbox. That guarantees that the code will be formulated to read your original raw file. (Note that the Performance button is available only when importing text files, not Excel files.)
The second option you'll want to change is related to this, but you'll find it on the final page with the Advanced Options. Select "Generalize import step to run outside of SAS Enterprise Guide." This ensures that the task won't attempt any behind-the-scenes monkey business with your original file -- everything is captured in the DATA step that the task generates. Well, almost everything...
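To give a sense of what "everything is captured in the DATA step" means, here is a hand-written sketch of a self-contained import step for a CSV file. It is not the exact code that the task emits. The file contents, column names, and lengths are assumptions for illustration, and the sample file is written to a temporary location so that the example runs as-is.

```sas
/* Create a small sample CSV (hypothetical content) in a temporary file */
filename rawcsv temp;

data _null_;
  file rawcsv;
  put 'order_id,customer,order_date,amount';
  put '1001,"Smith, Jane",2020-03-01,149.95';
  put '1002,Acme Corp,2020-03-02,2300.00';
run;

/* Lengths, informats, formats, and the read itself all live in one step */
data work.orders;
  length order_id 8 customer $40 order_date 8 amount 8;
  informat order_date yymmdd10.;
  format   order_date date9. amount comma12.2;
  infile rawcsv dlm=',' dsd firstobs=2 truncover;  /* firstobs=2 skips the header row */
  input order_id customer $ order_date amount;
run;

proc print data=work.orders noobs;
run;
```

Because nothing here depends on SAS Enterprise Guide, a step like this can run in batch or inside a larger program once the FILENAME statement points at the real file.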
The one missing piece, a confounding factor when you select a local text file to import on a remote SAS Workspace session, is the transfer of the local file to the remote server. SAS Enterprise Guide copies the file for you -- behind the scenes -- and there is no SAS code to represent this step.
You can take control of even this step, though, if you make use of the Copy Files task (now available for you on the Tasks->Data menu). You can then copy the file from a local source folder, and land it wherever you want on the SAS server. Modify your newly repurposed Import Data code to pull from that server-based destination, giving you more control over the individual steps in the import process.
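Here is a rough sketch of that last modification, assuming a landing folder on the SAS server. The macro variable name (landing), the file name (orders.csv), and the columns are hypothetical. So that the sketch runs anywhere, it uses the WORK library path as a stand-in for the landing folder and writes a sample file there; in practice, the Copy Files task would deliver the file and you would set the macro variable to your real server path.

```sas
/* Stand-in for the server folder that the Copy Files task would populate.
   The WORK path is used only so that this sketch is self-contained. */
%let landing = %sysfunc(pathname(work));

/* Simulate the copied file (hypothetical content) */
data _null_;
  file "&landing/orders.csv";
  put 'order_id,amount';
  put '1001,149.95';
  put '1002,2300.00';
run;

/* The import step now reads from the server-based destination */
data work.orders_srv;
  infile "&landing/orders.csv" dlm=',' dsd firstobs=2 truncover;
  input order_id amount;
run;

proc print data=work.orders_srv noobs;
run;
```

Keeping the path in one macro variable means that the copy destination and the INFILE statement can't drift apart when you move the process to another environment.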
Learn more about importing text files
If you're new to importing data into SAS, whether using a SAS program or SAS Enterprise Guide, you might learn some of the basics from these video tutorials that were produced by SAS instructors:
- Using the Import Data task in SAS Enterprise Guide. This one shows how to import Microsoft Excel files, but the basic steps are similar in the interface.
- Read CSV files with SAS Studio
- Reading raw data with the INPUT statement. (Documentation topic)