Tricks for importing text files in SAS Enterprise Guide

9

I'm a big fan of the Import Data task in SAS Enterprise Guide, especially for its support of text-based files (CSV, tab delimited, fixed width, and more). There's no faster method for generating SAS code that reads your data exactly the way you need it. I use the tool so often that I take for granted some of its neatest features, and I forget that many new users (and even veteran users) might not know about them. In this article, I'll review a few of the cool things that this task can do for you.

Read fixed-width text files into SAS

We think of CSV files (and...alas...Excel files) as the main standard for data exchange among systems, but many legacy systems still produce and consume fixed-width text data formats. The SAS DATA step is a perfect tool for reading these files, but defining the columns and their properties can be tedious. The "Fixed columns" option on the Import Data task can make this job simple.

Suppose that you're beginning with a spec like this:

And a raw data file like this:

You can use the Import Data wizard to define the boundaries of your columns by adding boundary lines with just click-and-drag operations. Beginning with the File->Import Data task, select your source text file and advance to the second page of the wizard. When you select "Fixed columns" as the input text format, you'll see a layout ruler that looks like this:

Click at the column boundaries (referring to your original spec!) and drag the rule lines as needed to define those column boundaries. Then click Next, and fill out details for the column names and types:

Which then tells the Import Data task how to generate the proper INPUT statements:

When you click Finish, you end up with a data set that's ready for business:

Modify the properties for multiple columns -- with one step

Here's a click-saving trick. Sometimes you have an input data file that contains many columns that share the same properties: type, length, and SAS format. It can be tedious to click and modify the properties of each column that you want to import. There's a shortcut on the Define Field Attributes page of the wizard that you can use to change the attributes for several columns at the same time. Simply SHIFT+Click to select multiple column definitions on the page, then click Modify.... The "Field Attributes for Multiple Selections" window appears, and you can change the necessary attributes just once and apply to the many items you picked.

This trick works as you import any text file or Excel file.

Create SAS program code that you can reuse anywhere

In a previous article I described how the Import Data task works "behind the scenes." Some of the magic that the task performs is not captured in SAS code, and that can present a challenge when you want to reuse this work in other settings -- for example, in a batch process or in a larger SAS program. However, with a couple of tweaks you can coerce the Import Data task into creating SAS code that you can almost just "lift and shift," as is.

The first option is hidden under the Performance window, labeled as "Bypass the data cleansing process." By default, the Import Data task reformats your input text file to normalize it for a cleaner import step. While doing no harm, most of the time this step isn't needed -- especially if your original data file is well formed. And since this step changes the input file, it's isn't repeatable outside of this task. My first tip for the best reusable code: click Performance... on the first page of the wizard, then select the "Bypass.." checkbox. That guarantees that the code will be formulated to read your original raw file. (Note that the Performance button is available only when importing text files, not Excel files.)

The second option you'll want to change is related to this, but you'll find it on the final page with the Advanced Options. Select "Generalize import step to run outside of SAS Enterprise Guide." This ensures that the task won't attempt any behind-the-scenes monkey business with your original file -- everything is captured in the DATA step that the task generates. Well, almost everything...

The one missing piece, a confounding factor when you select a local text file to import on a remote SAS Workspace session, is the transfer of the local file to the remote server. SAS Enterprise Guide copies the file for you -- behind the scenes -- and there is no SAS code to represent this step.

You can take control of even this step, though, if you make use of the Copy Files task (now available for you on the Tasks->Data menu). You can then copy the file from a local source folder, and land it wherever you want on the SAS server. Modify your newly repurposed Import Data code to pull from that server-based destination, giving you more control over the individual steps in the import process.

Learn more about importing text files

If you're new to importing data into SAS, whether using a SAS program or SAS Enterprise Guide, you might learn some of the basics from these video tutorials that were produced by SAS instructors:

Share

About Author

Chris Hemedinger

Senior Manager, SAS Online Communities

+Chris Hemedinger is the manager of SAS Online Communities. Since 1993, Chris has worked for SAS as an author, a software developer, an R&D manager and a consultant. Inexplicably, Chris is still coasting on the limited fame he earned as an author of SAS For Dummies.  He also hosts the SAS Tech Talk webcasts each year from SAS Global Forum, connecting viewers with smart people from SAS R&D and the impressive work that they do.

9 Comments

  1. This is great Chris, thank you!

    We literally began some projects this week where have an issue with a simple CSV file import; a libref that is assigned (verified libref properties are readOnly:no, Temporary:no) using the servers menu does not get recognized by the ImportData until after a base sas program is executed containing the libname does the ImportData produce a SAS datatset on the server directory assigned.

    The libname statement being used is copied from Management Console>Data Library Manager>lib-rightClick "Display Libname Statement" and it's about a simple as it gets...
    LIBNAME ImpactAP BASE "D:\Impact_APP";

    What am I missing here? Is there a option or config setting that I've overlooked?
    (Other that making the library assign on server connect via Options, but we don't need this for all projects)
    Does right clicking the library not assign the libname ?

    Or worse, am I expecting too much auto-magic?

    thanks!

  2. Excellent under the hood article and immediately useful for me. Yes, I often like to do a task in SAS Enterprise Guide first, then generalize to a SAS program often in fact in a SAS macro program loop. So often, I must import a data file at some point in my SAS program, so I make the import code in SAS Enterprise Guide. I am writing a SAS Global Forum paper about this type of step in SAS using SAS 9.2, 9.3, and 9.4 and going from 32 bit to 64 bit then mixing in Microsoft Office versions and Microsoft Windows versions. I had not thought of the changes in versions of SAS Enterprise Guide. Is the step with the "Generalize import step to run outside of SAS Enterprise Guide." check box, only in some more recent versions of SAS Enterprise Guide?

  3. Hi Chris,

    I have a question about SAS EG. Is there any way to redirect the log of SAS to a file when executing a SAS EG project?

    A second question..., about parameters:

    1) Can yo export a parameter defined in a SAS project (proj1) to another SAS project(proj2)

    2) Can you send a parameter or macrovariable of a SAS project to the LASR server of SAS VA?

    Thanks in advance

    • Chris Hemedinger
      Chris Hemedinger on

      Juan, You can use the Project Log feature in EG, and Export As a Step at the end of your flow. That should get you what you need as the record of a project run.

      Regarding parameters -- the only way to share a prompt value from one project to another would be to save the value from Project 1 (using VBS if automated, or DATA step with FILE statement to store macro var externally), and then read it in for Project 2. There is no built-in mechanism.

      And regarding sharing a value from an EG project to LASR -- I don't think so. But there is probably a way to achieve whatever it is you're trying to do. I suggest posting the question to the Visual Analytics board in SAS Support Communities.

  4. Thank you for the post! This is exactly what I need. I have another question related to this. I have 5 files that needs to be imported into Oracle. The first file has a header and the 5 files have the same data format. Is it possible to import all the 5 files at once and into the same table?

    • Chris Hemedinger
      Chris Hemedinger on

      Yes! The INFILE accepts filenames with wildcards, so if your files are all in the same folder, you can use:

      filename in '/path-to-your-folder';
       
      data want;
       /* file spec within your folder */
       infile in(*.csv);
      ...
      ;

Leave A Reply

Back to Top