SAS author's tip: General items to watch for when transferring data

This week's SAS tip is from Carol Matthews and Brian Shilling's book Validating Clinical Trial Data Reporting with SAS. Written for SAS programmers, this engaging guide contains many hands-on tips--including the featured excerpt below.

The following excerpt is from SAS Press authors Carol Matthews and Brian Shilling's book "Validating Clinical Trial Data Reporting with SAS." Copyright © 2008, SAS Institute Inc., Cary, North Carolina, USA. ALL RIGHTS RESERVED. (Please note that results may vary depending on your version of SAS software.)

5.4 General Items to Watch For When Transferring Data

As you gain more experience with the import and export of data, you will learn what types of issues to look for when checking the data resulting from any import or export processing you perform. Issues surrounding some of the more common data types encountered with data transfers are discussed in more detail later. The following are a few of the more common issues you will need to consider, regardless of file type and whether data is being imported or exported:

  • Date issues: make sure all dates are imported or exported correctly. How much attention this needs depends on the file type you are working with, but always confirm that date values arrive as true dates (for example, not as character text or unformatted numbers) and that they match the original values.
  • Data types: by default, SAS reviews only 20 rows when deciding what type of data is in a column. When importing data from a file type other than SAS, if you let SAS decide on the data type stored in a column, make sure that all the data is of the same type (i.e., no mixed data types). If you see missing values in a numeric variable, make sure the data was truly missing and not a character value that got lost during the import process. Consider using the GUESSINGROWS statement in PROC IMPORT (available for delimited files) so that SAS will check a larger number of values to identify the type. When exporting data, make sure the data type meets any specifications that you are working from (the SAS data sets may have stored the data as a numeric variable, but you may need to export that same data as a character field).
  • Truncation: by default, SAS reviews only 20 rows when deciding the length needed for a character variable. Review text values to confirm that none are truncated (find the longest value in the imported file for each text variable and check those records in your new data). If you specify variable widths when exporting data, make sure those widths are wide enough to accommodate the longest text value in the data.
  • File content: if you are “manually” importing or exporting data (via a program rather than a wizard), make sure all of the variables from the original file are included in the new one. Also, if you import or export the data more than once with the same program, make sure the structure of the original data hasn’t changed (variables weren’t added or removed that your program doesn’t account for).
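
As a rough illustration of the data type and truncation checks above, the following sketch imports a delimited file with a larger GUESSINGROWS value and then reviews what SAS created. It is not taken from the book: the file name labs.csv, the data set work.labs_raw, and the variable result_text are hypothetical, and the GUESSINGROWS limit varies by SAS version.

```sas
/* Hypothetical file, data set, and variable names, for illustration only. */
proc import datafile="labs.csv"     /* assumed delimited source file */
            out=work.labs_raw
            dbms=csv
            replace;
    getnames=yes;
    guessingrows=5000;              /* scan more than the default 20 rows */
run;

/* Review the type and length assigned to each imported variable. */
proc contents data=work.labs_raw varnum;
run;

/* Count missing values in the numeric variables; unexpected missing
   values may be character data that was lost during the import. */
proc means data=work.labs_raw n nmiss;
run;

/* Longest stored value of one character variable (assumed name),
   to compare with the longest value in the source file. */
proc sql;
    select max(length(result_text)) as max_len_result_text
    from work.labs_raw;
quit;
```

If the longest stored length equals the variable's defined length, the records with the longest source values are good candidates for checking by hand for truncation.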

Regardless of file type or whether data is being imported or exported, the validation process should always have two basic goals:

  1. Ensure that the resulting data accurately represents the original data.
  2. Ensure that the resulting data matches the specifications provided.

How those two goals are accomplished will vary depending on the file type and whether the data is being imported or exported.
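
One possible way to approach both goals for an imported file is sketched below. It assumes that a second, independently created copy of the data exists (work.labs_qc) and that subjid and visit uniquely identify a record. These names are hypothetical and are not from the excerpt.

```sas
/* Assumed names: work.labs_raw is the transferred data, work.labs_qc is an
   independent re-read of the same source, and subjid and visit are key variables. */
proc sort data=work.labs_raw; by subjid visit; run;
proc sort data=work.labs_qc;  by subjid visit; run;

/* Goal 1: compare the two copies record by record. LISTALL also reports
   variables and observations that appear in only one data set. */
proc compare base=work.labs_raw compare=work.labs_qc listall;
    id subjid visit;
run;

/* Goal 2: capture variable attributes in a data set so they can be
   checked against the specifications (TYPE: 1=numeric, 2=character). */
proc contents data=work.labs_raw
              out=work.labs_attrs(keep=name type length format)
              noprint;
run;
```

The PROC COMPARE output and the attribute listing can be saved with the validation documentation as a record of what was checked.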

In all cases, it is important to document the process followed and the results of all validation efforts.

About Author

Shelly Goodin

Social Media Specialist, SAS Publications

Shelly Goodin is SAS Publications' social media marketer and the editor of "SAS Publishing News". She’s worked in the publishing industry for over thirteen years, including seven years at SAS, and enjoys creating opportunities for fans of SAS and JMP software to get to know SAS Publications' many offerings and authors.
