Have you ever peered intently into an unfamiliar data delivery directory, realized what was in it, rolled over onto your side, stared blankly into the distance, and dejectedly uttered something akin to:
"Spreadsheets! Why did it have to be spreadsheets?"
If so, then we are definitely on the same page. Why does it always have to be spreadsheets?
The answer to that question is actually pretty obvious when you think about it. The popularity of Microsoft Office has made Excel one of the most popular mediums for storing data. It is used extensively in grade schools, middle schools, high schools, and colleges. People with home businesses use it; office administrators use it; clerical staff use it; scientists use it; lawyers use it; hospital workers use it; Federal, state and local government workers use it; and programmers use it too.
But, processing data stored in spreadsheets is not really a problem for intrepid SAS programmers. When I go on a data exploration expedition where there is a good chance of encountering spreadsheets, I pack the usual: my brown leather jacket, fedora, and bullwhip. But, most importantly, I put SAS/Access Interface to PC Files into my backpack.
SAS/Access Interface to PC Files is a SAS for Windows product that allows you to read, write, and update data in Excel and Access. As such, it is a must-have for your Windows SAS installation.
Here is an example of a program that I use to map out the contents of an unexplored spreadsheet:
ods rtf file="G:\BigProject\Worksheets in NewDataSpreadsheet.rtf"; libname xlslib "G:\NewProject\DeliveryDirectory\NewDataSpreadheet.xlsx" access=readonly; proc sql; create table WorkSheets as select distinct(compress(MEMNAME,"',$")) as WorkSheet_Name, name as ColumnName from dictionary.columns where libname = 'XLSLIB'; quit; proc print noobs data=WorkSheets; var WorkSheet_Name ColumnName; title1 "Workseets in NewDataSpreadsheet.xlsx"; run; ods rtf close;
The ODS statement specifies that my report will be created as an RTF document. Because I have SAS/Access Interface to PC Files, the LIBNAME statement allocates the NewDataSpreadheet.xlsx spreadsheet much the same way as it would for a SAS data set. (Notice that I specified access=readonly so that I do not accidentally update the spreadsheet). Since I have "LIBNAME-d" the spreadsheet, information about its worksheets and column names is now available in the SAS Dictionary Tables.
I use PROC SQL to extract the name of each worksheet (variableWorkSheet_Name) in the Excel file; and the names of the columns (variableColumnName) within each worksheet and then plop them into a SAS data set for further exploration. The code snippet compress(MEMNAME,"',$") gets rid of the annoying quotes and dollar signs that are found in spreadsheet MEMNAMEs. Then, I use the PRINT procedure to create a report. A simple, neat, quick, and easily macro-tized piece of code.
Here are several good references that you can use to find out more about processing spreadsheets with SAS:
- SAS Access product information on the SAS website
- Exchanging Data between SAS and Microsoft Excel: Tips and Techniques to Transfer and Manage Data More Efficiently, by Bill Benjamin
- Conference papers by Vince DelGobbo such as: Creating Multi-Sheet Microsoft Excel Workbooks with SAS®: The Basics and Beyond Part 1
Armed with those resources, some pluck, a sense of adventure, and with your own trusty copy of SAS/Access Interface to PC Files, you too can be a raider of the lost spreadsheet!
Best of luck in all your SAS endeavors!