I hope that the following statement is not too controversial...but here it goes: Microsoft Excel is not a database system.
That is, I know that people do use it as a database, but it's not an application that supports the rigor and discipline of managing data in the same way that "real" database systems do. Even Excel's more boring and less popular cousin, Microsoft Access, supports the discipline of database management much better than Excel.
Still, Excel spreadsheets are widely used because they are so flexible and can contain rich content, including charts and fancy formatting. These might be assets when it comes to creating compelling output, but it doesn't help a bit when your task is to analyze the data.
That's why the first step to analyzing spreadsheet data within SAS is to import the data into a SAS data set. The Import process allows you to impose the oft-needed discipline to the spreadsheet data "on the way in" to your SAS environment. For example, you can:
- Select the data sheet, or an exact range within a sheet, that you want to import
- Apply formatting rules to the data values so that they are treated appropriately (as character, numeric, date or time values, for example)
- Control the field names so that you have SAS-friendly variable names in the resulting data set
Done properly, these steps will reduce the amount of "clean up" you need to perform before you can proceed with analysis.
The Import Data task within SAS Enterprise Guide (menu: File->Import Data) can lead you through this easily. For well-organized data in spreadsheets, the default settings are usually acceptable and you can simply review them and click Finish with no tweaking. For spreadsheet data that are less organized, the level of control that the wizard offers can help a lot.
We often get questions about how the import process works within SAS Enterprise Guide. For example, does it require SAS/ACCESS to PC Files? Does it generate a SAS program that you can reuse in batch code or in a stored process? Here are the answers:
SAS Enterprise Guide does not require SAS/ACCESS to PC Files in order to import Microsoft Excel content. SAS Enterprise Guide uses local Windows components to read the data from spreadsheets and then transforms them into a text-based format that SAS can import via DATA step. (Note: the same is true when importing Microsoft Access databases, Lotus worksheets, or even Paradox files.)
This means that when the step is complete you have your data in SAS, but you don't have a SAS program that represents the entire process. SAS Enterprise Guide does some of the work behind the scenes. That's convenient if you don't have SAS/ACCESS to PC Files on your SAS server. It's less convenient if you are trying to build a standalone batch program.
SAS Enterprise Guide can use SAS/ACCESS to PC Files if you want. If you do have SAS/ACCESS to PC Files on your Windows SAS server, you can check a box in the Advanced Options of the Import Data task and get SAS Enterprise Guide to generate a working program that uses PROC IMPORT with DBMS=EXCEL. This is similar to how the Import Data wizard works in the SAS display manager environment. However, there are a few requirements:
- The Excel file must reside in the file system of the SAS server (your local PC if using local SAS, or the remote Windows system if your SAS workspace is on a different machine).
- This is supported only when connecting to SAS on Windows. It is possible to use SAS/ACCESS to PC Files on a Unix system, but it requires a PC Files Server and the DBMS=EXCELCS option, which the Import Data task does not generate for you.
If all of those planets align for you, then you can use the Import Data task to create a reusable SAS program.
Exporting SAS data in Excel format does not require or use SAS/ACCESS to PC Files. Again, SAS Enterprise Guide uses the Windows-based APIs to create XLS files (or XLSX files with this custom task), and PROC EXPORT is not used for this. If you need the step to happen inside a SAS program, you can write your own PROC EXPORT step in a program node.
I hope that this clears up some of the questions and misconceptions we encounter. But if this post doesn't do it for you, let me know in the comments.
27 Comments
I have the PC File server running with 9.2 and EG 4.3 but when I code the export using the driveletter:name.xls I get:
ERROR: Server Name is invalid or missing.
Great points. I hate dealing with pulling data from Excel. So much can go wrong, since spreadsheets have what appear to be "columns" but are really just a collection of cells sitting on top of each other (i.e. there are no column attributes, therefore there are no variables).
That said, for those of us who don't use EG, wonder if you could post on best practices for importing data from excel. There are so many different ways to do this (libname with excel engine, proc import, ODBC, save to csv then write input statements....) and each method has its own options to be learned (guessingrows etc).
I would love to find a good paper comparing/contrasting different options for reading from Excel.
The Advanced Options in Import Task is a new feature of E. Guide 4.305?
I can't find it on 4.22 or 4.3
Elizabeth, this sounds like a good topic to track with SAS Tech Support. Some ideas:
- when using DBMS=EXCELCS on Windows, there is a new Autostart feature that can be used so you don't need host/port information in your code.
- When accessing a network resource (such as a mapped drive) via a remote SAS workspace, you may need to use the UNC notation (\myserverfoldername.xls) instead of the mapped drive name.
- Also when accessing a network resource via a Windows SAS workspace, the server machine may need to be Trusted for Delegation, a setting that your Windows admin can adjust.
Chris
Quentin, thanks for the comment. That's a great idea that I'll attempt in a future post - a complete inventory of ways to use SAS to get content out of Excel and put it back. It's a long list!
Advanced Options is on the last page of the Import Data wizard. Options include:
- Embed the data within the SAS code (put the Excel content into DATALINES)
- Use SAS/ACCESS when possible (PROC IMPORT)
- Remove characters that can cause transmission errors (example: Japanese characters transmitted to a server running with LATIN1 encoding)
Oops, sorry...
I was looking for an Advanced BUTTON.
Thanks Chris, i'll be waiting for this great post.
By the way, i'm having problems using PC Files Server (Win64) to import this type of data:
Colors
992233
55FF22
110011
Because of the lack of a MIXED and GUESSINGROWS options in EXCELCS engine (importing xlsx format), i loose some data unless i manually modify column format in excel.
Maybe you can add an alternative in your future post.
Hi Chris,
Thanks for the info, very useful. I have a question, can one import pmml into SAS Enterprise Guide? I know this can be done on SAS Enterprise Miner.
Sakhi
No, SAS Enterprise Guide does not support PMML. PMML is a markup language for predictive modeling, which is the discipline supported by SAS Enterprise Miner. SAS Enterprise Guide is more of a generalist tool, although it does have integration points with SAS Enterprise Miner and SAS Forecast Server to support some specific disciplines.
I know this is an old post, but since it appears close to the top of Google searches in relation to SAS EG 5.1 and the SAS PC Files Server I thought I'd chime in here. I toyed around with the SAS PC Files Server to connect my local machine to a Base SAS windowed environment running on a remote UNIX machine and ultimately halted its use.
The SAS PC Files Server offers a solid way to import and export most data types from excel/access while also providing a number of connection options for excel/access. It should be noted that exporting either via a LIBNAME connection or export dbms=excelcs fails to capture metadata that will preserve numerical columns exceeding 15 digits in length. Excel will truncate these values, even if your data in SAS is saved as a character value. Lets not even get started on the fact that the SAS PC Files Server, the remote server and your local version of Microsoft Office (32 bit or 64 bit OS/Version) will adversely effect your files being readable upon export using the SAS PC Files Server connection.
HI I am trying to import an xlsb file into sas but it aint working. is there any way i can do it.
environment: SAS Grid unix. EG 6.1
SAS Enterprise Guide can import XLSB files with the File->Import Data task. This works within EG, but doesn't generate the PROC IMPORT code, but you can write your own code that uses DBMS=EXCEL or EXCELCS -- xlsb format is supported.
"you can write your own code that uses DBMS=EXCEL"
Great - how do you do that? I need to be able to import and export data to individual sheets in Excel and to multiple tables in MS_ACCESS and I have yet to discover any way to do that. Ideally in code, so I don't have an insane process flow diagram.
Thanks!
Neil,
My favorite way to import Excel is to by using PROC IMPORT and DBMS=XLSX. Also, PROC EXPORT and DBMS=XLSX supports the updating of individual sheets, without having to recreate the entire file.
MS Access is trickier. You can use PROC EXPORT to MS Access, but only (I think) if the bitness matches or by using PC Files Server (also look at LIBNAME PCFILES). Another method - if you're on Windows SAS -- try an ODBC connection to your MS Access database.
I want the excel to be automatically imported in SAS EG how to do that?
The easiest method to import is to use the Import Data task. If you want it to happen automatically when you open a particular SAS Enterprise Guide project, you can place that task step in a process flow that you name "Autoexec", and it will be run automatically.
After selecting Data file, I am not able to click on FINISH button. help please!!
Do you have an active SAS server? Selected an output location for the data set?
We are facing the possibility of loading a few hundred Excel files from local PC into SAS which runs on a server. Unfortunately the PC file Server was disabled in my environment. I wonder whether there is an effective way to do this job, preferable in coding. Thanks.
If your Excel files are local, I recommend using the Copy Files task (in the Data menu) to upload them into your SAS session. Then use PROC IMPORT with DBMS=XLSX (no PC Files Server needed) to read the content. You might even be able to use LIBNAME XLSX to read standard data sheets into data sets.
Where "exactly" does this whole article explain how I import multiple spreadsheets into a single program through SAS Enterprise Guide ?
This article doesn't cover the multiple spreadsheet scenario.
For multiple XLSX files, here's what I would suggest:
- Use the Copy Files task to copy the XLSX files from your local source to your SAS session (that task supports wildcards and macro variables).
- Use macro code to iterate over each XLSX file and use PROC IMPORT or LIBNAME XLSX to pull the data into SAS data sets.
Easier said than done, perhaps. But neither PROC IMPORT nor LIBNAME XLSX support filename wildcards natively.
Great article!
I have imported an excel spreadsheet via SAS EG but unfortunately the column names and formats do not match the existing SAS dataset that I would like to append it to. The spreadsheet has 50 columns. I know you can copy the name of columns and their formats to a clipboard. I was hoping there was a way in EG to paste the clip board into properties of a table in order to modify that tables columns and formats.
The only way around this, I'm afraid, is to use DATA step and/or PROC DATASETS with MODIFY to change the attributes of the data set you want to append to the original (base) -- or vice versa.
Chris,
what does the task that does upload of a SAS data set do behind the scenes?
Michael, the Upload Data Sets task? It works like this: