Using LIBNAME XLSX to read and write Excel files

86

When you weren't watching, SAS did it again. We smuggled Yet Another Excel Engine into a SAS release.

SAS 9.4 Maintenance 2 added the XLSX engine, which allows you to read and write Microsoft Excel files as if they were data sets in a library. The big advantage of using this engine is that it accesses the XLSX file directly, and doesn't use the Microsoft data APIs as a go-between. (LIBNAME EXCEL and LIBNAME PCFILES rely on those Microsoft components.) That means that you can use this engine on Windows or Unix systems without having to worry about bitness (32-bit versus 64-bit) or setting up a separate PC Files Server process.

The XLSX engine does require a license for SAS/ACCESS to PC Files. Are you a SAS University Edition user? The SAS/ACCESS product is part of that package, so this technique works there. It's an easy way to get well-formed Excel data into your SAS process.

/* because Excel field names often have spaces */
options validvarname=any;
 
libname xl XLSX '/folders/myfolders/sas_tech_talks_15.xlsx';
 
/* discover member (DATA) names */
proc datasets lib=xl; quit;
 
libname xl CLEAR;

Example output:

xl_contents
Once the library is assigned, I can read the contents of a spreadsheet into a new SAS data set:

/* because Excel field names often have spaces */
options validvarname=any;
 
libname xl XLSX '/folders/myfolders/sas_tech_talks_15.xlsx';
 
/* read in one of the tables */
data confirmed;
  set xl.confirmed;
run;
 
libname xl CLEAR;

And here's the result in my SAS University Edition:

xl_confirmed
Sometimes you need just one value from a spreadsheet. That's a common use case for dynamic data exchange (DDE), which isn't as feasible as it once was. With the XLSX engine, you can use FIRSTOBS= and OBS= options to control how much data you retain:

/* read in just one value */
data _null_;
  set xl.confirmed (firstobs=6 obs=6 keep='Job title'n);
  call symput('VALUE','Job Title'n);
run;
%put &value;

Output:

 76         %put &value;
 Testing Manager,  Quality-driven User Experience Testing

You can also use the XLSX engine to create and update XLSX files.

libname xlout XLSX '/folders/myfolders/samples.xlsx';
 
data xlout.cars;
  set sashelp.cars;
run;
 
data xlout.classfit;
  set sashelp.classfit;
run;
 
data xlout.baseball;
  set sashelp.baseball;
run;
 
data xlout.air;
  set sashelp.air;
run;
 
libname xlout clear;

Here is my output in Microsoft Excel with all of these data sets now as sheets:

xl_xlsxout
Remember, you can also create Microsoft Excel files with Base SAS by using ODS EXCEL -- experimental in 9.4 Maintenance 2 but production in Maintenance 3 (coming soon).

The XLSX libname is different from the EXCEL and PCFILES engines in other ways. For example, the XLSX engine does not support Excel named ranges (which can surface a portion of a spreadsheet as a discrete table). Also, you won't see the familiar "$" decoration around the spreadsheet names when they are surfaced in the library within SAS. If you need that sort of flexibility, you can use PROC IMPORT to provide more control over exactly what Excel content is brought into SAS and how.

One other IMPORTANT caution: The XLSX engine is a sequential access engine in that it processes data one record after the other. The engine starts at the beginning of the file and continues in sequence to the end of the file. Some techniques to MODIFY the data in-place will not work. Also, some SAS data viewers cannot render the data from the XLSX engine. SAS VIEWTABLE and SAS Enterprise Guide and even SAS Studio can't open these tables directly in the data grid view. VIEWTABLE gives you a nice message, but SAS Enterprise Guide simply "hangs" in the attempt. For that reason, I recommend using DATA step to copy the Excel content that you want to another SAS library, then CLEAR the XLSX library to avoid accidentally opening a table in a viewer that won't support it. (This is currently a bug in SAS Enterprise Guide that should be fixed in a future release.)

I have found LIBNAME XLSX to be a quick, convenient method to bring in Excel data on any SAS platform. If you have SAS 9.4 Maintenance 2 or later, try it out! Let me know how it works for you by sharing a comment here.

Share

About Author

Chris Hemedinger

Senior Manager, SAS Online Communities

+Chris Hemedinger is the manager of SAS Online Communities. Since 1993, Chris has worked for SAS as an author, a software developer, an R&D manager and a consultant. Inexplicably, Chris is still coasting on the limited fame he earned as an author of SAS For Dummies.  He also hosts the SAS Tech Talk webcasts each year from SAS Global Forum, connecting viewers with smart people from SAS R&D and the impressive work that they do.

Related Posts

86 Comments

  1. Hi Chris,

    Would this new XLSX engine have the same old issues of having to guess which columns should be numeric, which should be character, and the length of the character field based on the first 20 rows? That seems to related to MS Jet engine, if like you said this has nothing to do with MS API, potentially it should not have that problem, right?

    Thanks

    Ya

    • Chris Hemedinger
      Chris Hemedinger on

      Ya,

      The XLSX method takes advantage of the Excel metadata that is encoded into the XLSX file format, so I think that there are fewer nuances with "guessing" column types based on values. However, there might be some other nuances and limitations. You'll have to test it with your content to see how it works for you. I think that it works best when your data are well-formed in the spreadsheet, database-like -- and not sparse mixed content like you might encounter in some Excel files.

    • Chris Hemedinger
      Chris Hemedinger on

      Matt, yes. In fact, that's the default behavior for the XLSX engine. One note: since sometimes Excel headers have spaces or special characters in the text, use OPTIONS VALIDVARNAME=ANY to tell SAS to preserve these.

  2. jaap karman on

    Nice this feature is there in 2015 after it was introduced by MS in 2007.
    That is leaving us with another question. As the xlsx file and the open-office format (OASIS eg libre office) are closely related (MS did some add-ons) is that format also supported?

    The other one is Excel is supporting an AES encryption on the whole zip (xlsx) file. Any option to specify that?

    • Chris Hemedinger
      Chris Hemedinger on

      Jaap, do I detect some sarcasm in your comment? :)

      As you know, the XLSX libname is not the first support for the XLSX format in SAS. XLSX support has been in SAS for several years, and this latest advance is another method that makes it more convenient to treat Excel content like data. I've been blogging about XLSX techniques since 2010.

      I don't think that password-protected Excel XLSX files are supported in SAS, yet. Are they widely used? We would look to our customers to tell us how important encrypted XLSX files are to them.

      And I'm not aware of any projects to support Libre Office specifically. Since Libre supports exchange with the Microsoft standard file formats, I'm not sure of the demand among our users. Again, we would look to our customers to provide that feedback. Adding ideas on http://communities.sas.com allows us and our users to review and "vote" as part of the SASWare Ballot.

      • jaap karman on

        You are right for the "some sarcasm".
        1/ Customers do not always tell the things they are really needing but instead telling things they are having some difficulties with. There is a difference that is doing root cause analyses. The feed-back with ideas it a good one but do not forget doing a root cause. I am remembering your "autoexec" EGuide post.
        http://blogs.sas.com/content/sasdummy/2011/01/10/you-asked-for-it-the-autoexec-process-flow/
        2/ The Excel (better MS office) question is having the same background. Some users are wanting that as primary interface, others have replacements not wanting to use MS. http://www.theguardian.com/technology/2014/jan/29/uk-government-plans-switch-to-open-source-from-microsoft-office-suite

        Personal I have the experience that an Excel is often getting the primary concern instead of the data-analytics area. The first and most difficult part is getting acceptance for those XLSX interface on the Unix area. They want to have it being delivered and accessed as easy as on their windows desktop. I think that would be no problem with a nice sftp browser associating the xlsx with Excel and processing it all server-side with SAS. Distributing by mail is another option. That are user functionality requests and not the same as IT-people requests for blocking that.

        The encryption will be a next step. To much to solve first with higher prio's. But at some moment there will be the functionality requirement of data having being encrypted. There is an aes option added with 9.4 for sas datasets. To agree I hope you see the same kind of conflict interest of the "users end-users" and "users ts-support and audit - decision makers" is there. To whom are you listening in those cases?

  3. The below program is not working in SAS University edition. Infile is accepting for .txt .dat .csv files except .xlsx. Can you explain the reason why?
    Data USpresidents;
    Infile "/folders/myshortcuts/Raw/Excel/USPRES.xlsx";
    Input PresidentName $ Party $ Number;
    Run;
    Proc print data=USpresidents;
    Run;
    Proc contents data=USpresidents;
    Run;

    • Chris Hemedinger
      Chris Hemedinger on

      Yes. You can't use DATA step to read an Excel file directly. You need to either import the file with PROC IMPORT, or use LIBNAME XLSX (as described in this post). Example:

      libname xl XLSX "/folders/myshortcuts/Raw/Excel/USPRES.xlsx";
      proc datasets lib=xl;
      quit;

  4. Hi,

    Just wondering what is the advantage to using this method as apposed to the PROC EXPORT function (allows .csv file export) :)

    I would expect that .csv files would allow for larger amounts of data due to the file type?

    Thanks

    B

    • Chris Hemedinger
      Chris Hemedinger on

      CSV does allow for large data and doesn't have the same restrictions as Excel, although XLSX does allow for some large spreadsheets. CSV doesn't have the same level of metadata encoded into the column types, but much of that can be inferred upon import.

      Sometimes, the requirement (from a client, from the business) is Excel output, and this helps to manage that.

  5. I noticed some unexpected behavior when the following steps are taken
    1. output a SAS dataset to an existing sheet sheet in an xlsx file using the basic method described in the post
    2. open the xlsx file and manually make changes to the data in the sheet from step 1
    3. create a new dataset from the sheet in step 2, treating it as a dataset as described in the blog post

    Reviewing the new data shows that only the records from step 1 are captured in the new dataset. Worth noting is that after step2, a .bak file is created that is a copy of the original excel file prior to being written to but deleting this has no impact on the results of performing step 3

    • Hi Chris,
      I have come back periodically over the last 5 months hoping that I would at least be told that my results can't be duplicated and therefore I must be doing something wrong. Are my steps hard to follow? I could write something more thorough if that would help. Do you have any idea what is going on? I have had several occasions when this would have been useful but I hesitated at implementing it in our process so as not to introduce unexpected and hard to trace bugs.
      Thanks,
      David

      • Chris Hemedinger
        Chris Hemedinger on

        Hi David, I'm not sure what's happening for you here. I'd suggest tracking with SAS Technical Support -- they can get you a definitive answer.

  6. Arie Oudshoorn on

    how do I specify in the 2nd example that my first row in excel contains the columnnames like the GETNAMES=yes option in proc import?

    • Chris Hemedinger
      Chris Hemedinger on

      I think that's the default behavior, at least from what I've seen. The first row is treated as column names. You might need to specify OPTIONS VALIDVARNAME=ANY in order for the names to be valid, if they contain spaces or special characters.

      • Arie Oudshoorn on

        thanks for the reply. See this:
        options validvarname=any;
        libname xl XLSX "my_excel.xlsx" ;

        data DATA_PWZ ; set xl.'DATA PWZ'n ; run;

        NOTE: The import data set has 446 observations and 91 variables.
        NOTE: There were 446 observations read from the data set XL.'DATA PWZ'n.
        NOTE: The data set WORK.DATA_PWZ has 446 observations and 91 variables.
        NOTE: DATA statement used (Total process time):
        real time 0.50 seconds

        Looks good, but a proc print shows that the columnnames are in first row :

        Obs A B C D

        1 BEGINDATUM BEGINTIJD EINDDATUM EINDTIJD
        2 41971 0.4305556 41971 0.43125
        3 41971 0.4131944 41971 0.4166667
        4 41971 0.4840278 41971 0.4847222
        5 41971 0.4604167 41971 0.4625
        6 41971 0.425 41971 0.4284722
        7 41971 0.4583333 41971 0.4604167
        8 41971 0.5236111 41971 0.5291667
        9 41971 0.4097222 41971 0.4104167
        10 41971 0.41875 41971 0.4201389

        gr, Arie

        • Chris Hemedinger
          Chris Hemedinger on

          I don't see that behavior when the column names are in the first row of the sheet. You might need to track this with SAS Technical Support, as the issue might be triggered by your particular data. Does this happen with all Excel sheets for you?

  7. Hi Chris,

    Good info. Is there any easy way to remove the spaces in the dataset names once brought in (if they exist)? In your example, the second dataset TOPIC IDEAS has spaces.

    Alan

    • Chris Hemedinger
      Chris Hemedinger on

      Alan,

      You can't automatically change the sheet names on the way in, but you could use PROC DATASETS to copy and rename the data members into another library. You could use the NVALID function to determine whether the name was valid, and then use COMPRESS to remove characters that don't comply with the SAS rules.

  8. Richard DeVenezia on

    Hi Chris:

    Is there a roadmap for XLSX engine to provide member level status to the named ranges, of either worksheet or workbook scope, or named tables ?

    • Chris Hemedinger
      Chris Hemedinger on

      Hi Richard -- good question. I'm not aware of a project for that, but it would be a good suggestion if you want to submit it to the SASWare Ballot. In the meantime you can accomplish it with PROC IMPORT.

  9. Hi Chris,
    I am having an interesting problem here, I followed the steps posted here with proc dataset but for some reason the individual sheet are not read as data member type
    /*because excel field names often have spaces*/
    options validvarname=any;
    libname xl XLSX 'ntummc\datashares\Dept\JHS\SC\Working\Stanford\Datareq\Jh1480\SNPs.xlsx';

    /*discover member (data) names */
    proc datasets lib = xl ; quit;

    libname xl clear;
    log message is WARNING: No matching members in directory.
    What is it that it cannot recognize the individuals sheets as member type?
    thanks
    Stanford

    • Chris Hemedinger
      Chris Hemedinger on

      Hi Stanford, I'm not sure why -- what are the member names (sheet names)? Have you tried a PROC IMPORT with DBMS=XLSX to see what that might reveal? You might need to add OPTIONS VALIDMEMNAME=EXTEND if the sheet names have special characters.

  10. Hi Chris,

    I am trying to determine if libname XLSX can reduce the need for a PCFIles Server in a WIndows Data / Linux SAS situation.

    The documentation for libname xlsx says "The XLSX engine enables you to read and write XLSX data directly across UNIX, Linux, and Microsoft Windows systems. If you have XLSX files stored
    on a UNIX or Linux system, you do not need to use the SAS PC Files Server (or the PCFILES LIBNAME engine) to access the Excel (.xlsx) files. If you have SAS running on UNIX or Linux and your XLSX files reside remotely on Windows, you need to use the PCFILES LIBNAME engine access them."

    However what if the files on the Windows Server are available on the Linux service via a Samba share? Will Libname XLSX work then?

    • Chris Hemedinger
      Chris Hemedinger on

      Joe,

      If the XLSX files are network-accessible from a file mount on UNIX, you should be able to work this without PC Files Server. This would be true for LIBNAME XLSX as well as the DBMS=XLSX option on PROC IMPORT and PROC EXPORT.

  11. romain miralles on

    Hi Crhis,

    Thanks for the article. I am trying to update a sheet using this method but Iget the error below. Do you know any solution?

    Can not replace sheet because it has existing formulas that reference other pages

    • Chris Hemedinger
      Chris Hemedinger on

      Romain,

      This is by design -- so that the export/update process doesn't overwrite a linked sheet and corrupt the file. You might try using LIBNAME PCFILES instead, but that would require more setup (with the PC Files Server) and I'm not sure it's possible even then. This is the sort of thing DDE is good at, but DDE relies on even more moving parts that are increasingly rare in a SAS environment (SAS and Excel together on a single Windows machine).

      • I get the same error "can not replace sheet because it has existing formulas that reference other pages" on some, but not all, of my Excel destination export sheets using the XLSX DBMS for proc export.

        It seems to affect random sheets, but consistently affect the same sheets. I've checked several times that the sheets do not have formulas embedded. Their cells are referenced from other sheets within the workbook, but this is true also for other destination sheets where the proc export executes without fail.

        Even if I replace the sheet the error will subsequently occur. Nothing apparently funny about the sheet names associated with this error either.

        I have 9.4 EG (64-bit) and Office 2010 (32-bit)

        • Chris Hemedinger
          Chris Hemedinger on

          I think the message is issued if an existing sheet has any reference to other content in the workbook (such as a link to another sheet). In the XLSX file, there is some metadata of sheet references called a "calculation chain", and if there is any reference from your target sheet, the XLSX method errs on the side of caution and won't overwrite. If you think this is NOT the case for your workbook, you should open a case with SAS Technical Support with the details.

          • Thanks, Chris. I think our company may have to open a ticket.

            What's weird about you described as the references problem is that I can "fix" it by doing the following. Let's say there are two sheets in the Excel book (A and B). The proc export writes to sheet B and sheet A has formulas with cell references to sheet B. No formulas on sheet B. If the proc export fails with this error (above), I can create a new sheet "Bx" and successfully use a proc export to sheet Bx. Then change the formulas on sheet A to reference cells on sheet B2 instead, then the proc export will continue to work on subsequent attempts.

            It can't be the existence of formulas pointing to an export sheet, but something as you say about Excel's underlying metadata for a particular sheet that comes into conflict with SAS in circumstances I have yet to pinpoint.

          • Fabio Fonseca on

            Hi Chris,

            I have a similar issue. Would you know if a pivot table built with a data source coming from a proc export in SAS is considered a reference? I'm not using any formula, but rather a pivot table. What's it is odd is that even when I remove all the pivot tables associated with the Excel table, the export just won't work.
            It basically seems quite random why it works or not, since sometimes it does, but I'm losing hope in finding the solution for this one. I need consistency to run the report automatically, can't just expect it works

  12. Pingback: Your favorite SAS posts from 2015 - SAS Voices

    • Chris Hemedinger
      Chris Hemedinger on

      Michelle,

      LIBNAME XLSX is a sequential engine, which has some limitations. From the doc:

      The XLSX engine allows the sequential reading of data only. That is, it does not support random access. It does not support certain tasks that require random access, such as the RANK procedure, which requires the reading of rows in a random order.

      I'm not sure which data set options are specific to random-access engines, but I guess the limitations are similar to those for the XML engines.

  13. I'm afraid I've run into the gotcha as was noted in the original post when trying to "view" the xlsx sheet while in EG. The first, I could find no way to interrupt the hung EG session except to End the process from Windows Task Manager. I was 95% confident the .EGP would be autosaved, so I re-entered EG and chose the Autosave to the .EGP's original location (on Linux btw). I was also prompted to save an autocall macro to it's original location, which I did. From there, I noticed the .EGP file disappeared from the folder directory as I viewed it on Linux. The entire .EGP was lost and unfortunately our IT is not doing backups of SAS-related files in this filesystem at a new data center.

    I tried to recapture the EGP from the location assigned as the Project Recovery area on Windows, but the only thing saved there, was the last copy of a program I had last executed, nothing else from the EGP.

    I have not been able to find reference to this known bug so any help like what is the safest way to interrupt EG at this point would be greatly appreciated. We are runnnig this version of EG:
    7.11 HF3 (7.100.1.2805) (64-bit)

    Regards,
    Marty Light

    • Chris Hemedinger
      Chris Hemedinger on

      Hi Marty,

      Sorry to hear that you lost work with this hangup. The Autosave location for EG is usually something like "%appdata%\SAS\EnterpriseGuide\7.1\ProjectRecovery" on you local machine, so be sure to check there for historical versions of you project file.

      I don't know of a way to interrupt EG safely, other than killing the process, when you get this particular hang. I know it's a defect that the EG developers have on their list to fix (I entered it, but I wasn't the only one). In the meantime, EG programmers need to be careful about direct access using this engine.

  14. tarun chitkara on

    How can i make other than 1th row as a variable row.Example-->I have an excel file where i want to make 4th row as a header row while importing because first 3 rows has redundant data. My data start from 6th row as 5th row also has redundant data.

    I knew the option as name row in proc import for xls file but what option to use for xlsx file while importing from unix and windows environment.

    • Chris Hemedinger
      Chris Hemedinger on

      You can use DBMS=XLSX in a PROC IMPORT statement, and you should be able to achieve the same effect with STARTROW and DATAROW options.

  15. Hi Chris,
    Does this approach allow to export variable labels instead of names?
    data xlout.CMMC2014;
    set work.hosp;
    If Hospital eq 'CMMC' & Match in ('E','Y');
    run;

    • Chris Hemedinger
      Chris Hemedinger on

      Stanford,

      I think an approach that might work for you is: use ODS EXCEL, and PROC PRINT with the LABEL option.

      ods excel file="c:\temp\CMMC2014.xlsx"; proc print data=work.hosp (where=(Hospital eq 'CMMC' AND Match in ('E','Y'))) label noobs; run; ods excel close;

      • Many thanks Chris, I've to say I have saved your page on my favorites and do visit frequently, you're doing a great job in helping the intermediate programmers such as myself,

  16. Thanks for this article. Super helpful. Quick question, do you know if it's possible to start writing data from the second row instead of the first? Thanks!

    • Chris Hemedinger
      Chris Hemedinger on

      Do you mean on the second row of the sheet -- or beginning with the second row of data? For the latter, use FIRSTOBS or similar:
      libname t xlsx "c:\temp\t.xlsx"; data t.a; set sashelp.class(firstobs=2); run; libname t clear;
      For the former -- no, I don't think you have that control with LIBNAME XLSX. Check out ODS EXCEL though -- it has lots of options for placement and formatting.

  17. When i try to use xlsx engine and create a xlsx file from a sas dataset it gets created and at the same time it creates define name which has the range of the sheet.

    It can bee seen in excel formulas in excel:
    Below is th code used to create xlsx file.

    libname xlout xlsx"c:\temp\define.xlsx" ;

    data xlout.air_help;
    set sashelp.air;
    run;

    Now when i try to modify the sheet manually created in the above step by adding the more rows and columns and save it and try to convert this modified xlsx back to SAS dataset by referring the to the libname assigned using xlsx engine i do not get the updated data as per the updated xlsx sheet which i modified after xlsx was created by sas using xlsx engine in the above step.

    data air;
    set xlout.air_help;
    run;

    It gives me 144 observations and 2 variables where as i added one more column in the sheet manually after it got created in first step.

    Ideally it should have shown 144 obs and 3 variables.

    Where as proc import works fine on the modified sheet.

    Is this a defect in xlsx engine.

    • Chris Hemedinger
      Chris Hemedinger on

      It might be a defect, or just a nuance. That's open to interpretation...

      The process of creating the sheet "air_help" creates an Excel named range with the same name, and that's what shows even after you make changes to the workbook. To workaround this: when you edit the workbook to add new rows/columns, be sure to also visit the Formulas->Name Manager menu. DELETE the named range ("air_help") that you see listed there. This affects only the range definition -- it won't affect the data. Save the workbook and then SAS will be able to see the full sheet again, not just that predefined range.

      • Edit a formula would do but if we do all this using proc import then formula does not need to be edited in the scenario stated. This makes me think it is a defect .and SAS user might not realize it .

  18. Hi Chris,
    I tried to combine multiple excels - It worked . but the combined excel does not have the formats(colour ,font) etc ? Do we have any way to preserve the existing formats?

    Thanks
    Teena.

    • Chris Hemedinger
      Chris Hemedinger on

      Not with LIBNAME XLSX, which is purely a data exchange (no formatting). For colors and formatted coming from SAS, you need to use ODS EXCEL as a destination. To merge with an existing Excel file, you might need to use LIBNAME PCFILES (which treats Excel more like a database) or else the SAS Add-In for Microsoft Office.

  19. When I use the proc datasets lib=[excel workbook libname] to give the list of sheets within the Excel workbook, sometimes it does not return any of the names, even though I have used the option validmemname=extend. Examples of the sheet names within the file are 'FRN 2853951 - IA' (quotes mine). In other examples such as 'test-123-456' (no spaces) show up just fine in the datasets statement. So I'm wondering if the spaces are the problem. Without the ability to do proc datasets statement I don't see how the Excel files can be opened in SAS... Any way around the not recognizing sheet names, do you know? Thanks.

    • Chris Hemedinger
      Chris Hemedinger on

      This should work, and it does in my test cases. Another approach you might try (assuming a libname of "xLIB"):

      proc sql;
      select * from sashelp.vmember where libname="XLIB";
      quit;

      If it's not working, I suggest reaching out to SAS Tech Support.

  20. When i try to read the xlsx file using proc import in SAS grid 9.4 M3 using dbms=xlsx i get the below error:

    ERROR: An exception has been encountered.
    Please contact technical support and provide them with the following traceback information:

    The SAS task name is [IMPORT (]
    Segmentation Violation

    Traceback of the Exception:

    ERROR: An exception has been encountered.

    But when i try to run proc import in PC sas 9.2 with dbms =excel option it runs fine without error.

    I am not able to figure out whats going on

    • Chris Hemedinger
      Chris Hemedinger on

      That's most likely a bug in the XLSX method for PROC IMPORT -- I recommend you contact SAS Technical Support to see if there is a fix available. Anything unusual about the file? Very large column names or other strange attributes?

    • Chris Hemedinger
      Chris Hemedinger on

      DBMS=XLSX is the best way if you can get it to work. Otherwise you could try LIBNAME XLSX. Work with SAS Tech Support if you continue to see errors.

  21. Hi Chris,

    It is possible to capture the order of the worksheets in the Excel file?

    Datasets/contents/dictionary tables all appear to list the worksheets in alphabetical order.

    Thanks,
    Daniel.

    • Chris Hemedinger
      Chris Hemedinger on

      Daniel, I guess the order is not preserved with LIBNAME XLSX, since "table sequence" is not really a concept for most SAS libraries. It might work with something like LIBNAME PCFILES which uses Microsoft APIs to read the Excel file. It would be nice to have some sort of ordinal metadata available.

  22. I have a xls file which has 2 diff column which has values as Chinese character and numeric value i am trying to import using proc import
    proc import datafile="path/china.xls"
    out=tp dbms=XLS replace; sheet=sheet1; GETNAMES=YES;
    run;

    It imports successfully without any error in the log but when i see the sas dataset the column which has Chinese character the column with numeric values have ?? and -e in place of Chinese character displayed

    When i save the xls file as xlsx and run the below code it runs pecfectlt fine with correct values dispalyed

    Note I am running the above code in SAS GRID using EG and has utf 8 encoding set at server level
    proc import datafile="path/china.xlsx"
    out=tp dbms=XLSX replace; sheet=sheet1; GETNAMES=YES;
    run;

    Not able to figure out whats wrong with this file

    • Chris Hemedinger
      Chris Hemedinger on

      DBMS=XLS is a legacy method that does not support the most modern Excel formats. You can use DBMS=EXCELCS if you have a PC Files Server set up, or DBMS=XLSX (as you tried with success).

  23. Hi Chris,
    Your post is very useful! Thank you!
    I also have a few questions on using LIBNAME to read excel data.

    1. Can you say in the code that you just want to read sheet1, regardless of what the name of sheet1 is?
    2. If not, how would you read the data sheet that has a space in the end, such as "Rater_1 " (there is a space after 1)?
    3. How would you read the data sheet whose name has a space in between two words, such as "Rater 1"? I tried the following, but it didn't work.

    OPTIONS VALIDVARNAME = ANY;
    LIBNAME MYDATA XLSX "&DATA_PATH.\&DATA_NAME..xlsx";
    DATA RAWDAT;
    SET MYDATA.'Rater 1$'n;
    RUN;
    LIBNAME MYDATA CLEAR;

    Thank you in advance for your help!

    • Chris Hemedinger
      Chris Hemedinger on

      I find that if you run PROC DATASETS with LIB= pointing to the XLSX library, you'll learn the sheet names as that engine can reference them. It's slightly different than LIBNAME EXCEL or PCFILES.

      • Thank you Chris. I'll check that out :)

        In your 3rd sheet, "YOUTUBE METADATA", how would you read that with a space between the two words? Would you please show an example?

        Thank you!!!

        Ji

        • Chris Hemedinger
          Chris Hemedinger on

          For such names, use the "literal" notation:
          /* read in one of the tables */ data yt; set xl.'YOUTUBE METADATA'n; run; libname xl CLEAR;

  24. Hi,

    This was very helpful. I have an existing program that uses proc sql to import excel files. This article was timely as I had a request where I had to update my program with a small excel file. So, I thought, what a better way to try this new method. It worked to import that one excel file but the proc sql statments follwoing all failed. When I remove the new method of import, my program ran just fine, once again. I find this very odd if the xl libname is cleared.

    Any insight?

    Thanks
    Michelle

    • Chris Hemedinger
      Chris Hemedinger on

      Are you trying to use the Excel data directly from the XLSX libname as a table in your query? If so, I would instead use a DATA step to copy the table to a work data set, then use that in your PROC SQL step.

      LIBNAME XLSX is a convenient way to pull data into SAS, but I think there are limitations for how you can use it directly from the library "in situe".

      • Using SAS 9.4 TS1M4, using HEADER=NO in the LIBNAME statement with the XLSX engine gives an invalid option error. Is there any way using LIBNAME XLSX ... to prevent SAS from assuming the first row is a header row?

        • Chris Hemedinger
          Chris Hemedinger on

          I don't think this option is supported in the XLSX engine. For that, you might need to resort to the PCFILES or EXCEL engine (which uses native APIs or the PC File Server).

    • Chris Hemedinger
      Chris Hemedinger on

      Good question. I think that in general it works okay, because the XLSX file has a little bit of metadata about its fields and SAS should pick that up. However, it's always a good idea to test the edge cases if you have some Excel files with very long cell values.

  25. Hello Chris,
    I was trying to run the following code to create multiple sheets within the Excel file. (My actual code has multiple conditions and creates around 30-40 different datasets in a single DATA SET query)
    I tried with the sample code to see if I can create 'N' number of sheets. But it is throwing me an error about file-lock. Can you please let me know how to rectify this?
    Mine is an UNIX server.
    options validvarname=any;
    libname dsn xlsx ".../Tes.xlsx" ;

    data dsn.class dsn.class2 dsn.class3;
    set sashelp.class;
    if Sex='M' then output dsn.class;
    if Sex='F' then output dsn.class2;
    if Age=12 then output dsn.class3;
    run;

    libname _all_ clear;

    ERROR: Resource is write-locked by another thread. File =..../Tes.1.<br />
Error creating temporary file for XLSX file -> .../Tes.1 . It is either not an Excel spreadsheet or
    it is damaged. Error code=80001019

    • Chris Hemedinger
      Chris Hemedinger on

      This looks like a bug in the XLSX engine. I suggest opening a track with SAS Technical Support. Workarounds: use PROC EXPORT to export multiple sheets, or multiple DATA steps. (Not ideal, I know.)

Leave A Reply

Back to Top