Using FILENAME ZIP and FINFO to list the details in your ZIP files

It's time to share another tip about working with ZIP files in SAS. Since I first wrote about using FILENAME ZIP to list and extract files from a ZIP archive, readers have been asking for more. Specifically, they want additional details about the files that are contained in a ZIP, including the original file datetime stamps, file size, and compressed size. Thanks to a feature that was quietly added in SAS 9.4 Maintenance 3, you can use the FINFO function to retrieve these details. In this article, I share a SAS macro program that does the job.

Here's an abridged example of the output. If you need to create something like this without the use of external ZIP tools like 7-Zip or WinZip (which are often unavailable in controlled environments), read on.

[Image: FILENAME ZIP output]

You can download the full program from my public gist on GitHub: zipfiles_list_details.sas

ZIPpy details: a solution in three macros

Here's my basic approach to this problem:

  • First, create a list of all of the ZIP files in a directory and all of the file "members" that are compressed within. I've already shared this technique in a previous article. Like an efficient (or lazy) programmer, I'm just reusing that work. That's macro routine #1 (%listZipContents).
  • With this list in hand, iterate through each ZIP file member, "open" the file with FOPEN, and gather all of the available file attributes with FINFO. I've divided this into two macros for readability. %getZipMemberInfo (macro routine #2) retrieves all of the file details for a single member and stores them in a data set. %getZipDetails (macro routine #3) iterates through the list of ZIP file members, calls %getZipMemberInfo on each, and concatenates the results into a single output data set.
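As a rough sketch of the first step (not the actual %listZipContents code), the members of a single archive can be listed by treating a ZIP fileref as a directory and reading it with DOPEN, DNUM, and DREAD. The fileref name inzip, the output data set work.members, and the archive path are assumptions chosen for illustration.

```sas
/* Assumed archive path; substitute a ZIP file that exists on your system */
filename inzip zip "C:\ZIPPED_Examples\SudokuSolver_src.zip";

data work.members(keep=memname);
  length memname $ 200;
  fid = dopen("inzip");          /* open the archive as a directory */
  if fid = 0 then stop;          /* could not open: leave the output empty */
  memcount = dnum(fid);          /* number of entries in the archive */
  do i = 1 to memcount;
    memname = dread(fid, i);
    /* folder entries end with a slash; keep only real files */
    if substr(memname, length(memname), 1) ne '/' then output;
  end;
  rc = dclose(fid);
run;

filename inzip clear;
```

Each row of work.members then names one member that could be opened with FILENAME ZIP and the member= option.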

Here's a sample usage:

  %listZipContents (targdir=C:\Projects\ZIPPED_Examples, outlist=work.zipfiles);
  %getZipDetails (inlist=work.zipfiles, outlist=work.zipdetails);
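The iterate, call, and concatenate pattern described for %getZipDetails could be sketched roughly as follows. This is not the code from the gist: %memberInfoSketch is a hypothetical stand-in for %getZipMemberInfo, the column names zip and memname in work.zipfiles are assumed, and the sketch would break on paths that contain commas, parentheses, or macro triggers such as & and %.

```sas
/* Hypothetical stand-in for %getZipMemberInfo: builds a one-row data set
   for a single member and appends it to work.details_sketch */
%macro memberInfoSketch(zipfile, member);
  filename zm zip "&zipfile" member="&member";
  data work._one(keep=zipname membername filesize);
    length zipname $ 1024 membername $ 200 filesize 8;
    fid = fopen("zm", "S");
    if fid then do;
      zipname    = finfo(fid, "Filename");
      membername = finfo(fid, "Member Name");
      filesize   = input(finfo(fid, "Size"), 15.);
      output;
      rc = fclose(fid);
    end;
  run;
  filename zm clear;
  proc append base=work.details_sketch data=work._one;
  run;
%mend memberInfoSketch;

/* Start from an empty result on each run */
proc datasets lib=work nolist;
  delete details_sketch;
quit;

/* Queue one macro call per row; %nrstr defers execution until this step ends */
data _null_;
  set work.zipfiles;  /* assumed columns: zip, memname */
  call execute(cats('%nrstr(%memberInfoSketch)(', zip, ',', memname, ')'));
run;
```

Wrapping the macro name in %NRSTR keeps CALL EXECUTE from running the macro while the driving DATA step is still executing, so each generated call runs as ordinary code afterward.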

I tried to add decent comments to my program so that interested coders can study and adapt as needed. Here's a snippet of code that uses the FINFO function, which is really the important part for retrieving these file details.

/*
 Assumes an assignment like:
  FILENAME F ZIP "C:\ZIPPED_Examples\SudokuSolver_src.zip" member="src/AboutThisProject.txt";
 &f is expected to resolve to the fileref name (F in this example).
*/
fId = fopen("&f","S");   /* open the ZIP member for sequential input */
if fId then
  do;
    infonum = foptnum(fId);   /* number of attributes available for this file */
    do i = 1 to infonum;
      infoname = foptname(fId,i);
      select (infoname);
        when ('Filename') filename=finfo(fId,infoname);
        when ('Member Name') membername=finfo(fId,infoname);
        when ('Size') filesize=input(finfo(fId,infoname),15.);
        when ('Compressed Size') compressedsize=input(finfo(fId,infoname),15.);
        when ('CRC-32') crc32=finfo(fId,infoname);
        when ('Date/Time') filetime=input(finfo(fId,infoname),anydtdtm.);
        otherwise;   /* skip other attributes; SELECT raises an error if nothing matches */
      end;
    end;
    compressedratio = compressedsize / filesize;
    output;
    fId = fclose(fId);
  end;

The FINFO function in SAS provides access to file attributes and their values for a given file that you've accessed using the FOPEN function. The available file attributes can differ according to the type of file (FILENAME access method) that is used. ZIP files, as you can guess, have some attributes that are specific to them: "Compressed Size", "CRC-32", and others. This code checks for all of the available attributes and keeps those that we need for our detailed output. (And see the use of the SELECT/WHEN statement? So much more readable than a bunch of IF/THEN/ELSEs.)
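To see which attributes are available for a given ZIP member before deciding which ones to keep, a minimal sketch like the following writes every attribute name and value to a data set. The fileref f, the output data set work.member_attrs, and the variable infovalue are assumptions for illustration; the archive path and member name are borrowed from the comment in the snippet above.

```sas
/* Assumed archive and member; substitute your own */
filename f zip "C:\ZIPPED_Examples\SudokuSolver_src.zip"
         member="src/AboutThisProject.txt";

data work.member_attrs(keep=infoname infovalue);
  length infoname $ 40 infovalue $ 1024;
  fid = fopen("f", "S");             /* open the member for sequential input */
  if fid then do;
    do i = 1 to foptnum(fid);        /* loop over every available attribute */
      infoname  = foptname(fid, i);
      infovalue = finfo(fid, infoname);
      output;                        /* one row per attribute */
    end;
    rc = fclose(fid);
  end;
run;

filename f clear;
```

Running PROC PRINT on work.member_attrs shows the exact attribute names to use in the WHEN clauses.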

Look, I'm not going to claim that my approach to this problem is the most elegant or most efficient -- but it works. If it can be improved, then I'm sure I'll hear from a few of you experts out there. Bring it on!

About Author

Chris Hemedinger

Director, SAS User Engagement

Chris Hemedinger is the Director of SAS User Engagement, which includes our SAS Communities and SAS User Groups. Since 1993, Chris has worked for SAS as an author, a software developer, an R&D manager and a consultant. Inexplicably, Chris is still coasting on the limited fame he earned as an author of SAS For Dummies.

9 Comments

  1. Dear Chris,

    I suspect it won't, but I suppose it can't hurt to ask: Does this work with gzip, i.e. in a UNIX type environment?

    Thank you,

    Jim

    • Chris Hemedinger

      Jim, that's coming soon! GZ support hits at SAS 9.4 Maint 5, mere weeks away. I'll have to update all of these blog posts then.

      • Chris! That's great news! Have I mentioned how intelligent and good looking you are, lately? Keep up the good work.

        Jim

  2. Hi Chris,

    I am trying to read S3 files from AWS in a SAS BASE program. Is that possible?
    I have found the 'proc s3' procedure in the documentation, but it's not available in my version (SAS EG 7.11).
    Do you know any other way?

    Thanks

  3. Just an FYI. As XiaobinDC pointed out, there is a typo in the _zipfiles step where the closing paren for lowcase is misplaced. Took me a while to figure out why that step wasn't working.

  4. Hi Chris,

    First of all, I would like to say that the idea of using FINFO to obtain file information from a ZIP archive member is really nice. However, the implementation needs a sequential read per ZIP member, which makes it rather slow when the ZIP file contains many file members. Therefore I have implemented the ZIP member FINFO in such a way that it is only used if the ZIP archive contains one single file. Here is the link to the macro I created: https://github.com/paul-canals/toolbox/tree/master/custom/create_zip%20(standalone%20version).

    Thanks & best regards,
    Paul
