Copy a file using a SAS program: another method


A couple of years ago I shared a method for copying any file within a SAS program. It was a simple approach, copying the file byte-by-byte from one fileref (SAS file reference) to another.

My colleague Bruno Müller, a SAS trainer in Switzerland, has since provided a much more robust method. Bruno's method has several advantages:

  • It's coded as a SAS macro, so it is simple to reuse -- similar to a function.
  • It copies the file content in chunks rather than byte-by-byte, so it's more efficient.
  • It provides good error checks and reports any errors and useful diagnostics to the SAS log.
  • It's an excellent example of a well-documented SAS program!

Bruno tells me that "copying files" within a SAS program -- especially from nontraditional file systems such as Web sites -- is a common need among his SAS students. I asked Bruno for his permission to share his solution here, and he agreed.

To use the macro, you simply define two filerefs: _bcin (source) and _bcout (target), then call the %binaryFileCopy() macro. Here is an example use that copies a file from my Dropbox account:

filename _bcin TEMP;
filename _bcout "C:\temp\streaming.sas7bdat";
proc http method="get" 
 url="https://dl.dropbox.com/s/pgo6ryv8tfjodiv/streaming.sas7bdat" 
 out=_bcin
;
run;
 
%binaryFileCopy()
%put NOTE: _bcrc=&_bcrc;
 
filename _bcin clear;
filename _bcout clear;

The following is partial log output from the program:

NOTE: BINARYFILECOPY start  17SEP2013:20:50:33
NOTE: BINARYFILECOPY infile=_bcin C:\SASTempFiles\_TD5888\#LN00066
NOTE: BINARYFILECOPY outfile=_bcout C:\temp\streaming.sas7bdat

NOTE: BINARYFILECOPY processed 525312 bytes
NOTE: DATA statement used (Total process time):
      real time           0.20 seconds
      cpu time            0.07 seconds    

NOTE: BINARYFILECOPY end  17SEP2013:20:50:34
NOTE: BINARYFILECOPY processtime 00:00:00.344

You can download the program -- which should work with SAS 9.2 and later -- from here: binaryfilecopy.sas

Update: using FCOPY in SAS 9.4

Updated: 18Sep2013
Within hours of my posting here, Vince DelGobbo reminded me about the new FCOPY function in SAS 9.4. With two filerefs assigned to binary-formatted files, you can use FCOPY to copy the content from one to the other. When I first tried it with my examples, I had problems because of the way FCOPY treats logical record lengths. However, Jason Secosky (the developer for FCOPY and tons of other SAS functions) told me that if I use RECFM=N on each FILENAME statement, the LRECL would not be a problem. And of course, he was correct.

Here's my example revisited:

filename _bcin TEMP recfm=n /* RECFM=N needed for a binary copy */;
filename _bcout "C:\temp\streaming.sas7bdat" recfm=n;
 
proc http method="get" 
 url="https://dl.dropbox.com/s/pgo6ryv8tfjodiv/streaming.sas7bdat" 
 out=_bcin
;
run;
 
data _null_;
   length msg $ 384;
   rc=fcopy('_bcin', '_bcout');
   if rc=0 then
      put 'Copied _bcin to _bcout.';
   else do;
      msg=sysmsg();
      put rc= msg=;
   end;
run;
 
filename _bcin clear;
filename _bcout clear;

About Author

Chris Hemedinger

Director, SAS User Engagement

Chris Hemedinger is the Director of SAS User Engagement, which includes our SAS Communities and SAS User Groups. Since 1993, Chris has worked for SAS as an author, a software developer, an R&D manager and a consultant. Inexplicably, Chris is still coasting on the limited fame he earned as an author of SAS For Dummies.

22 Comments

  1. This is very cool. I would like to use it to copy some .csv files from our Windows LAN to our Citrix/Unix server. Is this possible?

    If so, how do I define the FILENAME to do that?
    For importing/exporting Excel files with PROC IMPORT/EXPORT, we use:

     port=9621  
     server_name="zzzdsmsas001"
     serveruser="&My_User_ID"
     serverpass="&My_LAN_Password" 
    

    (although I realize the interface to PC files probably has different requirements than the Filename statement).

    I always enjoy your blog posts - very informative!

    • Chris Hemedinger

      Mary, it should be possible as long as your UNIX server can reach the Windows file location in some way. This might be a network mounted filesystem, or an FTP server (using FILENAME FTP), or a web server (using FILENAME URL). Here I'm assuming that your SAS is on Unix, while the files you need are in Windows.

      You can also use EG (assuming you've got that in your Citrix environment) and the Copy Files task to achieve a similar end.
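
      As a rough sketch of the web server option, assuming the %binaryFileCopy macro from the post has already been compiled in the session: the URL and the target path below are hypothetical placeholders, not real locations.

      ```sas
      /* Hypothetical source URL and UNIX target path; replace with your own */
      filename _bcin url "http://fileserver.example.com/data/sales.csv";
      filename _bcout "/home/mary/data/sales.csv";

      /* Copy from the web server to the UNIX file system */
      %binaryFileCopy()
      %put NOTE: _bcrc=&_bcrc;

      filename _bcin clear;
      filename _bcout clear;
      ```

      The same pattern should apply to a network-mounted path or a FILENAME FTP fileref; only the _bcin definition changes.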

  2. Glad to see that we have new functionality in SAS 9.4 that makes coding easier.
    One thing I find useful about the macro: you do not have to worry about the RECFM option. A fileref that points to the file's location, whether that is disk, FTP, WebDAV, etc., is sufficient.

  3. Hi! I have tested the restoreTitles macro, but for the line with ...where type="T", only one row is selected (the macro variable SQLOBS resolves to 1). How can I fix it so that both rows are selected?
    The log:

    16         options mprint symbolgen;
    17         
    18         /* Define macro to change titles */
    19         %macro changeTitles;
    20           proc means data=sashelp.class;
    21           title "The sky is the limit";
    22           title2 "for this macro";
    23           footnote "Created by SO";
    24           var weight;
    25         run;
    26         %mend;
    27         
    28         /* Define macro to save titles */
    29         %macro saveTitles;
    30           data _savedTitles;
    31             set sashelp.vtitle;
    32           run;
    33         %mend;
    34         
    35         /* Define macro to restore previously saved titles */
    36         %macro restoreTitles;
    37           proc sql noprint;
    38             /* Using a SAS 9.3 feature that allows open-ended macro range */
    39             select text into :SavedTitles1 from _savedTitles where type="T";
    40             %let SavedTitlesCount = &sqlobs.;
    41         
    42             /* and footnotes */
    43             select text into :SavedFootnotes1 from _savedTitles where type="F";
    44             %let SavedFootnotesCount = &sqlobs.;
    45         
    46             /* remove data set that stored our titles*/
    47             drop table _savedTitles;
    48           quit;
    49         
    50           /* emit statements to reinstate the titles */
    51           TITLE; /* clear interloping titles */
    52           %do i = 1 %to &SavedTitlesCount.;
    53             TITLE&i. "&&SavedTitles&i.";
    54           %end;
    55         
    56           FOOTNOTE; /* clear interloping footnotes */
    57           %do i = 1 %to &SavedFootnotesCount.;
    58             FOOTNOTE&i. "&&SavedFootnotes&i.";
    59           %end;
    60         %mend;
    

    • Chris Hemedinger

      Simon,

      I think you're missing the dash on the macro range:

      select text into :SavedTitles1- from _savedTitles where type="T"
      

      That uses a feature from SAS 9.3. If you have SAS 9.2, then you'll first have to do a SELECT COUNT to get the number of "T" records, then fill in the range. That technique is in this post.
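
      A minimal sketch of that SAS 9.2 approach, reusing the _savedTitles table and macro variable names from the log above (this is an illustration, not the exact code from the linked post):

      ```sas
      proc sql noprint;
        /* Step 1: count the saved title records */
        select count(*) into :SavedTitlesCount
          from _savedTitles where type="T";
        /* Re-assign to strip the leading blanks that INTO leaves on numeric values */
        %let SavedTitlesCount = &SavedTitlesCount;

        /* Step 2: fill an explicitly bounded range of macro variables */
        select text into :SavedTitles1 - :SavedTitles&SavedTitlesCount
          from _savedTitles where type="T";
      quit;
      ```

      This sketch assumes at least one title was saved; with zero "T" records the range would end at SavedTitles0, so that case needs a guard of its own.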

  4. Hi, Chris,

    I have a question about bulk file copying done with SAS. The volume of files is over 1 million (TIF image files averaging 50 KB each).

    My DATA step for this operation takes the file source and destination from columns in a table, then calls the DOS copy command to do the actual copy. The files are copied one at a time because the source and destination paths are structured differently, and each file has its own source and destination.

    Below is the line where the copy command is called (after confirming source path exists):
    ------------------------------------------------------------------------------------------------------
    if (fileexist(Source)) then RC=system('copy /y ' !! Source !! ' ' !! Out_FilePath);
    ------------------------------------------------------------------------------------------------------
    Last time it was run, the log showed the resulting times.
    real time 16:42:48.68
    cpu time 1:52:14.81

    It looks like an I/O bottleneck, but just in case, do you know of more efficient SAS code to do this? I looked at the SAS configuration: memsize was set to 2GB, and I don't know if I should be playing with those settings (this operation is done on a virtual server).

    • Chris Hemedinger

      That's a ton of files! Since each file has a bit of overhead to copy, I'm not sure how to speed it up. Have you looked into specialized tools like Robocopy or TeraCopy?

      • If the file structure between source and destination were the same, I would use robocopy with the /mir (mirror) option. In fact, I already use it for another process where the source and destination structure are the same.

        Right now I have to access the SAS table and get the paths one by one (I'm not allowed to change the file name or the path structure). I had an idea of putting the path information in the file name (so I don't have to look up the SAS table during the copy), then changing it back after the copy is done. But that would just mean extra processing time spent on renaming files twice.

        As for TeraCopy, my work does not allow me to download other copy tools.

        Question for you, was I correct in assuming this is an IO issue given the SAS log times? The cpu time was how long SAS took to read through the table, and the rest was spent on the actual reading and writing of files?

        real time 16:42:48.68
        cpu time 1:52:14.81

        Thanks for your input.

        • Chris Hemedinger

          My uneducated guess is Yes -- that's I/O time. CPU time is calculations -- building filenames and such.

  5. Hello Chris,

    I successfully used this method for files in the first layer of a zip file with

    filename _bcin "&in_dir.&fn" member="&memname.";

    However members in a subfolder result in

    ERROR: No logical assign for filename _BCIN.
    ERROR: Error in the FILENAME statement.

    Is there some tip for using FILENAME with a ZIP file and selecting a member in a subfolder within the ZIP file?

    Thank you for any help you can give.

  6. FCOPY, if I recall, does not change across operating systems, whereas system commands (DOS copy and Linux/Unix cp) obviously do. In a Windows 10 test, FCOPY (or DOS copy, for that matter) did NOT preserve the datetime metadata and, in fact, the "Date created" value was less than the "Date modified". This can be problematic in certain environments. Renaming using DOS rename did preserve the datetime metadata, so that suggests that zipping the file, then moving the zip file, then extracting could work, but a test showed that "Date created" then became "Date modified", which in this case was older. Note that the ZIP engine for FILENAME also did not preserve these metadata; we discussed in the comments about using a system command and a third-party compression program: https://blogs.sas.com/content/sasdummy/2015/05/11/using-filename-zip-to-unzip-and-read-data-files-in-sas/

  7. Hello Chris,

    I've got LEHD Employment Statistics files in csv.gz format. I extracted them using 7-Zip; however, Excel's row limit (1,048,576) truncates the data. To get the full set of records, I'd like to read the file into SAS. Can you please help? Thank you.

    I used:
    filename year02_gz ZIP "file_path/mo_od_main_JT00_2002.csv.gz"
    data mylib.Year02;
    infile 7zip(mo_od_main_JT00_2002.csv);
    input w_geocode 16.;
    put _All_;
    run;

    • Chris Hemedinger

      Yes, if the web site has an API for doing so. PROC HTTP with a POST or PUT method can be used to add content to sites that allow it.

  8. Hi Chris, sorry to come to this years late but it sparked my interest. Your blog post "Using FILENAME ZIP to unzip and read data files in SAS" (https://blogs.sas.com/content/sasdummy/2015/05/11/using-filename-zip-to-unzip-and-read-data-files-in-sas/) has an example of reading a SAS data set in a ZIP file that I used successfully, but your reply to Andreas Menrath's 5/25/2015 note had a link to this post where you note that another method was much more robust. That made me nervous so I tried to apply the FCOPY method in this post to a ZIP file.

    First, to verify the FCOPY code, I successfully copied a SAS data set from one location to another using the method you showed above, e.g.,

    libname xxx 'U:\M1XXX00\ziptests\dir1';
    data xxx.one;
      x=1;
    run;
    libname xxx clear;
    filename out1 'U:\M1XXX00\ziptests\dir1\one.sas7bdat' RECFM=N;
    filename in1  'U:\M1XXX00\ziptests\dir2\one.sas7bdat' RECFM=N;
    data _null_;
       length msg $ 384;
       rc=fcopy('out1', 'in1');
       if rc=0 then
          put 'Copied out1 to in1.';
       else do;
          msg=sysmsg();
          put rc= msg=;
       end;
    run;
    

    But when I tried to use the same code to copy a SAS data set from a ZIP file to a SAS library, I got an error. As noted above, I used your code from the earlier blog post to successfully read from this small ZIP file, which I created manually in Windows, so I am confident the ZIP file is valid.

    1    filename outzip1 zip "U:\M1XXX00\ziptests\unzip2.zip" member="unzip2/data/class2.sas7bdat"
    1  ! recfm=n; /* Source: in zip file */
    2    filename in1 'U:\M1XXX00\ziptests\dir2\one.sas7bdat' RECFM=N; /* Destination: SAS library */
    3    data _null_;
    4       length msg $ 384;
    5       rc=fcopy('outzip1', 'in1');
    6       if rc=0 then
    7          put 'Copied outzip1 to in1.';
    8       else do;
    9          msg=sysmsg();
    10         put rc= msg=;
    11      end;
    12   run;
    
    rc=-7440230
    msg=WARNING: 0 records were truncated when the FCOPY function read from fileref OUTZIP1. 4 records
     were truncated when the FCOPY function wrote to fileref IN1. To prevent the truncation of records
     in future operations, you can increase the amount of space needed to accommodate the records by u
    sing the LRECL= system option or the LRECL= option in the FILENAME statement.
    NOTE: DATA statement used (Total process time):
    

    I don't understand this error. Should I read the SAS data set in this ZIP file this way, or is the method in the earlier post sufficiently robust? Thanks so much for looking at this.

    • Chris Hemedinger

      Hi Bruce, the DATA step binary copy method (read with INFILE then write to a FILE) is reliable I think. FCOPY() should work in theory, but I have seen cases where it doesn't work properly when trying to copy out of a ZIP member.
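
      For reference, a DATA step binary copy of this kind might look like the sketch below. The fileref names src and dst, the destination file name, and the 256-byte chunk size are assumptions chosen for illustration; the ZIP path and member name are taken from Bruce's log.

      ```sas
      /* Source: a member inside the ZIP file */
      filename src zip "U:\M1XXX00\ziptests\unzip2.zip"
               member="unzip2/data/class2.sas7bdat";
      /* Destination: hypothetical target path for the extracted data set */
      filename dst "U:\M1XXX00\ziptests\dir2\class2.sas7bdat";

      data _null_;
         /* Read fixed 256-byte chunks; LEN receives the actual length of each chunk */
         infile src lrecl=256 recfm=F length=len eof=eof unbuf;
         /* Write as a byte stream, with no record separators */
         file dst lrecl=256 recfm=N;
         input;
         /* Write only the bytes actually read, so a short final chunk is not padded */
         put _infile_ $varying256. len;
         return;
       eof:
         stop;
      run;

      filename src clear;
      filename dst clear;
      ```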

      • Thanks Chris, I'll stick to the DATA step binary copy method. And I might have another question in another ~7 years. ;-)
