Copy an entire process flow in SAS Enterprise Guide

I've seen some crazy process flows in SAS Enterprise Guide. Crazy-big, and crazy-complex, used by real customers to accomplish real work. But while these process flows represent a ton of work, this is usually a calculated investment to automate processes that would be difficult to capture in another way.

For years, SAS Enterprise Guide users have asked for a way to reuse their process flows in new projects. You have always been able to copy-and-paste individual items (tasks, queries, programs) from one project to another, or from one flow to another in the same project. But when you tried to copy a collection of items from a flow, everything fell apart when you pasted into the new destination. The links/relationships among the objects were not retained, and that sabotaged your goal of saving time and effort.

In SAS Enterprise Guide 7.1, this is finally improved. You can now copy an entire flow, or multiple selected items from a flow, and paste that content intact into another project or process flow.

To copy an entire flow, right-click on the flow name within the Project Tree, then select Copy. In your destination project, right-click on an empty spot within the Project Tree and you'll see that Paste is enabled. When you paste, you'll see the entire flow transfer over, including the links and layout. It's like magic.

To copy just a portion of the process flow, click-drag the cursor to "rubberband" a selection of connected items. (You can also use Ctrl+click to select the items that you want to include.) With the items selected, right-click on one of the selected items and choose Copy. You can then Paste the content into a new or existing process flow.

You can paste a process flow into a new or an existing SAS Enterprise Guide project. Try it for yourself -- see how much time it can save for you!

Note: this feature was added in SAS Enterprise Guide 7.1. The first update (7.11) improved it further (such as capturing project prompts that are referenced in your flow).

Post a Comment

Ask the Expert: Creating custom tasks for SAS Enterprise Guide

If you have not yet discovered the new Ask the Expert series on the SAS Training site, you are missing out on a treasure. Visit the site right now and review all of the available topics, from "Newbie" to Analytics to Visualization to good ol' SAS programming. Go on; I'll wait.

Ask the Expert - they wear glasses, apparently
Welcome back! Amazing, right? You can get lost for hours learning new stuff or just reviewing what you thought you already knew. Some topics are available as live sessions that you can "attend" as they happen, but many of them are available on-demand, for free, no strings attached.

You might have noticed that I have a humble contribution in the "expert" collection: Developing Custom Tasks for SAS Enterprise Guide. During the 37-minute video, I lead you through the uses of SAS custom tasks and the basic steps for creating your own task. You'll learn what custom tasks can do and what they cannot do. You'll learn about the tools and APIs that support the creation of tasks. And you'll see the "inside" of a completed custom task project. (An aside: I owe a big "thank you" to the team that post-processed the recording of my expert talk -- I know that I wasn't that smooth and concise when I recorded it!)

It's a short introduction and watching it won't make you an expert in the topic, but it will help you to decide whether to learn more. You can learn more from my book on this topic, or you can arrange to attend an offering of the two-day course that we offer occasionally. Or you can learn it all on your own, as many have. There are plenty of examples and references to work from. If you're wondering what skills you should have before taking the class, watch my "about this course" video here.

In the "ask the expert" video, I referenced a collection of API libraries that make it easier to set up your custom task projects. These API libraries are available for each version of SAS Enterprise Guide, and they allow you to create tasks that are compatible with multiple versions of the SAS applications, even if you do not have those particular versions installed. (You still need at least one version of SAS Enterprise Guide or SAS Add-In for Microsoft Office in order to test and run your custom task.)

With the permission of the SAS R&D developers, I have made those libraries available here:

>> Download custom task API libraries (ZIP file 333KB)

The README.txt file in the ZIP file explains how to use the libraries.

If you have questions as you start your custom task adventures or if you just want to brag about your successes, post back here or on the SAS Enterprise Guide community. I'd love to hear from you!

Read More »

Post a Comment

Using LIBNAME XLSX to read and write Excel files

When you weren't watching, SAS did it again. We smuggled Yet Another Excel Engine into a SAS release.

SAS 9.4 Maintenance 2 added the XLSX engine, which allows you to read and write Microsoft Excel files as if they were data sets in a library. The big advantage of using this engine is that it accesses the XLSX file directly, and doesn't use the Microsoft data APIs as a go-between. (LIBNAME EXCEL and LIBNAME PCFILES rely on those Microsoft components.) That means that you can use this engine on Windows or Unix systems without having to worry about bitness (32-bit versus 64-bit) or setting up a separate PC Files Server process.

The XLSX engine does require a license for SAS/ACCESS to PC Files. Are you a SAS University Edition user? The SAS/ACCESS product is part of that package, so this technique works there. It's an easy way to get well-formed Excel data into your SAS process.

/* because Excel field names often have spaces */
options validvarname=any;
libname xl XLSX '/folders/myfolders/sas_tech_talks_15.xlsx';
/* discover member (DATA) names */
proc datasets lib=xl; quit;
libname xl CLEAR;

Example output:

Once the library is assigned, I can read the contents of a spreadsheet into a new SAS data set:

/* because Excel field names often have spaces */
options validvarname=any;
libname xl XLSX '/folders/myfolders/sas_tech_talks_15.xlsx';
/* read in one of the tables */
data confirmed;
  set xl.confirmed;
libname xl CLEAR;

And here's the result in my SAS University Edition:

Sometimes you need just one value from a spreadsheet. That's a common use case for dynamic data exchange (DDE), which isn't as feasible as it once was. With the XLSX engine, you can use FIRSTOBS= and OBS= options to control how much data you retain:

/* read in just one value */
data _null_;
  set xl.confirmed (firstobs=6 obs=6 keep='Job title'n);
  call symput('VALUE','Job Title'n);
%put &value;


 76         %put &value;
 Testing Manager,  Quality-driven User Experience Testing

You can also use the XLSX engine to create and update XLSX files.

libname xlout XLSX '/folders/myfolders/samples.xlsx';
data xlout.classfit;
  set sashelp.classfit;
data xlout.air;
  set sashelp.air;
libname xlout clear;

Here is my output in Microsoft Excel with all of these data sets now as sheets:

Remember, you can also create Microsoft Excel files with Base SAS by using ODS EXCEL -- experimental in 9.4 Maintenance 2 but production in Maintenance 3 (coming soon).

The XLSX libname is different from the EXCEL and PCFILES engines in other ways. For example, the XLSX engine does not support Excel named ranges (which can surface a portion of a spreadsheet as a discrete table). Also, you won't see the familiar "$" decoration around the spreadsheet names when they are surfaced in the library within SAS. If you need that sort of flexibility, you can use PROC IMPORT to provide more control over exactly what Excel content is brought into SAS and how.

One other IMPORTANT caution: The XLSX engine is a sequential access engine in that it processes data one record after the other. The engine starts at the beginning of the file and continues in sequence to the end of the file. Some techniques to MODIFY the data in-place will not work. Also, some SAS data viewers cannot render the data from the XLSX engine. SAS VIEWTABLE and SAS Enterprise Guide and even SAS Studio can't open these tables directly in the data grid view. VIEWTABLE gives you a nice message, but SAS Enterprise Guide simply "hangs" in the attempt. For that reason, I recommend using DATA step to copy the Excel content that you want to another SAS library, then CLEAR the XLSX library to avoid accidentally opening a table in a viewer that won't support it. (This is currently a bug in SAS Enterprise Guide that should be fixed in a future release.)

I have found LIBNAME XLSX to be a quick, convenient method to bring in Excel data on any SAS platform. If you have SAS 9.4 Maintenance 2 or later, try it out! Let me know how it works for you by sharing a comment here.

Post a Comment

Filter your SAS Enterprise Guide data with a WHERE clause

Are you a VIEWTABLE fan from the SAS Windowing Environment (a.k.a. Display Manager, DMS, PC SAS)? If so, the latest version of SAS Enterprise Guide has a new feature that you'll love.

With the latest update to SAS Enterprise Guide 7.1 (7.11), you can now subset your data in the data grid by typing a WHERE clause filter. The new Where button appears at the top of the data grid.

Simply click the button, and type a filter expression into the text field. Hit the Enter key to apply the filter:

Just as you might expect from SAS, you can build more complex expressions too. For example, you can transform the variable that you're comparing to lower case so that case-insensitive filters are easier to build:

And if you have continuous variables such as a date variable, you can subset by using BETWEEN to specify the boundaries you want to see:

Or transform that continuous range to a category (such as a YEAR) with a SAS function, on the fly:


SAS Enterprise Guide has always allowed you to subset data easily, but only by adding a step in the process flow (either a Query Builder task or Filter and Sort task). Now, you can accomplish this "in place", saving time and storage as you explore. Let us know what you think of the new feature -- either here in the blog comments or within the SAS Enterprise Guide community.

Post a Comment

Using FILENAME ZIP to unzip and read data files in SAS

I've written about how to use the FILENAME ZIP method to read and update ZIP files in your SAS programs. The ZIP method was added in SAS 9.4, and its advantage is that you can accomplish more in SAS without having to launch external utilities such as WinZip, gunzip, or 7-Zip.

Several readers replied with questions about how you can use the content of these ZIP files within your SAS program. The basic scenario is: "I've got some data files in my ZIP archive. I want to use SAS to unzip these and then use them as data within my SAS process. Can I do this?"

Yes, you can -- but it does require an extra step. Even though FILENAME ZIP can show you the contents and structure of your ZIP file, most SAS procedures cannot access the content directly while it's in the archive. So, the additional step is to copy the file to another location, effectively extracting it from the ZIP file.

As an example, I created a ZIP file with two files and a subfolder:
  |__ sas_tech_talks_15.xlsx
  |__ sas/
      |__ instanttitles.sas7bdat

This SAS program helps me to discover how FILENAME ZIP sees the file:

filename inzip ZIP "c:\projects\";
/* Read the "members" (files) from the ZIP file */
data contents(keep=memname isFolder);
 length memname $200 isFolder 8;
 if fid=0 then
 do i=1 to memcount;
  /* check for trailing / in folder name */
  isFolder = (first(reverse(trim(memname)))='/');
/* create a report of the ZIP contents */
title "Files in the ZIP file";
proc print data=contents noobs N;


        Files in the ZIP file                                         
 memname                       isFolder
 sas/                             1  
 sas/instanttitles.sas7bdat       0  
 sas_tech_talks_15.xlsx           0  
                N = 3

With this information, I can now "copy" the XLSX file out of the ZIP file and then import it into a SAS data set. Notice how I can use the "member" syntax (fileref with the file I want in parentheses) to address a specific file in the ZIP archive. I want to copy just from the actual files, and not the folder-level entries.

/* identify a temp folder in the WORK directory */
filename xl "%sysfunc(getoption(work))/sas_tech_talks_15.xlsx" ;
/* hat tip: "data _null_" on SAS-L */
data _null_;
   /* using member syntax here */
   infile inzip(sas_tech_talks_15.xlsx) 
       lrecl=256 recfm=F length=length eof=eof unbuf;
   file   xl lrecl=256 recfm=N;
   put _infile_ $varying256. length;
proc import datafile=xl dbms=xlsx out=confirmed replace;

Sample output from my SAS log:

NOTE: The infile INZIP(sas_tech_talks_15.xlsx) is:
      Member Name=sas_tech_talks_15.xlsx

NOTE: UNBUFFERED is the default with RECFM=N.
NOTE: The file XL is:
      Filename=C:\SAS Temporary Files\_TD396_\Prc2\sas_tech_talks_15.xlsx,
      RECFM=N,LRECL=256,File Size (bytes)=0,
      Last Modified=11May2015:11:38:59,
      Create Time=11May2015:11:20:23

NOTE: A total of 55 records were read from the infile library INZIP.
NOTE: 55 records were read from the infile INZIP(sas_tech_talks_15.xlsx).
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

To use the SAS data set in the file, I need to copy it into a location shared by a SAS library. In this example, I will again use the WORK location. Because my SAS data set is in a logical subfolder (named "sas") within the archive, I need to include that path as part of the member syntax on the INFILE statement.

/* Copy a zipped data set into the WORK library */
filename ds "%sysfunc(getoption(work))/instanttitles.sas7bdat" ;
data _null_;
   /* reference the member name WITH folder path */
   infile inzip(sas/instanttitles.sas7bdat) 
	  lrecl=256 recfm=F length=length eof=eof unbuf;
   file   ds lrecl=256 recfm=N;
   put _infile_ $varying256. length;
proc contents data=work.instanttitles;

Partial output in my example:

                             Files in the ZIP file                          
                             The CONTENTS Procedure

 Data Set Name        WORK.INSTANTTITLES            Observations          1475
 Member Type          DATA                          Variables             6   
 Engine               V9                            Indexes               0   
 Created              01/29/2015 15:09:54           Observation Length    248 
 Last Modified        01/29/2015 15:09:54           Deleted Observations  0   
 Protection                                         Compressed            NO  
 Data Set Type                                      Sorted                NO  
 Data Representation  WINDOWS_64                                              
 Encoding             wlatin1  Western (Windows)                              

Of course, all of this can be automated even further by writing SAS code that automatically iterates through the ZIP file member names and copies/imports each of the members as needed.

Post a Comment

How to convert a Unix datetime to a SAS datetime

timeI watched with wonder as each of my daughters learned how to "tell time." Early in their primary school careers, they brought home worksheets that featured clock faces with big-hand/little-hand configurations that they had to decipher, and exercises that asked them to draw the hands as they should appear given a specific elapsed time. Now my daughters have digital watches, and have even adopted my habit of wearing them with the face on the underside of the wrist and setting the display to 24-hour time -- which serves to confound their friends.

But, to my disappointment, their school has failed to teach my daughters how to read a SAS datetime value, or even the more ubiquitous Unix (POSIX) datetime value. As a parent of the 21st century, I feel it is my duty to fill this gap in their education.

First, let's review how these datetime values are represented:

  • A SAS datetime value is the number of seconds that have elapsed since midnight of January 1, 1960 (01JAN1960:00:00:00).
  • A Unix (or POSIX) datetime value is the number of seconds* that have elapsed since midnight of January 1, 1970 (01JAN1970:00:00:00).

Conversion from Unix to SAS representation is simple math:

/* Number of seconds between 01JAN1960 and 01JAN1970: 315619200 */
sasDT = unixDT + 315619200;

Having trouble remembering that 9-digit constant? Then you can get SAS to infer that part for you and use the dhms() function:

/* DHMS function calculates datetime when you provide values for */
/*    date, hour, minute, and seconds                            */
/* In this case, "seconds" is a very high value!                 */
sasDT = dhms('01jan1970'd, 0, 0, unixDT);

Raw Unix times are often expressed as UTC, and you might prefer to show local times in your SAS reporting. That's a simple calculation with the (undocumented) gmtoff() function:

/* Convert from UTC to local time using GMTOFF */
sasDT = dhms('01jan1970'd,0,0, unixDT + gmtoff());

Update: Bruno told me that TZONEOFF is an official SAS function in SAS 9.4! See the comments for more details how you can use it instead of GMTOFF.

I have one more variation that I use every day. I have a project that reports against a database that stores transaction times in milliseconds instead of seconds. (This uses the database's intrinsic int8 or BIGINT type.) SAS datetime values can represent fractions of seconds, so this conversion simply requires that I divide by 1000. I can use a different SAS datetime format to see the precision (if that's what I want, though it's usually not).

/* for more precision */
format sasDT datetime22.3; 
/* mlliseconds from transaction database */
unixDTms = 1429193343362;
sasDT_ms = dhms('01jan1970'd,0,0, unixDTms/1000);
/* result: 16APR2015:14:09:03.362 */

Here's a complete program that you can experiment with:

data t;
  format sasDT_const 
    sasDT_local datetime20.;
  /* for more precision */
  format sasDT_ms datetime22.3;
  /* Unix test datetime of 16APR2015:14:09:03 UTC */
  unixDT =   1429193343;
  /* mlliseconds from transaction database */
  unixDTms = 1429193343362;
  /* Number of seconds between 01JAN1960 and 01JAN1970: 315619200 */
  sasDT_const = unixDT + 315619200;
  /* DHMS function calculates datetime given values for */
  /*    date, hour, minute, and seconds                 */
  /* In this case, "seconds" is a very high value!      */
  sasDT_dhms = dhms('01jan1970'd,0,0,unixDT);
  /* converting a value from milliseconds */
  sasDT_ms = dhms('01jan1970'd,0,0,unixDTms/1000);
  /* Convert from UTC to local time using GMTOFF */
  /* use TZONEOFF function in 9.4! */
  sasDT_local = dhms('01jan1970'd,0,0,unixDT + gmtoff()); 

Sample output from this program (which I ran from my EDT timezone):
* I intentionally avoided any discussion of leap seconds that go along with UTC, but you can learn more about that here.

See also

Read a Microsoft datetime value into a SAS datetime value

Post a Comment

How to compare SAS programs in SAS Enterprise Guide

Copy/paste is my favorite method for creating new SAS programs. In my work projects, I maintain a sort of genealogy of SAS programs, because the DNA of one program can be used to spawn many other SAS programs as its progeny. When things (inevitably) aren't working as I intend in one of these next-generation programs, I find that I need to compare that program against the original to see what I've missed.

My file-comparison tool of choice is WinMerge. It's free and easy to install and use. That's why I'm really pleased with the new Compare Programs feature in SAS Enterprise Guide 7.1.

Here is how you set it up.

1. Install a file comparison tool.

I like WinMerge. But some of my colleagues prefer Beyond Compare or KDiff. Your choice! Most of these tools can integrate with your file system so that you can compare any two files from Windows Explorer. They also integrate with source control systems, if you use those (Git, Subversion, etc.).

2. Set your File Comparison options in SAS Enterprise Guide

Select Tools->Options, then the File Comparison tab. Check "Use custom file comparison tool" and then select your tool of choice. There are built-in presets for the three comparison tools that I mentioned, or you can go "off road" and select your favorite with "Other".

Select your DIFF tool

3. Select the two files that you want to compare

In the Project Tree or Process Flow, you can select multiple programs by using the standard Ctrl+Click sequence or by "rubber banding" a selection with the mouse (Process Flow only). When you have exactly two programs selected, right-click and select Compare.

Compare programs menu
The comparison tool launches with the contents of each program loaded for comparison. Here's what mine looks like:

Code compare in action
Some additional notes:

  • The content you see in the comparison tool is a copy of your program. Therefore, if you use the comparison tool to make changes, these won't be reflected in your SAS Enterprise Guide project. Of course, you can copy the changed content to the Windows clipboard and paste it into the SAS Enterprise Guide program editor.
  • You can do more with file comparison when using the Program History feature, which I introduced in this blog post about new programming features in SAS Enterprise Guide 7.1.
Post a Comment

Project Reviewer task and Copy Files task: refreshed for v7.1

Two popular SAS custom tasks have recently been updated for SAS Enterprise Guide 7.1. Most custom tasks that I've shared will work without modification across releases, but these two required a special rebuild due to some internal product API changes.

The Project Reviewer task allows you to see a detail view of the tasks/programs within your SAS Enterprise Guide process flows. The listing shows the task names and types, when they were added to the project and last modified, and how much time they take to run. You can also create a SAS-based ODS report from the data, which is a great way to use SAS Enterprise Guide to report on your SAS Enterprise Guide project. (Wow, that's so meta!)

Download Project Reviewer: See this blog post for more information and download link

The Copy Files task is used to move files from your local PC to your SAS session, or from your SAS session to your local PC. This allows you to capture these file transfer steps within your process flow so that they are repeatable. The task supports the use of macro variables and wildcards, which gives you lots of flexibility when designing a process in which the file names and quantities are subject to change.

Copy Files task
Download Copy Files: See this blog post for more information and download link

If you use these tasks (or other custom tasks) and find them useful, please leave a comment! I love to hear how the tasks are being used. And sometimes useful custom tasks find their way into the actual product in a future release. Your feedback is important to help to make that happen.

Other related articles

Post a Comment

Can you learn how to create custom tasks?

To develop a custom task for use in SAS Enterprise Guide (or SAS Add-In for Microsoft Office), you need a few things:

  • A working knowledge of SAS and SAS Enterprise Guide
  • Microsoft Visual Studio (the free Express edition will do, though the Professional edition is better)
  • Some experience with C# or Visual Basic -- or at least with an object-oriented framework such as Java or C++.

It helps to have a "how to" guide, and I've written one of those. And if you're the sort of person who learns best in the classroom, you can now take a class.

Are you wondering whether the class right for you? Watch this video (about 4 minutes long, in which I do my best Troy McClure impersonation). I'll explain what custom tasks are and what you need to know in order to build them. I'll also explain what you can expect to learn from the class, and the skills you should have before you enroll.

To enroll:
Visit the SAS Training site and find an offering that works with your schedule and location.

Post a Comment

How to split one data set into many

splitdataBack in the day when the prison system forced inmates to perform "hard labor", folks would say (of someone in prison): "He's busy making little ones out of big ones." This evokes the cliché image of inmates who are chained together, forced to swing a chisel to break large rocks into smaller rocks. (Yes, it seems like a pointless chore. Here's a Johnny Cash/Tony Orlando collaboration that sets it to music.)

SAS programmers are often asked to break large data sets into smaller ones. Conventional wisdom says that this is also a pointless chore, since you can usually achieve what you want (that is, process a certain subset of data) by applying a WHERE= option or FIRSTOBS=/OBS= combination. Splitting a data set creates more files, which occupy more disk space and forces more I/O operations. I/O and disk access is often the most expensive part of your SAS processing, performance-wise.

But if the boss asks for broken-up data sets, you might as well spend the least possible effort on the task. Let's suppose that you need to break up a single data set into many based on the value of one of the data columns. For example, if you need to break SASHELP.CARS into different tables based on the value of Origin, the SAS program would look like:

DATA out_Asia;
DATA out_Europe;

I'm going to admit right now that this isn't the most efficient or elegant method, but it's something that most beginning SAS programmers could easily come up with.

Writing the above program is easy, especially since there are only 3 different values for Origin and I've memorized their values. But if there are more discrete values for the "split-by" column, the task could involve much more typing and has a high possibility for error. This is when I usually use PROC SQL to generate the code for me.

If you've read my article about implementing BY processing for an entire SAS program, you know that you can use PROC SQL and SELECT INTO to place data values from a data set into a macro variable. For example, consider this simple program:

proc sql;
 select distinct ORIGIN into :valList separated by ',' from SASHELP.CARS;

It creates a macro variable VALLIST that contains the comma-separated list: "Asia,Europe,USA".

But we can use SAS functions to embellish that output, and create additional code statements that weave the data values into SAS program logic. For example, we can use the CAT function to combine the values that we query from the data set with SAS keywords. The results are complete program statements, which can then be referenced/executed in a SAS macro program. I'll share my final program, and then I'll break it down a little bit for you. Here it is:

/* define which libname.member table, and by which column */
%let COLUMN=origin;
proc sql noprint;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct 
   cat("DATA out_",compress(&COLUMN.,,'kad'),
   "; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
   "')); run;") into :allsteps separated by ';' 
  from &TABLE.;
/* macro that includes the program we just generated */
%macro runSteps;
/* the macro when ready */

Here are the highlights from the PROC SQL portion of the program:

  • SELECT DISTINCT ensures that the results include just one record for each unique value of the variable.
  • The CAT function concatenates a set of string values together. Note that CATX and CATS and CATT -- other variations of this function -- will trim out white space from the various string elements. In this case I want to keep any blank characters that occur in the data values because we're using those values in an equality check.
  • The program calculates a name for each output data set by using each data value as a suffix ("OUT_dataValue"). SAS data set names can contain only numbers and letters, so I use the COMPRESS function to purge any invalid characters from the data set name. The 'kad' options on COMPRESS tell it to keep only alpha and digit characters.
  • The resulting program statements all end up in the &ALLSTEPS macro variable. I could just reference the &ALLSTEPS variable in the body of the SAS program, and SAS would run it as-is. Instead I chose to wrap it in the macro %runSteps. This makes it a little bit easier to control the scope and placement of the executable SAS program statements.

"By each value of a variable" is just one criterion that you might use for splitting a data set. I've seen cases where people want to split the data based on other rules, such as:

  • Quantity of observations (split a 3-million-record table into 3 1-million-record tables)
  • Rank or percentiles (based on some measure, put the top 20% in its own data set)
  • Time span (break up a data set by year or month, assuming the data records contain a date or datetime variable)

With a small modification, my example program can be adapted to serve any of these purposes. What about you? Are you ever asked to split up SAS data sets, and if so, based on what criteria? Leave a comment and tell us about it.

Post a Comment