Using SAS and ODS PACKAGE to create ZIP files

SAS users are big data consumers and big data creators. Often, we have to deal in large data files (or many smaller files) -- and that means ZIP compression. ZIP compression tools such as gzip, 7-Zip, and WinZip are ubiquitous, but they aren't always convenient to use from within a SAS program. To use an external ZIP utility you must issue a shell command via the X command or SYSTASK function, and that's not always possible within today's complex SAS environments.

Fortunately, SAS can read and write ZIP files directly. Ever since SAS 9.2, we've been able to create ZIP files with ODS PACKAGE. Beginning with SAS 9.4, we can read ZIP content by using FILENAME ZIP.

In this post, I'll review how to create ZIP files using ODS PACKAGE. I'll cover reading ZIP files with FILENAME ZIP in a future post.

Let's pretend that I'm working for a government agency, and that part of my job is to crunch some government data and publish it for the public. Of course, I'm using SAS for the analysis, but I need to publish the data in a non-proprietary format such as CSV. (It seems unbelievable, I know, but not every citizen is lucky enough to have access to SAS.)

First, I'll set up the output directory for this project. Since the ZIP file will contain a couple of files, including a subfolder, I want to mirror that structure here. The FEXIST and FDELETE functions will delete an existing ZIP file (perhaps left over from the last time I ran the process). The DLCREATEDIR option will create a "data" subfolder as needed. All of these mechanisms interact with the file system, but do not require XCMD privileges. This means that they'll work in SAS Enterprise Guide and stored processes.

%let projectDir = c:\projects\sgf2013\filenamezip;
/* Clean slate! */
filename newfile "&projectDir./";
data _null_;
  if (fexist('newfile')) then 
  	rc = fdelete('newfile');
filename newfile clear;
/* Create folder if it doesn't exist */
options dlcreatedir;
libname out "&projectDir./data";

Next, I need to create the content to include in the ZIP file. In this scenario, I'm crunching some heavy-duty numbers about Cars data, and then putting the results into a CSV file. Then I'm creating a README file in RTF format; the document contains a simple data dictionary plus instructions (such as they are) for using the data. I used ODS TEXT to throw in some ad-hoc text among the SAS output.

/* Create some data */
filename newcsv "&projectDir./data/pct.csv";
proc means noprint;
var msrp;
output out=out.pct median=p50 p95=p95 p99=p99;
ods csv file=newcsv;
proc print data=out.pct;
format _all_; /* clear the formats */
ods csv close;
/* Create an informative document about this package */
filename rm "&projectDir./readme.rtf";
ods rtf(readme) 
  file="&projectDir./readme.rtf" style=Printer;
ods rtf(readme) 
  text="These are some instructions for what to do next";
proc datasets lib=out nolist;
contents data=pct;
ods rtf(readme) close;

Finally, I'm going to take those results and package them in a ZIP file. The ODS PACKAGE mechanism was originally designed to share results from a SAS stored process. By default, it adds a PackageMetaData entry that a consuming SAS application could use to interpret the result. In this case we don't need this entry; the NOPF option suppresses it.

Notice that I specify the PATH= option to place the CSV file in the "data" folder within the archive. As soon as the ODS PACKAGE CLOSE statement executes, the ZIP file is created.

/* Creating a ZIP file with ODS PACKAGE */
ods package(newzip) open nopf;
ods package(newzip) add file=newcsv path="data/";
ods package(newzip) add file=rm;
ods package(newzip) publish archive 
ods package(newzip) close;

Here's a screen shot of the ZIP file opened in WinZip:

That's it! I can add any file that I want to the ZIP archive; I'm not restricted to files that were created by SAS. This makes it easy to use SAS as an automated method to update data archives regularly, creating user-friendly packages for consumers to make use of our data.

Note: A common question: does ODS PACKAGE (and FILENAME ZIP) support password-protected ZIP files (encryption)? The answer is No. If that's a requirement, you'll need to use an external package such as 7-Zip.

Download the complete program (SAS 9.3 or later):

You might also enjoy:

Post a Comment

Customer segmentation using RFM in SAS Enterprise Guide

Recency, Frequency, and Monetary Analysis (or RFM) is a popular customer segmentation technique employed by database marketers everywhere. Marketers use RFM to identify which customers are most likely to respond to a direct marketing campaign. The model takes into account three simple metrics:

  • How recently did the customer buy from you?
  • How frequently does the customer buy something from you?
  • How much money does the customer spend on your products?

Each metric receives a value of 1 through 5. The result is 125 "bins" of customers (because 125 is 53). Those with higher RFM scores are considered more likely to respond to a campaign...potentially.

For years, SAS customers have used special tools like SAS Enterprise Miner to compute RFM. With SAS 9.4, the RFM algorithms are built into Base SAS, and there's an easy-to-use task in SAS Enterprise Guide. (The task is also available in the SAS Add-In for Microsoft Office.) You can find the task in the menus at Tasks->Data Mining->Recency, Frequency, and Monetary Analysis.

Note: To use the task, you must have SAS 9.4 and SAS Enterprise Guide 6.1 (or SAS Add-In for Microsoft Office 6.1). Despite the "Data Mining" category, this task does not require SAS Enterprise Miner.

As an example, suppose you have transaction data that looks like the following. You need only the 3 fields -- a customer ID, a transaction date, and a transaction amount (value):

From this, RFM calculates "scores" for each customer. The customers with the highest scores will probably be those that spent the most with you, across the most recent and frequent dates. The idea behind RFM is that a minority of customers are responsible for a majority of your business. RFM scores provide visibility into who those valuable customers are. Here's an example of the scored data, summarized at the customer level:

The RFM task supplies several useful charts. Here's a "monetization map", which summarizes the monetary values for each combination of frequency and recency scores. You might use this to help identify a "sweet spot" of customers that you want to target.

Next, let's look at a paneled bar chart of the Frequency by Recency segments. The bar on the lower right corner indicates that there are a handful of customers who made several purchases in the past (high frequency), but that was a long time ago (not recent). Perhaps that's a good target segment for a "Come back and see us -- we miss you" campaign. Contrast this with the bar on the upper right, which shows the 60 superfans: the customers who bought lately and often. You can decide whether to "go back to the well" with this group in the next campaign, or save the campaign expense as they might buy from you anyway, without prompting.

RFM scores are just one small part of planning a campaign. The "Recency, Frequency, and Monetary Analysis" task is a good start, but eventually you might want to factor in other criteria.

After all, direct marketing has many nuances, such as cross-referencing with opt-out lists and taking steps to avoid "overmarketing" to any one segment. Tracking response rates, testing campaigns, and the actual campaign workflow are also essential elements. When you're ready, SAS Customer Intelligence offers an integrated set of applications for all of these aspects.

Post a Comment

A fresh helping of hash: the SHA256 function in SAS 9.4m1

For several releases, SAS has supported a cryptographic hash function called MD5, or "message digest". In SAS 9.4 Maintenance 1, the new SHA256 function can serve the same purpose with a better implementation.

The job of a hash function is to take some input (of any type and of any size) and distill it to a fixed-length series of bytes that we believe should be unique to that input. As a practical example, systems use this to check the integrity of file downloads. You can verify that the published hash matches the actual hash after downloading.

Sometimes a hash is used to track changes in records within a database. You first calculate a hash value for each data record based on all of the fields. Periodically, you recheck those calculations. If a hash value changes for a data record, you know that some part of that record has changed since the last time you looked.

Here's another common use: storing passwords in a database. Because you can't (theoretically) reverse the hash process, you can use a hash function to verify that a supplied password is the same as a value you've stored, without having to store the original clear-text version of the password. It's not the same as encryption, because there is no decryption method that would compromise the original supplied password value.

MD5 has known vulnerabilities, especially with regard to uniqueness. A malicious person can use a relatively low-powered computer to compute an input that produces an identical hash to one you've stored, thus compromising the algorithm's usefulness.

Enter the SHA256 algorithm. It's the same idea as MD5, but without the known vulnerabilities. Here's a program example:

data _null_;
  format hash $hex64.;
  hash = sha256("SHA256 is part of SAS 9.4m1!");
  put hash;

Output (formatted as hexadecimal so as to be easier on the eyes than 256 ones-and-zeros):


As the name implies, it produces a value that is 256 bits (32 bytes) in size, as compared to 128 bits from MD5. Here's a useful article that compares the effectiveness of hash algorithms.

The SHA256 function was added to SAS 9.4 Maintenance 1. If you've been wanting to hash your data in SAS, but you've been poo-pooing the MD5 function -- well, now is your chance!

Post a Comment

Adding Google Web Fonts to your SAS reports

Because I began my SAS career in the Publications division, I like to think that I have a keen eye when it comes to SAS documentation. When I first visited the SAS 9.4 online documentation, I immediately noticed that it had a different look. Examine the image below; can you see what I mean? (Click on the image to see a larger version.)

Aside from a few layout enhancements and more helpful details, I noticed the use of a different typeface for the text. It's a font that I didn't recognize, and I wondered how it had been installed on my system.

It turns out that this font face (named "Lato") is not installed on my system; it's a web font. A web font is downloaded and rendered "on the fly" by your web browser. This gives the web designer more control over the exact appearance of the text, even across different operating systems and devices that share only a small subset of generic font styles.

There are different sources for web fonts. Some are free to use, while others require a nominal licensing fee. If you have a particular typeface that is important to your company brand, it might be worth a licensing fee to ensure that this typeface is used consistently in all of your web content. However, the SAS documentation (and many other sites) use the free Google Fonts.

After I learned all of this I wondered: how can I use web fonts in SAS ODS output?

Specifying a web font in ODS HTML

There are three ways to "import" a web font in your HTML content:

  1. using a <link> tag to reference a directive from an external style
  2. using an @import directive from within a CSS file or <style> tag
  3. using JavaScript to dynamically insert a web font style reference into the page.

The Google Font web site provides code snippets for each of these in HTML.

Once you import the font, you must then reference the font name in CSS "font-family" style attributes for the different element classes that you want to affect. For example, if you want ODS tabular data to use the Lato font, you must change the "data" class to include it:

.data {
  font-family: 'Lato', sans-serif;

I decided that the <link>-tag approach was the simplest method to import the web font. I copied the <link>-tag directives from the Google Fonts entry for Lato and Droid Sans, and then "injected" them into the ODS HTML output by using the HEADTEXT option. Then I used PROC TEMPLATE to modify the style attributes for specific ODS-related style classes; these attributes will translate into CSS when SAS creates your HTML. Here's the program:

/* These snippets copied from facility    */
/* Macro for HEADTEXT option, since the value cannot exceed 256 chars */
%macro ods_html_webfont;
<link href=',700' 
  rel='stylesheet' type='text/css'>
<link href=',700' 
  rel='stylesheet' type='text/css'>
proc template;
  define style webfont;
    /* for this example, inherit existing style elements from HTMLBLUE */
  style header from header /
    fontfamily="'Droid Sans', sans-serif";
  style data, body from _self_ /
    fontfamily="'Lato', sans-serif";
  class titlesandfooters /
    fontfamily="'Lato', sans-serif";
ods html (id=wf)
   file="c:\temp\wf_test.html"(title="Web Fonts Test") 
title "A new look for my report";
proc means;
ods html (id=wf) close;

Here's the result as seen in my Chrome browser:

(Want to compare this to a version that doesn't use web fonts?)

Limitations of web fonts

Before you consider using web fonts in all of your SAS-generated content, there are a few restrictions that you should review:

  • Web fonts can be used only in HTML output -- output that you intend to display in a web browser. The browser will download and render the font based on CSS or JavaScript directives. This means that this technique won't work for RTF or PDF output.
  • Web fonts can apply to textual content only, and not to images that are generated by SAS graphical procedures. SAS graphical output is usually rendered into an image file (such as a PNG file) within your SAS session. The appearance is controlled by SAS styles that are defined in your SAS session, and any referenced fonts must be accessible to SAS.
  • Because web fonts must be downloaded by the browser as the HTML page loads, this can have an impact on how quickly the page is rendered. Each Google web font provides some guidance about this potential impact. For the best response, include references to the minimum number of typefaces that you need for the content.
  • And of course, for a web font to download you must be connected to the Internet. It's a good idea to always specify a fallback font family (ex: sans-serif) in your styles so that even if your web font can't load, your style still provides some cue for how to render the text.
Post a Comment

SAS Tech Talk: Developing the SAS Visual Analytics Explorer

If there's anyone who represents the global nature of SAS software, it's Falko Schulz. He's a German who lives in Brisbane, Australia while he works for SAS R&D based in Cary, NC.

Falko works on the team that produces SAS Visual Analytics, specifically the "explorer" portion of the tool. He brings a ton of SAS knowledge and real-world experience to his role. In fact, you can see Falko demonstrate his SAS-coding chops in his recent posts on SAS blogs:

While he's creative and brilliant with all-things-SAS, I find Falko to be humble and personable. That's why I was excited to host him on a SAS Tech Talk session at SAS Global Forum last year. During our interview, Falko discusses how he manages working with a team while located half-a-world away. He also shows off several of the new features in SAS Visual Analytics Explorer, including some of the new visualizations and the built-in forecasting methods.

Watch the video to learn more about SAS Visual Analytics Explorer and the people who build it.

Post a Comment

SAS custom tasks: using Microsoft Visual Studio 2013 and other new toys

Writing a book about SAS (or any computer software) is both gratifying and frustrating. It's gratifying because there are so many SAS users in the world, and it feels good when you can share your expertise with them. It's also gratifying to have a volume that you can point to and say "I wrote that!"

But it's frustrating too, because computer software changes rapidly, and your book can quickly become out-of-date. Some people might call that "job security". Unfortunately, constantly revising and republishing my books isn't part of my "day job", and while rewarding in many ways, it doesn't pay the bills.

When we published Custom Tasks for SAS Enterprise Guide using Microsoft .NET, the current versions of software were:

  • SAS Enterprise Guide 5.1
  • SAS 9.3
  • Microsoft Visual Studio 2012 (very recently released at the time!)

In just 11 months since then, we are now working with SAS 9.4, SAS Enterprise Guide 6.1, and Microsoft Visual Studio 2013. I don't address these versions in my book. So readers wonder: can you use the book with the later editions of software?

The answer is Yes. The material in the book still applies, one-hundred-percent. You need to adjust the book instructions only in the most obvious ways. That is, where there is an instruction that includes a specific version number (a folder name, for example), you need only to substitute the more recent version number.

The Microsoft Visual Studio templates, which I provide as a simple way to bootstrap your custom task project, still work as they always have. I originally designed them to work with Microsoft Visual Studio 2008 and SAS Enterprise Guide 4.2. Since most users are well beyond those versions now, I've created an updated set of templates that work better with Microsoft Visual Studio 2010 and later, and SAS Enterprise Guide 4.3 and later. You can find the updated Visual Studio templates on the SAS support page for custom tasks. The download package includes templates for C# and Visual Basic, plus illustrated instructions for installation and use. The templates work with the 2010, 2012, and 2013 editions of Microsoft Visual Studio, including the free Express editions.

Custom task project in VS2013

Note: while you can use the free Express editions of Microsoft's development tools, I always recommend the professional editions for serious long-term work. The professional editions include better support for debugging task code, a feature that I find to be essential for production work.

Post a Comment

11 super-useful custom tasks for SAS Enterprise Guide

The holiday season is here, and you're probably wondering how to shop for that picky SAS Enterprise Guide user on your gift list. I've got a few ideas for you, and the price is right! Here are links to 11 custom tasks that are free to download, easy to install, and so useful that even the most finicky SAS user will soon find them indispensable.

For programmers - 3 tasks

SAS Enterprise Guide has a great program editor, but its missing some of the utility windows that SAS programmers are accustomed to. These tasks provide those windows, and more, in a way that fits in to your flow.

SAS Catalog Explorer
Need a quick way to check the contents of your SAS catalogs? This task allows you to see your catalog contents, including your compiled macros, SAS formats, SOURCE entries, and more.
Download: See this blog post for more information and download link.

SAS System Options Viewer
Just like the OPTIONS window in traditional SAS, but better! This window shows all of the SAS system options and their current values, plus information about how the values were set. It also provides a filter field for easy searching.
Download: See this blog post for more information and download link.

Macro Variable Viewer
This one is unique to SAS Enterprise Guide. See the values for all automatic and global SAS macro variables in your session. It provides a simple "immediate evaluation" window that allows you to test any valid macro expression. Like the options window, there's also an easy way to filter the view to just the variables you want.
Download: See this blog post for more information and download link.

For project people - 2 tasks

Feeling introspective? These tasks allow you see what's inside your SAS Enterprise Guide project.

Project Reviewer
This task provides a listing of each task in your project and process flows. You can see how long each task took to run, as well as the time for each process flow. You can even use this task to build a "project report" right into your project. (That's so meta!)
Download: See this blog post for more information and download link.

Task Property Viewer
This task shows you the "guts" of the other tasks within your project. Each task has saved settings that control how it will behave when you run it, and the Task Property Viewer provides you a glimpse into those settings. This task is a must-have for any developer who is building a custom task, since it shows you exactly what you're saving into the project. It's a powerful debugging tool.
Download: This task is supplied as support material for Chapter 10 of my book, Custom Tasks for SAS Enterprise Guide using Microsoft .NET. You can download it as a sample project from the book's examples page.

For command and control - 3 tasks

Sometimes you need SAS Enterprise Guide to do work that isn't captured in a SAS program. These tasks allow you to capture special custom activities into your process flow.

Copy Files to/from SAS Session
Need to move files from here to there? If you need to copy files (FTP style) among your PC and the SAS server, this task captures that step. Use wildcards and SAS macro variables to specify which files to move to where. (This is one of the most popular custom tasks.)
Download: See this blog post for more information and download link.

Message Prompt
If you need to pop a message in the middle of your process flow, this is the task for you. It offers a few options to capture a simple response from the user, so you can use this to ask a Yes/No question, solicit a simple text value or ask for a password.
Download: See this blog post for more information and download link.

System Command
If you need the ability to run a local Windows command during your process flow, this task does the trick. Supply the commands the way you might craft a Windows BAT file. (Remember how to do that?)
Download: See this blog post for more information and download link.

For working with data - 3 tasks

These tasks provide handy data-centric utilities, helping you to do more with your SAS data.

DATA Step from Data Set
This task reverse-engineers a data set into a SAS program -- using the SAS DATA step. It creates a data set header with all of your column metadata, and then includes all of your data records in DATALINES.
Download: See this blog post for more information and download link.

Compress SAS Data
Are your data sets hogging up too much disk space? This Compress task applies two techniques to make your data smaller. First, it analyzes your text fields and optimizes the allocated field length to just what you need. Next, you can optionally use the COMPRESS feature of the SAS data set to let SAS compress your data set file.
Download: See this blog post for more information and download link.

Calculate Data Cardinalities
This task shows you how many levels (distinct values) each of your data columns contains. This is an important step towards understanding your data so that you know what types of analyses are appropriate for each of your data columns.
Download: See this blog post for more information and download link.

There -- that list goes to 11. That should be enough, but I can't resist adding just two more tasks as a bonus feature!

Bonus tasks - 2 more for fun

All of the tasks mentioned so far are useful and fun in their own way, if you get a kick out of that sort of thing. Here are a couple of bonus tasks designed especially for fun.

Solve your Sudoku
This was one of my first example custom tasks, built back in 2007. Still works great!
Download: See the examples page for my book -- look in the BONUS area..

Run PROCs on your Facebook Friends
I've had a lot of fun with this one -- and you can too. Get new insight on your Facebook friend connections!
Download: See this blog post for more information and download link.

Post a Comment

Support for SAS on Windows 8 and Windows 8.1

Microsoft Windows 8 has been with us for a year, and its first major update -- Windows 8.1 -- has just arrived. So how does SAS support these Windows 8 platforms?

The answer can be found on in SAS Note 46876. I'll summarize it here: SAS 9.3 and SAS 9.4 are supported on Windows 8 and 8.1, as well as on Windows Server 2012. SAS 9.4 has been supported since its initial release in July 2013. SAS 9.3 requires a baseline version of 9.3 Maint 2, revision 13w18 (April 2013) or later. (That's primarily due to the additional support for Java 7, which is required for SAS use on Windows 8.)

SAS Enterprise Guide 5.1 and 6.1 are also supported on Windows 8/8.1. The same holds true for the SAS Add-In for Microsoft Office 5.1 and 6.1. Windows 8.1 adds Internet Explorer 11 to the mix, which should perform fine for your web-based SAS applications. The browser support matrix is tricky to keep up with since browsers rev often, and it usually happens "out-of-band" with operating system updates. If your browser flavor/version isn't listed as supported that doesn't mean it won't work, but it does mean that SAS R&D hasn't "certified" it as compatible with SAS web applications. It's best to check third-party product details on for updates and limitations.

Initially, SAS 9.4 and SAS Enterprise Guide 6.1 were available only in 64-bit versions. That's a limitation for some customers who aren't yet ready for "64-bit Everything", so SAS will release 32-bit versions of these products near the end of this year. Here's SAS Note 51786 with the details.

To explore further combinations of operating systems and SAS products, visit the interactive Supported Operating Systems page.

Related articles

SAS Enterprise Guide versions and SAS 9.4: what's supported
How old is your version of SAS?

Post a Comment

5 keyboard shortcuts in SAS that will change your life

Okay, given the title of this article, I might be overselling the content a bit. Read on to see if your life will be transformed.

I've just returned from SESUG, and this question came up during one of the presentations. While this tip might seem basic, it was news to many of the people in the room, so I think it's worth sharing here.

The "old school" style of SAS programming uses upper case for all SAS keywords. But, unlike many newer programming languages, the SAS language is not case-sensitive. These three programs are exactly the same in the eyes of SAS:

/* This is your grandpa's SAS program */
/* Before the shift key was invented  */
/* lower case, unassuming */
proc means data=sashelp.class
  mean mode;
/* Headline-style initial caps */
Proc Means Data=Sashelp.Class
  Mean Mode;

Even though the uppercase style is old-fashioned, it's the style used by SAS Enterprise Guide tasks. When users copy and paste these generated programs into their own code projects, they often want to convert the keywords to lowercase. Do you need to retype the program? Of course not. Any program editor worth its salt offers basic reformatting features, and the SAS program editor is no exception.

Here are 5 examples of quick keystrokes that can save you time.

Convert case
Convert selected text to upper case: Ctrl + Shift + U
Convert selected text to lower case: Ctrl + Shift + L

Quick commenting
Wrap selection (or current line) in a comment: Ctrl + /
Unwrap selection (or current line) from a comment: Ctrl + Shift + /

Hyperjump to a line
Go to line: Ctrl + G (prompts for a line number)
Be sure to turn on Show line numbers in Program->Editor Options, or you'll be navigating blind!

Jump to the "end of scope"
Move caret to matching parenthesis/brace: Ctrl + [, Ctrl + ]
Move caret to matching DO/END keyword: Alt + [, Alt + ]

There are dozens of other shortcut keys and commands available in the program editor. In SAS Enterprise Guide, you can view them all by selecting Program->Enhanced Editor Keys. From this window you can customize the keypress behaviors to your liking, and you can also combine commands and assign them to new keys. If you're a "keyboard person", then you'll want to visit this window and explore. Be sure to click "Include commands without key assignments" to see the full list of available commands.

There are many useful commands that don't have a default key mapping. For example, one very handy command is "Sort selected lines", which will resequence the selected lines into ascending alphabetical order. If that's an operation you need to do frequently, click the Assign Keys button to pick a key sequence as a shortcut.

You can also combine commands with editor macros. For example, would you like a keystroke that can add a date/time stamp as a program comment? Simple! Select Program->Editor Macros->Macros. Then click Create... to build a new sequence. Add the commands you want to automate, and then finish by assigning the new macro to a key sequence. To insert a commented date/time stamp, these are the commands to add:

  • Insert current date and time
  • Comment the selected lines with line comment

Here's what the window looks like:

I assigned Alt+Shift+D to this command, and when I press it just now I get:

While my instructions here apply to SAS Enterprise Guide, this is all available in SAS for Windows when you use the Enhanced Editor. There are several conference papers on the topic: here's one from Art Carpenter.

See also

Take SAS program editor abbreviations to the next level
Hope for ugly programs: the SAS Enterprise Guide code formatter

Post a Comment