Query the Windows registry within your SAS program

On the SAS-L mailing list, a participant posed this question (paraphrased):

How can I tell which date format my Windows session is using: European format (with day first) versus USA format (with month first)? I'm reading in output from a Windows file listing, and need to know how to interpret the date value.

There may be more than one way to skin this cat, but one of the responses pointed at a Windows registry key as one cue to discover the "short date format", which can differ depending upon your regional settings. Following that path, the question then becomes: how can I query the Windows registry key from within my SAS program?

Answer: There is a REG QUERY command that is built into Windows. If you can run that command from within your program, then you can parse the result and get your answer.

Here's an example to discover the "short date" format on your Windows system:

/* Requires XCMD privileges */
filename reg pipe
  'reg query "HKEY_CURRENT_USER\Control Panel\International" /v sShortDate';
 
data _null_;
  infile reg dsd;
  input;
  if (find(_infile_,'sShortDate')>0) then
    do;
      result = scan(_infile_,-1,' ');
      call symput('SHORTDATE',result);
    end;
run;
 
%put Short date format is &shortdate.;

Result (with my United States regional settings):

Short date format is M/d/yyyy

The discovery of the REG QUERY command got some of the SAS-L participants kind of excited (you know, it doesn't take much...), and they began to wonder what other goodies could be gleaned from the Windows registry. One key of interest is the infamous TypeGuessRows key, which affects the behavior of PROC IMPORT with Microsoft Excel as it determines column types.

There are some tricky parts to deciphering this key. First, the relevant key was in one location in SAS 9.1 and earlier (which uses Microsoft Jet database engine), and in a different location in SAS 9.2m2 and later (using Microsoft ACE).

The second complication is that on 64-bit systems, the Windows registry is segmented into a 32-bit registry and a 64-bit registry. (That's yet another "gotcha" of 64-bit applications on Windows.) You must query the correct key for the process that will read the Excel file. If you're using 64-bit SAS to read Excel files using DBMS=EXCEL or the EXCEL library engine, query the 64-bit key. If you're using 64-bit SAS to read Excel files using DBMS=EXCELCS or the PCFILES library engine (thus using the 32-bit PC Files Server), then query the 32-bit key.

From a 64-bit process, you can query the 32-bit registry. The 32-bit key "address" will have a Wow6432Node level in the middle of it.

Here's an example of querying the 32-bit key from a 32-bit SAS session (good for SAS 9.1 and earlier):

%let key = HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel;
/* Requires XCMD privileges */
filename reg pipe
  "reg query ""&key"" /v TypeGuessRows";
 
data _null_;
  infile reg dsd;
  length result 8;
  format result 4.;
  input;
  if (find(_infile_,'TypeGuessRows')>0) then
    do;
      /* Value stored as hexadecimal */
      result = input(trim(scan(_infile_,-1,'x')),hex2.);
      call symput('TypeGuessRows_JET',result);
    end;
run;
 
%put Excel (Jet) TypeGuessRows is %trim(&TypeGuessRows_JET.);

Typical result:

Excel (Jet) TypeGuessRows is 8

For those running the 64-bit version of SAS 9.2M2 or later (including SAS 9.3), this example is probably more relevant:

%let key = HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel;
/* Requires XCMD privileges */
filename reg pipe
  "reg query ""&key"" /v TypeGuessRows";
 
data _null_;
  infile reg dsd;
  length result 8;
  format result 4.;
  input;
  if (find(_infile_,'TypeGuessRows')>0) then
    do;
      result = input(trim(scan(_infile_,-1,'x')),hex2.);
      call symput('TypeGuessRows_ACE',result);
    end;
run;
 
%put Excel (ACE) TypeGuessRows is %trim(&TypeGuessRows_ACE.);

And here's the typical result:

Excel (ACE) TypeGuessRows is 8

The key value in the code is very long and might be difficult to see in this code sample, so I'll repeat it here in bold: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel.

Querying the registry is fun, interesting (if you like that sort of thing), and mostly harmless. Modifying the registry is a little bit risky. There is a REG ADD command that allows you to modify existing registry entries if you have the privileges to do so. That's I'll say about that -- if you're brave and ingenious, you can take it from here.

Post a Comment

Using PROC SQL to get the schema of a MySQL database

It's Friday, and on SAS Voices they are posting fun stuff about dogs who work at SAS. I'm posting about PROC SQL and MySQL. You tell me - which of us knows how to ring in the weekend with style?

I've been working with MySQL data sources lately, and SAS/ACCESS to MySQL makes it easy to do my work in SAS. I can assign a SAS library and then work with the tables and columns almost exactly as if they lived in native SAS data sets. The database columns are automatically visible to my SAS program with SAS-style attributes, such as SAS formats and lengths.

But when I get started with a new database that I'm not familiar with, I'd like to know how MySQL sees that data, including the database-centric attributes. That information is available in the information_schema structure within the MySQL instance. (Note: other databases support the information_schema, so this program can go beyond just MySQL.)

/* must be run where SAS/ACCESS to MySQL available */
%let database=dbname; /* name of the MySQL database */
%let server = mysql.host.domain.com; /* your MySQL node */
 
proc sql;
  connect to mysql as source 
    (
    DATABASE=&database 
    SERVER=&server 
    PORT=3306 /* default port, change if needed */
    USER=your_user_id
    PASSWORD=secretPW
    );
  create table work.schematables as 
    select * from connection to source 
    (select * from 
      information_schema.tables 
    where table_schema="&database");
 
  create table work.schemacolumns as 
    select * from connection to source 
    (select * from 
      information_schema.columns 
    where table_schema="&database");
  disconnect from source;
 
quit;

After plugging my database information and credentials into the above program, and then running it, I receive two output tables. One table (WORK.SCHEMATABLES) shows me all of the database table attributes, such as the table row count, create/update times, index length, and more. I find the other table (WORK.SCHEMACOLUMNS) to be more interesting. It shows me the column names, intrinsic type (bigint, varchar, datetime, and so on), length, whether it's NULLible, and more.

Oh, and if you're a bit queasy about putting your database credentials into a SAS program like this, check out these five strategies to eliminate passwords from your SAS programs.

Post a Comment

How much time will your process flow take to run?

SAS users, by definition, do not embrace the mysterious.

That's one of the main reasons that they use SAS: to demystify some data or process. And so, when you (as a SAS user) have gone to the trouble of designing a process flow in SAS Enterprise Guide, you like to be aware of some basic metrics, such as "how long will it take to run?"

It's difficult to predict how long a SAS process will take to run, as it depends more upon the data than on the actual program instructions. But one thing that we're very good at is telling you how long it took to run the last time that you ran it. In SAS Enterprise Guide, you can find this information at the task level by right-clicking on the task (or program node) and selecting Properties. On the General tab, you'll see the "Last execution time".

How long did the task take?

If you want to gather this information at a process flow or project level, you can repeat these steps for each item in the flow, make a note of the "execution time", then add up the numbers (expressed in hours, minutes, and seconds) to create a grand total. This tedious assignment makes for a perfect torture device for a summer intern who, in this economy, should be grateful to have a job at all.

Or, to make the job less tedious, you could use the Project Reviewer task. This is a custom task (available for download here) that shows a summary view of your process flows and allows you to create a report from the information. The task works with SAS Enterprise Guide 4.3 and 5.1.

Project Reviewer screenshot

Features of the Project Reviewer task include:
1. A selection list with all of the process flows within your project.

2. A list of each "runnable" task; that is, program, task, query, export step, etc. Each task has an "ordinal" (its sequence in the process flow), a name, a descriptive type, the user ID of the last person to modify it, the running time for its most recent iteration, the date/time modified, the date/time created, and whether the task generated errors ("red X" in the flow).

You can sort the items in the list by clicking on the column header for the value you want to sort. Click on the column header again to reverse the sort sequence.

3. A summary of the task count, and total clock time that "running" the flow represents.

4. A Create Report button, which generates a SAS program to produce a simple report of all of the project contents, summarized by each process flow. If you have multiple SAS environments, you can use the Report server list to select which SAS server to use when processing the report.

Here's a sample of the report output:
Sample project reviewer report
The reporting process also generates a SAS program and a data set (which are added to your project), so that you can easily adapt these for custom reports.

Let me know if you find this task to be useful and whether you have any improvements to suggest.

Some final notes/links:

Post a Comment

Using Windows PowerShell to view your SAS data dictionary

In a previous post I showed how you can use Windows PowerShell (with the SAS Local Data Provider) to create a SAS data set viewer.  This approach doesn't require that you have SAS installed, and allows you to read or export the records within a SAS data set file.

In this post, I'll present two companion scripts that allow you to:

  • Gather the table-level metadata about a SAS data set, including the observation count, created and last-modified dates, data set label, and more.
  • Gather the column-level metadata within each SAS data set, including variable names, types, formats, lengths, and more.

If you make use of the SAS DICTIONARY tables (as seen in SASHELP.VMEMBER and SASHELP.VCOLUMN), these scripts will provide familiar information.  But like my previous example, these scripts do not require a SAS installation.

Why is this useful? Of course, the best way to read SAS data sets is to use SAS.  And if you have SAS data sets, the probability is high that you have SAS installed somewhere, so why not use it?  It turns out that even among companies that use SAS, not every employee has access to a SAS environment.  (Tragic, right?)  And since SAS data sets are often treated as a corporate asset (or, at least, the information within the data sets is), these are subject to cataloging and auditing by staff who don't use SAS.  These scripts can enable a light-weight auditing process with a minimum of installation/licensing complications.

Here are links to all three scripts.  To use them, save each file to your local PC as a .PS1 file.  You will also need to make sure that you can run Windows PowerShell scripts, and that you have the SAS OLE DB Local Data Provider installed (free to download).

The output of each of these scripts is in the form of PowerShell objects, which are most useful when piped into another PowerShell cmdlet such as Out-GridView (for visual display) or Export-CSV (for use as input to Excel or another data-driven process).

Examples (as run from a PowerShell console window)

To view the table information about all SAS data sets in a file path w:/ (including subfolders):

.\ReadSasDataTables.ps1 w:\ | Out-GridView

Result (click on the image to see the full-size version):

To export the SAS table information to a CSV file:

.\ReadSasDataTables.ps1 w:\ | Export-CSV -NoType -Path c:\report\tables.csv

Result:
"FileName","Path","FileTime","FileSize","TableName","Label","Created","Modified","LogicalRecords","PhysicalRecords","RecordLength","Compressed","Indexed","Type","Encoding","WindowsCodepage"
"users.sas7bdat","w:\","5/8/2012 9:37:03 AM","466944","users","","5/8/2012 9:37:03 AM","5/8/2012 9:37:03 AM","412","412","952","NO","False","","20","65001"
"bloglist.sas7bdat","w:\","5/8/2012 9:37:03 AM","73728","bloglist","","5/8/2012 9:37:03 AM","5/8/2012 9:37:03 AM","28","28","1360","NO","False","","20","65001"
"posts.sas7bdat","w:\","5/8/2012 9:37:09 AM","103555072","posts","","5/8/2012 9:37:07 AM","5/8/2012 9:37:07 AM","41077","41077","2496","NO","False","","20","65001"
"postviews.sas7bdat","w:\","5/8/2012 9:37:09 AM","5120000","postviews","","5/8/2012 9:37:09 AM","5/8/2012 9:37:09 AM","4808","4808","1040","NO","False","","20","65001"
"comments.sas7bdat","w:\","5/8/2012 9:37:12 AM","26943488","comments","","5/8/2012 9:37:11 AM","5/8/2012 9:37:11 AM","7807","7807","3432","NO","False","","20","65001"
"published_posts.sas7bdat","w:\","5/8/2012 9:37:13 AM","11739136","published_posts","","5/8/2012 9:37:13 AM","5/8/2012 9:37:13 AM","4628","4628","2512","NO","False","","20","65001"
"blogsocial.sas7bdat","w:\","5/6/2012 1:19:06 PM","401408","blogsocial","","5/6/2012 1:19:06 PM","5/6/2012 1:19:06 PM","682","682","544","NO","False","","20","65001"

To view the columns (variables) information for all SAS data sets within a folder w:/ (including subfolders):
.\ReadSasDataColumns.ps1 w:\ | Out-GridView

Result (click on the image to see the full-size version):

To export the SAS columns information to a CSV file:

.\ReadSasDataColumns.ps1 w:\ | Export-CSV -NoType -Path c:\report\columns.csv

Result (partial):
"File name","Column","Label","Pos","Type","Length","Format","Informat","Indexed","Path","File time","File size"
"users.sas7bdat","ID","ID","1","NUM","0","","","False","w:\","5/8/2012 9:37:03 AM","466944"
"users.sas7bdat","user_login","user_login","2","130","90","$180.","$180.","False","w:\","5/8/2012 9:37:03 AM","466944"
"users.sas7bdat","user_registered","user_registered","3","NUM","0","DATETIME19.","DATETIME19.","False","w:\","5/8/2012 9:37:03 AM","466944"
"users.sas7bdat","display_name","display_name","4","130","375","$750.","$750.","False","w:\","5/8/2012 9:37:03 AM","466944"
"bloglist.sas7bdat","blog_id","blog_id","1","NUM","0","","","False","w:\","5/8/2012 9:37:03 AM","73728"
"bloglist.sas7bdat","name","option_value","2","130","512","$1024.","$1024.","False","w:\","5/8/2012 9:37:03 AM","73728"
"bloglist.sas7bdat","path","path","3","130","150","$300.","$300.","False","w:\","5/8/2012 9:37:03 AM","73728"
"bloglist.sas7bdat","registered","registered","4","NUM","0","DATETIME19.","DATETIME19.","False","w:\","5/8/2012 9:37:03 AM","73728"
"bloglist.sas7bdat","last_updated","last_updated","5","NUM","0","DATETIME19.","DATETIME19.","False","w:\","5/8/2012 9:37:03 AM","73728"
"bloglist.sas7bdat","public","public","6","NUM","0","","","False","w:\","5/8/2012 9:37:03 AM","73728"

 

Post a Comment

The top gotchas when moving to 64-bit SAS for Windows

Many SAS customers are quickly adopting 64-bit versions of Microsoft Windows, and they are pleased-as-punch when they find a 64-bit version of SAS to run on it. They waste no time in deploying the new version, only to find that a few things don't work quite the same as they did with the 32-bit version. This post describes the top snags that end users encounter, and how to work around them.

Gotcha #1: Importing Microsoft Excel files

Imagine you have a program that looks like this:

proc import out=work.class
 datafile="c:\temp\class.xls"
 DBMS = EXCEL;
run;

On 64-bit SAS for Windows, you might be surprised to encounter this error:

ERROR: Connect: Class not registered
ERROR: Error in the LIBNAME statement
Connection Failed.  See log for details.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE IMPORT used (Total process time):
  real time           0.11 seconds
  cpu time            0.04 seconds

This isn't limited to importing Excel files. It can happen when you use PROC EXPORT to export Excel files, or use DBMS=ACCESS for Microsoft Access database files, or when you try to use LIBNAME EXCEL to reference a local Excel spreadsheet as data.

The Cause:
Your 64-bit SAS process cannot use the built-in data providers for Microsoft Excel or Microsoft Access, which are usually 32-bit modules. In a previous blog post, I've provided a bit of explanation about this limitation.

The Fix:
Use DBMS=EXCELCS for Excel files, or DBMS=ACCESSCS for Microsoft Access. For LIBNAME access, try LIBNAME PCFILES. These approaches use the PC Files Server, which is a separate small application that is provided with SAS/ACCESS to PC Files. Note that you may need to go back and install this application, as it might not have been placed in your installation automatically. However, you can use the Autostart feature to skip having to configure it as a service, and thus minimize the changes to your SAS programs.

Alternatively, you can try DBMS=XLSX to remove the data providers from the equation.

NOTE: There are a few feature differences between the EXCELCS and EXCEL options. Read this SAS note to determine whether these differences will affect your work.

A Caution:
I've heard of a few customers who decide to workaround this limitation by installing the 64-bit version of Microsoft Office (and thus using the 64-bit data providers). That works, but it might introduce other incompatibilities with how you use your Microsoft Office applications. Microsoft recommends the 64-bit version of Office in only a few circumstances; consider the implications carefully before you head down this road.

Gotcha #2: Incompatible FORMATS catalog

Suppose that you have a library of user-defined formats that you once created by using PROC FORMAT. User-defined formats are stored in SAS catalogs, which are a sort of SAS-specific file system structure that SAS can access during your session.

If you created and used these user-defined formats with 32-bit SAS, you'll see this message when you try to use them with 64-bit SAS:

15         libname library "c:\datasources\32bit";
NOTE: Libref LIBRARY was successfully assigned as follows:
Engine:        V9
Physical Name: c:\datasources\32bit
16         proc print data=sashelp.air;
17         format date benefit.;
ERROR: File LIBRARY.FORMATS.CATALOG was created for a different
 operating system.
18         run;

The Cause:
For all intents and purposes, the move from 32-bit SAS for Windows to 64-bit SAS for Windows is like a platform change, and SAS catalogs are not portable across platforms. Even though you've just moved from one version of Windows to another, from a SAS perspective these files are different, with different internal structures.

The Fix:
SAS provides the utility procedures CPORT and CIMPORT to allow you to transfer catalog content across different operating environments, and you can certainly take that approach for this scenario.

If you have a mixed environment on your team where some people have 32-bit SAS and others have 64-bit SAS, it might be easier to decompose the format definitions down to data sets (by using PROC FORMAT and the CTLOUT option). You can then easily recreate the formats "on the fly" by using PROC FORMAT and the CTLIN option.

This works well because SAS data sets are compatible between the 32-bit and 64-bit versions of SAS...mostly. That brings us to the last "gotcha".

Gotcha #3: Different data set encoding triggers CEDA

If you use SAS data sets that were created by a 32-bit version of SAS, you can read them without modification in 64-bit SAS. But you might see a message like this:

NOTE: Data file TEST.HMEQ.DATA is in a format that is native to another host,
or the file encoding does not match the session encoding.
Cross Environment Data Access will be used, which might require additional
CPU resources and might reduce performance.

I've written about cross-environment data access before, and how it's a bit of SAS magic that helps with cross-platform compatibility. However, you might not have expected it to kick in when you upgraded to 64-bit SAS on Windows.

The Cause:
SAS data set files are written with an encoding that is specific to the SAS operating environment. In 32-bit SAS on Windows, the encoding is WINDOWS_32. On 64-bit SAS, it's WINDOWS_64. When the data set encoding differs from the native SAS session encoding, CEDA kicks in.

The good news is that in SAS 9.3, the SAS developers "taught" SAS for Windows to bypass the CEDA layer when the only encoding difference is WINDOWS_32 versus WINDOWS_64.

The Fix:
You don't have to do anything about this issue unless you want to update the data sets. And if you have SAS 9.3, you probably won't see this message at all...at least not when the data originates from 32-bit SAS for Windows.

If you decide to convert entire data set libraries to the new native encoding, you can achieve this by using PROC MIGRATE.

Parting bits

I'll finish this post with just a few general points to guide you:

  • 64-bit Windows is pervasive, and is a Good Thing. The 64-bit OS, combined with better hardware and more memory, can help to deliver more throughput.
  • In the not-so-distant future, all apps will eventually become native 64-bit. The incompatibility hiccups of today will be tomorrow's faint memory.
  • But for today, don’t automatically deploy 64-bit app "just because" such a version exists. Make it a deliberate business decision to consider.
  • And if you do go with the 64-bit app, budget the time/resources for 64-bit conversion, if necessary

Related posts about 64-bit topics

Myths about 64-bit computing on Windows
Are 64-bit client applications twice as good as 32-bit applications?
How do I export from SAS to Excel files: Let me count the ways
Should you care about 64-bit applications?

Post a Comment

The makeup of SAS Global Forum

Question: What do John Travolta, Gina Davis, and I all have in common? (I mean, besides the obvious fact that we are all awesome dancers.)

Answer: We have all had makeup applied by artist Roxie Stice.

I was the host for two SAS Tech Talks, which were broadcast via Livestream on Monday (link) and Tuesday (link) from SAS Global Forum. Even though these might seem like informal technical chats and demos, each one was every bit a professional televised program (at least, from a production standpoint). And I don't care who you are -- you do not go onto television (with its unforgiving studio lights and HD cameras) without makeup.

Scott M. gets his beauty treatment

Scott M. gets his beauty treatment

Roxie applied makeup to every person who appeared on camera (via studio or big stage). That includes Joe Theismann, Jim Goodnight, Guy Kawasaki, and me. And dozens of other presenters/hosts/experts. In fact, she got me three times (once for each Tech Talk, and one Live Update to promote the talks and the closing session).

During the webcasts, Roxie sits just off-camera with her makeup satchel (a portable subset of her full makeup station). During a commercial break (yes, we had them), she is ready to pop up and do a bit of touch-up before the broadcast goes back live, usually within 30 seconds.

Roxie is a delightful conversationalist. Especially if you like to talk about your various blemishes and skin oddities, because she notices them all. And she'll point them out -- not in a critical way, but simply as a matter-of-fact. She will professionally cover up any such blemishes, and then provide helpful advice about how you can do the same in your daily life. However, if I actually followed her advice, I'd never get out of the house in the morning.

imagine what it would be like *without* makeup

imagine what it would be like *without* makeup


If you ask (and it doesn't take much prompting), Roxie will happily begin listing out all of the other celebs that she applies makeup to. I can only assume that the next time she makes up Gina Davis or John Travolta, she'll start dropping my name to impress them.

As a full-service consultant, she doesn't just apply the makeup; she removes it too. As soon as the broadcast has ended, she's right there with pre-moistened cloths, ready to go to town. If rubbing a bald head truly brings good luck, Roxie is set for life -- between myself, John Brocklebank, Carl Farrell, and few of the other "follically challenged" participants. (Hey, as they say, grass doesn't grow on a busy street...)

Post a Comment

Myths about 64-bit computing on Windows

When I first joined SAS in 1993, we were gliding into the golden age of 32-bit computing on the PC. Microsoft offered a new extension to Microsoft Windows 3.1 called Win32s, and it allowed 32-bit applications to run on the Windows operating system. SAS Version 6.10 for Windows was one of the early applications to take advantage of this new "platform".

Now 64-bit PCs are the norm, and virtually all new Windows installations use a 64-bit operating system. The 64-bit version of Windows 7 (for example) offers a number of benefits, primarily due to its ability to manage/address a larger amount of system memory. If you have more memory on the system (typical workstation PCs now have 8GB or even 16GB), you can run more processes at the same time. They will run faster because there is less of need to swap information out of memory and onto disk; disk-based I/O is now less of a performance bottleneck.

But for all of the real benefits, some myths about 64-bit computing still persist. I'll attempt to dispel a few of those here.

MYTH: All operations are twice as fast

REALITY: 64-bit architecture does not make your computations faster.
Some people will say that, in theory, it could make computations slower in some cases because the native pointer representation is larger.

WEAK ANALOGY ALERT:
A pointer is an address in memory; think of it like a phone number. In my locale, we recently added a new 3-digit area code to allow for more telephone numbers to be issued. As a result, all residents here must use 10-digit dialing instead of the 7-digit dialing that we're accustomed to. Our phone system has more capacity, but it takes me longer to initiate a phone call.

But in practice you will see things run faster, most likely, because newer 64-bit machines have faster processors with multiple cores, more memory to reduce I/O, and more registers in the CPU to complete operations more efficiently. These are usually paired with faster storage technology and other infrastructure improvements, so you should see faster throughput.

MYTH: My results will be twice as accurate.

REALITY: It's true that you can store larger numbers with more precision in 64 bits than you can in 32 bits.

But even before 64-bit architecture, most programming languages offered intrinsic data types with a 64-bit storage length. This is true in SAS, where the LENGTH of a numeric variable determines its capacity. The default length for a SAS numeric variable is 8 bytes, regardless of platform and bitness. 8 bytes * (8 bits/byte) = 64 bits.

The precision adheres to IEEE standards for such things. If you want all of the excruciating details, check out this blog post about numeric precision in SAS.

POP QUIZ: What's the largest integer that SAS can store in 8 bytes with exact representation?

MYTH: I'll be able to process twice as much data.

REALITY: Okay, there is some truth to this one. A 64-bit process can address more memory, so memory-intensive operations can benefit from this.

However, not all SAS operations are constrained by system memory. If you run lots of DATA step or SQL, your process might be more dependent on I/O performance. As I said in the lead-in for this post, your shiny new 64-bit system might have better I/O (perhaps even a solid-state drive), but that's not directly related to the 64-bit aspect.

In SAS, some memory-constrained operations such as hash tables and PROC REG will enjoy the additional elbow room. Other I/O-bound operations won't benefit directly, but there are things that you can do to tune your system for better SAS performance.

JMP is a SAS-provided application that can exploit a 64-bit system with lots of memory. SAS Enterprise Guide 5.1 has a 64-bit version that provides some benefit, but with trade offs.

If you move from 32-bit SAS to 64-bit SAS on Windows, you might face a few "gotchas" -- many related to compatibility issues with the legacy of 32-bit system. I'll address some of those issues in a future post, but here's a preview: Excel import/export, SAS catalog formats, location of 32-bit versus 64-bit system utilities, and a separation of content in the Windows registry.

Post a Comment

SAS Tech Talks LIVE from SAS Global Forum

For me, this will be a SAS Global Forum of many "firsts". My first Closing Session. My first time attending as a staffer outside of SAS R&D (well, except for the chilly SUGI 21 in Chicago). And my first chance to host the SAS Tech Talk sessions, which will be live-streamed from the Support and Demo Area at the conference!

Actually, it's a first for everybody, since we've never live-streamed this type of content from the conference. It was a bit of a late addition to my conference schedule, but I am very excited about it.

Each SAS Tech Talk will feature conversations with R&D staffers who are involved in creating the hot new tech from SAS. Each guest will show off their work in a demo, and then you can be part of the show as well, by submitting your questions and comments via the Livestream site.

I think we've secured a great lineup for the two shows. Trying to plan when to tune in? Here's the schedule as it stands right now:

GUEST LINEUP

Monday 5pm EDT
High-Performance Data Mining
Jared Dean, Director of SAS Enterprise Miner R&D

Text Analytics and Sentiment Analysis: Case study of AllAnalytics.com
Jim Cox, Senior Manager of SAS Text Miner R&D

SAS/ACCESS to Hadoop
Howard Plemmons, Director of SAS/ACCESS R&D

Tuesday 5pm EDT
SAS Visual Analytics Explorer
Nascif Abousalh-Neto, Manager of SAS BI Visualization R&D

SAS iPad apps – Visual Analytics and more
Scott McQuiggan, Manager of SAS BI Mobile R&D

"From the labs", Microsoft Surface and Kinect show-and-tell
Mike Barnhart, Senior Developer, BI Clients R&D

If you can, tune in LIVE and interact with us! Don't miss it! But if you do miss it, don't worry. The shows will be recorded and available on demand.

Post a Comment

Doing more with SAS Enterprise Guide automation

Have you ever selected File->Schedule Project or Schedule->Process Flow in SAS Enterprise Guide? Are you curious about what magic these actions will trigger?

Here's what happens:

  • SAS Enterprise Guide creates a VBScript program that contains the instructions to start SAS Enterprise Guide, load your project, run your project or flow, and save the results.
  • SAS Enterprise Guide launches the Windows Task Scheduler interface, so that you can specify when and how often to run your project/flow unattended.
  • The scheduled task contains the command (CSCRIPT.EXE) and a reference to the VBScript (VBS file), which Windows will run at the appointed time. (Learn more about this mechanism at "Using SAS Enterprise Guide to run programs in batch".)

The VBScript instructions drive the SAS Enterprise Guide automation API, and that API is capable of much more than simply loading and running your project. And you aren't stuck with VBScript -- you can access the automation API using Windows PowerShell and Microsoft .NET.

As the following diagram shows, the automation interface is a peer to the main user interface for SAS Enterprise Guide. Like the happy gentleman pictured at the top of the diagram, most users will interact with the main windows of SAS Enterprise Guide. Users gain access to this user interface by using the primary executable (SEGuide.exe), often by way of a desktop shortcut icon. With automation,you forgo the SAS Enterprise Guide user interface entirely, and you instead script every action using the automation API.

These are the concepts and examples that are the subject of my SAS Global Forum 2012 paper, Not Just for Scheduling: Doing More with SAS Enterprise Guide Automation. I will present this topic at the conference on Wednesday morning at 8am (with coffee in hand, most likely).

Check out my summary page on sasCommunity.org for links to the paper, blog posts, and several examples. The examples include:

  • A VBScript example that can extract all of the SAS programs and SAS logs from your project file.
  • A PowerShell example to create a simple listing of all of the tasks and input data within your project file.
  • A Microsoft .NET example (implemented with C#) that allows you to search for any text within your project file.

That last one is of special interest (and worthy of a separate blog post later). Even if the automation API isn't your thing, you might enjoy the EGPSearch example, which allows you to search a collection of SAS Enterprise Guide project files for any text within your SAS programs, logs, notes and more.

Post a Comment

Build your own SAS data set viewer using PowerShell

In about 30 lines of PowerShell script, we can build a SAS data set viewer that:

  • Does not require SAS on the PC
  • Provides very basic filtering capability
  • Also allows for easy export to CSV

All you need is the ability to run PowerShell scripts, and the SAS Local Data Provider for OLE DB.

I said that we could do this with about 30 lines of script. But I like to include some comments, error handling, and spacing for readability -- so the script has ballooned to about 120 lines.

Here's an example of how to run the script in PowerShell (put this all on one line):

ReadSasDataset.ps1 'C:\Data\westernrates.sas7bdat' | 
  Out-GridView -Title "Western Rates data"

Here's a screenshot of the result.

Suppose that you want to export the SAS data set to CSV, perhaps for use in Excel? Try this command (again, all on one line):

ReadSasDataset.ps1 'C:\Data\westernrates.sas7bdat' | 
  Export-CSV -Path C:\Data\WesternRates.csv -NoTypeInformation

Windows PowerShell has many built-in "cmdlets" (such as Export-CSV and the Out-GridView interface) that allow you to easily pipe output from one command to another, and shape the result into exactly what you need.

Download the full script: ReadSasDataset.ps1.

(It has a TXT file extension for security purposes. To use it, save the script as a PS1 file on your local machine.)

Here is the meat of the script. The script uses ADO to "connect" to the data set and read the records, one-by-one. For each field within each record, the script creates a new PowerShell psobject, which holds the "name-value pair" of column name and cell value. These are the actual output of the script. To do something useful with those objects, you pipe the results to another cmdlet.

Note: Those little backwards tickmarks that look like dirt specks on your screen are actually used as the PowerShell line continuation character. I used them here to make the script easier to read. Unlike the SAS language, PowerShell statements usually terminate with the end-of-line, not a semicolon.

$objConnection = New-Object -comobject ADODB.Connection
$objRecordset = New-Object -comobject ADODB.Recordset

$objConnection.Open("Provider=SAS.LocalProvider;Data Source=`"$filePath`";")
$objRecordset.ActiveConnection = $objConnection

# tell the SAS provider to return FORMATTED SAS values
$objRecordset.Properties.Item("SAS Formats").Value = "_ALL_"

# open the data set
# IMPORTANT: passing in a "missing" value for the connection
# because the connection is already on the RecordSet object
$objRecordset.Open($filename, [Type]::Missing,
   $adOpenDynamic, `
   $adLockOptimistic, `
   $adCmdTableDirect) 
   
$objRecordset.MoveFirst()

# read all of the records within the SAS data file
do 
{
  # build up a new object with the field values
  $objectRecord = New-Object psobject
  
  for ($i=0; $i -lt $objRecordset.Fields.Count; $i++)
  {
     # add static properties for each record
     $objectRecord | add-member noteproperty `
     -name $objRecordset.Fields.Item($i).Name `
     -value  $objRecordset.Fields.Item($i).Value;     
  }
  # emit the object as output from this script
  $objectRecord
  
  # move on to the next record
  $objRecordset.MoveNext()
} 
until ($objRecordset.EOF -eq $True)

# close all of the connections
$objRecordset.Close()
$objConnection.Close()

In a future post, I'll provide more PowerShell examples that allow you to see all of the data set attributes for a collection of SAS data set files, similar to what you can see with PROC DATASETS.

Post a Comment