One-level data set names in SAS are not always stored in WORK


One of the first things SAS programmers learn is that SAS data sets can be specified in two ways. You can use a two-level name such as "sashelp.class" which uses a SAS libref (SASHELP) and a member name (CLASS) to specify the location of the data set. Alternatively, you can use a one-level name such as "TempData," and SAS searches for the data set in a default location.

In many SAS environments, one-level data set names are like the seven little dwarves: Heigh-Ho, heigh-ho, it's off to WORK they go! In other words, the WORK directory is the default location for one-level names. Consequently, one-level names often imply "temporary data," because data sets in WORK are deleted when you exit SAS.

However, it is possible to use the OPTIONS statement to change the libref that SAS searches when you specify a one-level SAS data set name. The option name is USER. The following statements specify a new libref that SAS should use as the default location for one-level data set names:

libname DEFLIB "C:/Temp";    /* define any libref */
options user=DEFLIB;         /* set the default location for one-level names */

For example, the following DATA step uses a one-level name for the data set. Consequently, the data set is created in the USER directory and PROC DATASETS lists the data sets in USER rather than WORK:

data TempData; x=1; y=2; z=3; run;  /* create data set using one-level name */
proc datasets; run;                 /* note that it is in the USER libref! */
[Figure: Default libref for one-level name]

Personally, I never do this because data sets in USER are not deleted when SAS exits. However, this example shows that one-level names are not always stored in WORK.
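
Even while USER is set, the WORK library is still reachable through an explicit two-level name. The following is a minimal sketch, assuming the same DEFLIB libref and path as in the example above; the data set names Scratch and Keep are only illustrative:

```sas
libname DEFLIB "C:/Temp";       /* assumption: same libref as the earlier example */
options user=DEFLIB;

data work.Scratch;              /* two-level name: stored in WORK, deleted at exit */
   x = 1;
run;

data Keep;                      /* one-level name: stored in DEFLIB, persists      */
   x = 1;
run;

proc datasets library=work;     /* lists WORK members; Scratch should appear here  */
run; quit;
```

Writing the WORK libref explicitly is one way to keep scratch data temporary without changing the USER option back.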

Discover the default storage location

If a one-level data set name is not necessarily in WORK, can you programmatically discover the libref where the data set is? Yes! The GETOPTION function returns the value for any SAS option, so you can retrieve the value of the USER option. For example, the following DATA step discovers the libref and data set name for a specified data set. For a two-level name, the name contains a period, which you can find by using the FINDC function. You can then use the SUBSTR function to extract the name of the libref and data set. If the data set name is a one-level name, then the GETOPTION function obtains the default libref. (If the USER option is not set, GETOPTION returns a blank string.)

%let MyData = TempData;       /* specify one-level or two-level data set name */
 
data _null_;
length lib $8 member $32;             /* avoid truncating lib when the name is short */
dsName = "&MyData";
LocDot = findc(dsName, ".");          /* Does name contain a period (.)?     */
if LocDot > 0 then do;                /*   Yes: it is a two-level name       */
   lib = substr(dsName, 1, LocDot-1); /*     get substring before the period */
   member = substr(dsName, LocDot+1); /*     get substring after the period  */
end;
else do;                              /*   No: it is a one-level name        */
   lib = getoption("user");           /*   Has this option been defined?     */
   if lib = ' ' then lib = "work";    /*     No: use WORK                    */
   member = dsName;
end;
put lib=;
put member=;
run;
lib=DEFLIB
member=TempData

In summary, although one-level data set names are usually stored in WORK, that is not always the case. However, a programmer can use the GETOPTION function to discover the libref where one-level data sets are stored.
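
One way to cross-check the result is to ask SAS directly. The following sketch assumes that the TempData data set from the earlier example exists. PROC OPTIONS and %PUT write the current value of the USER option to the log, and the DICTIONARY.TABLES view lists every assigned library that contains a member with that name:

```sas
proc options option=user;       /* writes the current value of USER to the log */
run;

%put USER option = %sysfunc(getoption(user));   /* same value, from macro code */

proc sql;
   select libname, memname
   from dictionary.tables
   where memname = "TEMPDATA";  /* member names are stored in uppercase */
quit;
```

If the query returns more than one row, a data set with the same name exists in several libraries, and the value of the USER option determines which one a one-level name refers to.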

An application to SAS/IML programming

The reason I was interested in the GETOPTION function is that I was trying to write a function in SAS/IML that would accept a one- or two-level data set name and return the names of the variables in the data. The CONTENTS function in SAS/IML almost does what I want, but the CONTENTS function has two different signatures, one for two-level names and one for one-level names:

  • For two-level names, use two arguments: varNames = contents(lib, name);
  • For one-level names, use one argument: varNames = contents(name);
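
The two signatures can be sketched as follows. This is only an illustration: it uses the Sashelp.Class data set for the two-level call and writes a small data set named TempData (with hypothetical variable names x, y, and z) so that the one-level call has something to read:

```sas
proc iml;
/* Two-level name: pass the libref and member name as separate arguments */
v2 = contents("sashelp", "class");
print v2;

/* One-level name: first write a small data set so there is something to read */
x = {1 2 3};
names = {"x" "y" "z"};               /* hypothetical variable names */
create TempData from x[colname=names];
append from x;
close TempData;

v1 = contents("TempData");           /* resolved through the default library */
print v1;
quit;
```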

I wanted to write a function that accepts a single string (a one-level or two-level data set name) and calls the appropriate signature of the CONTENTS function. The following SAS/IML function does the job:

proc iml;
/* new CONTENTS function that handles one- and two-level data set names */
start ContentsEx( dsName );              /* "Ex" means "extended" */
   LocDot = findc(dsName, ".");          /* Does name contain a period (.)?     */
   if LocDot > 0 then do;                /*   Yes: it is a two-level name       */
      lib = substr(dsName, 1, LocDot-1); /*     get substring before the period */
      member = substr(dsName, LocDot+1); /*     get substring after the period  */
      return( contents(lib, member) );
   end;
   return( contents(dsName) );           /*   No: it is a one-level name        */
finish;
 
dsName = "&MyData";
varNames =  ContentsEx( dsName );
print varNames;

Have you ever had the need to use the USER option to override the default storage location for one-level data set names? Leave a comment.


About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

3 Comments

  1. Nice post, thanks. The implicit libref becomes all the more important with the latest 'Big Data' engines like Hadoop or in-memory (LASR), where SASWORK becomes almost irrelevant, in my opinion. Here is a short macro that can be invoked directly to expand the table name conditionally. I have only recoded your DATA step logic as a macro:

    /* Tablename Expansion %TE() */

    %MACRO TE(ds) ;
    /* If the data set name includes a . character (= two-level name) */
    /*    then do nothing */
    /* else (= one-level name) */
    /*    if option USER is set */
    /*       then expand with USER libref : result = <USER libref>.<ds> */
    /*       else expand with WORK libref : result = WORK.<ds> */

    %sysfunc( IFC( %sysfunc(findc(&ds,%STR(.))) GT 0,&ds,%sysfunc(IFC(%sysfunc(getoption(USER)) EQ %STR( ),WORK,%sysfunc(getoption(USER)))).&DS.))
    %MEND;

    /* example */
    DATA %TE(TempData);
    run;

    The best way, maybe, to deal with it would be to add a specific 'expansion operator' like 1. ("1." prefix) or ( ) ("parentheses") for instance.

  2. Chris Hemedinger

    I can think of a case. If you have your data in a high performance data store (such as SAS SPDS), you might want your SAS programs to not move scratch data out of that environment during processing -- a move that can introduce delays due to cross-environment latency. So you might define USER to map to a scratch location in the database environment and allow your one-level names to resolve to that.

    When that part of your processing is complete, you could reset the USER option so that WORK is used again.

  3. Another instance where one-level data set names are not in the WORK library is when you create an SQL view. For instance:
    proc sql;
    create view perm.name as
    select *
    from othername
    where var > 20;

    In this case, the othername data set is assumed to be in the same library as the view, perm.name. This means the view still works when that library is assigned under a libref other than perm.
