How many observations were processed by that last step?

18

A well-formed WHERE statement or subsetting IF can narrow down the output of your SAS DATA step. The SAS log does a good job of telling you how many records were processed by the action. For example, let's look at this simple DATA step with my "poor man's random sample", implemented with the RANUNI function:

15         data cars;
16           set sashelp.cars;
17           if ranuni(0) < 0.85;
18         run;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.CARS has 364 observations and 15 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

The output will also be approximately 85% of the original data, but not exactly. It's random, so the result varies each time that I run it. What if I need to know exactly how many records were processed?

In SAS 9.3, I can now pick that up from the SYSNOBS automatic macro variable. For example:

%put how many = &SYSNOBS;

Yields:

20         %put how many = &SYSNOBS;
how many = 364

This new feature in 9.3 was added as a direct result of a customer suggestion. Keep 'em coming!

Share

About Author

Chris Hemedinger

Director, SAS User Engagement

+Chris Hemedinger is the Director of SAS User Engagement, which includes our SAS Communities and SAS User Groups. Since 1993, Chris has worked for SAS as an author, a software developer, an R&D manager and a consultant. Inexplicably, Chris is still coasting on the limited fame he earned as an author of SAS For Dummies

18 Comments

  1. Thanks to the SAS developers for this. I found three sections in various programs that I no longer need. Do you have any code you can throw away due to &sysnobs?

    data a;
      set sashelp.class;
    run;
    %put table=&amp;syslast;
    %put nobs=&amp;sysnobs;
    
    /* method 1 */
    proc sql noprint; 
    select nobs 
    into :nobs 
    from dictionary.tables 
    where libname=scan("&amp;syslast",1) and memname=scan("&amp;syslast",2) 
    ; quit;
    %put nobs=&amp;nobs;
    
    /* method 2 */
    data _null_; 
    call symput('nobs',nobs); 
    stop; 
    set &amp;syslast nobs=nobs;
    run;
    %put nobs=&amp;nobs;
    
    /* method 3 */
    %let dsid=%sysfunc(open(&amp;syslast));
    %let nobs=%sysfunc(attrn(&amp;dsid,nobs));
    %let rc=%sysfunc(close(&amp;dsid));
    %put nobs=&amp;nobs;
    

  2. Interesting and useful. However, the code doesn't seem to match the description given in the documentation:
    "Contains the number of observations read from the last data set that was closed by the previous procedure or DATA step."

    Contains the number of observations READ. Since the subsetting if statement reads ALL the observations, but subsets how many it outputs, this isn't a correct description.

    I don't have SAS in front of me to test out your code, but assuming it's correct, the documentation needs to be reworded.

    • Chris Hemedinger
      Chris Hemedinger on

      Stephen, thanks for the comment.

      I think in this contrived example of mine, the last data set is WORK.CARS, which was stuffed with 364 observations. And that's what the SYSNOBS number refers to. But I agree that the doc could be clearer, and an example would be helpful. Perhaps this makes it clearer:

      20  %put Data processed: %trim(&amp;SYSLAST) with &amp;SYSNOBS records;
      Data processed: WORK.CARS with 364 records
      

      But if it was made too obvious, we wouldn't need SAS blogs or sites like sas-resources.com, would we?

      • Ha! Thanks for the link :)

        I agree, your example is good and shows exactly what &SYSNOBS is doing. It is the documentation that is ambiguous.

        Thanks again! -s

    • Interesting. I agree, looks like it's storing the number of records in &syslast. And when you output mutliple datasets in a single step, it looks like &syslast is whichever dataset was listed last on the output dataset. Log below.

      I tend to stay away from &syslast and _last_, in preference to explicitly naming datasets, but this is still a nice tip to know about. Thx!

      102  data a (where=(i=1))
      103       b (where=(i IN(1,2)))
      104   ;
      105    do i=1 to 5;
      106      output;
      107    end;
      108  run;
      
      NOTE: The data set WORK.A has 1 observations and 1 variables.
      NOTE: The data set WORK.B has 2 observations and 1 variables.
      
      109
      110
      111  %put Number of records in %trim(&amp;syslast) is: &amp;sysnobs;
      Number of records in WORK.B is: 2
      112
      113  data
      114       b (where=(i IN(1,2)))
      115       a (where=(i=1))
      116   ;
      117    do i=1 to 5;
      118      output;
      119    end;
      120  run;
      
      NOTE: The data set WORK.B has 2 observations and 1 variables.
      NOTE: The data set WORK.A has 1 observations and 1 variables.
      
      121
      122  %put Number of records in %trim(&amp;syslast) is: &amp;sysnobs;
      Number of records in WORK.A is: 1
      

    • Thanks Chris,

      Was really hoping &sysnobs would return the number of records that survived the where clause when called after a PROC, but seems to just return the number of records in the input dataset.

      43 proc print data=sashelp.shoes;
      44 where Region='Canada';
      45 run;

      NOTE: There were 37 observations read from the data set SASHELP.SHOES.
      WHERE Region='Canada';

      46 %put &sysnobs;
      395

      So to get the "right" answer (37), feels like I still need to make another pass of the data. Bummer.

      --Q.

  3. sounds useful &sysNobs

    Please can you clear up why sql produces different values in SYSNOBS for the same select clause
    select * from sashelp.class where ranuni(1)> .9
    depending whether the destination is
    a table(sysnobs=output count = sqlobs)
    or a list(when sysnobs = the input count before the where is applied)

    Here is a clip from my saslog:
    29 proc sql; 30 create table cl as 31 select * from sashelp.class where ranuni(1)&gt; .9 ; NOTE: Table WORK.CL created, with 5 rows and 5 columns. 32 %put &amp;sysnobs; 5 33 /*create table cl as*/ 34 select * from sashelp.class where ranuni(1)&gt; .9 ; 35 %put &amp;sysnobs; 19 36 quit; NOTE: PROCEDURE SQL

    • Chris Hemedinger
      Chris Hemedinger on

      Peter,

      Always you with the trick questions! I'm going to guess that it has to do with the internals of how PROC SQL processes the SELECT without a CREATE TABLE. There is some inner magic perhaps creating an internal view of the original data, but it exists only long enough to produce an ODS output.

      For more precise results, try this:

      proc sql;
      create view clv as 
        select * from sashelp.class where ranuni(1)&gt;.9 ;
      %put &amp;syslast records are &amp;sysnobs;
      quit;
      
      proc sql;
      create table cl as
        select * from sashelp.class where ranuni(1)&gt;.9 ;
      %put &amp;syslast records are &amp;sysnobs;
      quit;
      

      • so VIEWs have a different effect when compiled.

        Aother issue occurs with select into :names

        proc sql noprint ;
          select name into :names separated by ' '
            from sashelp.class where ranuni(1)&gt;.9 ;
        %put (&amp;names) records ar &amp;sysnobs;
        quit;
        

        partial log

        183    select name into :names separated by ' '
        184      from sashelp.class where ranuni(1)&gt;.9 ;
        185  %put (&amp;names) records ar &amp;sysnobs;
        (Mary Philip) records ar 19
        186  quit;
        

        SYSNOBS shows 19 but only two names were created
        198  proc sql noprint ;
        199    select name into :names separated by ' '
        200      from sashelp.class where ranuni(1)&gt;.9 ;
        201  %put &amp;sqlobs (&amp;names) from(?) &amp;sysnobs ;
        5 (Alice Henry James Philip William) from(?) 19
        202    select name into :names separated by ' '
        203      from sashelp.class where ranuni(1)&gt;.9 ;
        204  %put &amp;sqlobs (&amp;names) from(?) &amp;sysnobs ;
        2 (Mary Philip) from(?) 19
        205  quit;
        

        Seems that ranuni(1) in SQL does not always start at the same seed in SQL

  4. Hi,Hemedinger!
    I am Jacob from China ,and I am a SAS programmer.
    I need you help.
    This is the problem:

      
    data m01(index=(a));
    input a b;
    cards;
    1 3
    2 4
    3 5
    ;
    run;
    data m02(index=(a));
    input a c;
    cards;
    1 6
    2 7
    2 8
    ;
    run;
    data m03 m04;
      set m01;
      set m02 key=a;
      select(_iorc_);
        when(%sysrc(_sok)) do;
          output m03;
        end;
        when(%sysrc(_dsenom)) do;
          _error_ = 0;
          output m04;
        end;
        otherwise do;
          _error_ = 0;
          stop;
        end;
      end;
    run;
    proc print data=m03;run;
    

    But the results of dataset m03 is not what I want.
    I need this:
    Obs    a    b    c
     1     1    3    6
     2     2    4    7
     3     2    4    8
    

    I have to deal it with double set, but I can not find the way or some materials, would you help me?
    Thank you!

  5. Hi Chris. Any idea if SYSNOBS will work for non-SAS data, such as DBMS (e.g. Oracle, DB2, Teradata) tables?

    • Chris Hemedinger
      Chris Hemedinger on

      Yes, Marty -- I think so. I tried with an Excel file using the PCFILES engine:

      libname t pcfiles path="c:\temp\myexcel.xlsx";
      data t.new;
        set sashelp.cars;
      run;
      %put &amp;syslast was &amp;sysnobs;
      

      And got:

      T.NEW was 428
      

  6. Wow! About time! I always wondered why there was a SQL version but not for anything else. Unfortunately, I'm going to continue to use the %NOBS macro program, because it works on any SAS install. That is, until no one is using < 9.3.

    • Hi Chris

      Yes you should just continue to use your own %nobs macro for this because you can do thing like this (ie can apply where clause):

      %nobs(dset=sashelp.class(where=(sex='M')));

      So I should not bother with &SYSNOBS

      John

  7. As SAS documentation says "Note: If the number of observations for the data set was not calculated by the previous procedure or DATA step, the value of SYSNOBS is set to -1.",

    What is mean by this ??
    I am thinking that "If Previous Procedure or Datastep failed to execute due to some errors,that time value of SYSOBS is set to -1.",
    am i correct ??

    thanks,
    Sanjeev.K

    • Chris Hemedinger
      Chris Hemedinger on

      Sanjeev, actually I think that the current doc is a bit misleading. This will be fixed soon, but the real behavior is "Contains the number of observations in the last data set created by a procedure
      or DATA step."

      So if a step doesn't create a data set, then the value is unchanged from its previous setting. My speculation: if a step creates a data set with an unknown number of records (such as a database view), then the value might be -1 (which is sort of the SAS convention for "uncounted number of records").

  8. Pingback: How many records are in that PROC SQL result? - The SAS Dummy

Back to Top