How to mimic the N function for character variables using Data Step, PROC SQL

4

Many of you know Mike Zdeb. He's a long-time SAS user and frequent presenter at SAS conferences. Zdeb is also a reviewer of many SAS Press books and author of his own SAS book, Maps Made Easy Using SAS. Zdeb contacted me after he read the May SAS Tech Report to say that he had an idea for reducing the number of steps in one of the SAS Samples in the Tips & Techniques section.

Before I give you Zdeb's emailed tip, I'll preface it with a slight caution: While this tip may be a great step saver, not everyone with Data Step experience has experience with PROC SQL – and vice versa.

If you have tips for reducing the steps to solve a problem OR you have a new way to solve an old problem, send them to me. We'll tag them tips & techniques so that everyone can find them.

With no further ado, here is Zdeb's cool tip:

Hi ... I saw this "How to Mimic the N Function for Character Variables" and looked at the SAS code ...

data temp;
  input (disp1-disp8) ($);
datalines;
1 2 . 4 5 6 7 8
. 2 3 4 5 . 7 8
1 2 3 4 5 6 7 8
1 2 3 . . 6 . 8
1 2 3 4 5 6 7 8
1 . 3 . 5 . 7 .
;
run;                                

proc contents data=temp out=mytemp;
run;                                

data _null_;
  set mytemp nobs=total;
  call symputx('totvar',total);
run;                                

data two;
  set temp;
  obs=&totvar;
  miss=cmiss(of disp1-disp8);
  nonmiss=&totvar-miss;
run;
proc print;
run;

So how about a shorter, more general version that gets right at the number of character variables in a dataset with one use of PROC SQL rather than with PROC CONTENTS and a data step?

It counts [only]the number of character variables since the above will not work if there are any numeric variables in the dataset [&totvar would be a count of all variables, not just character and CMISS isn't limited to character data if you tried some like CMISS (OF _ALL_)]

proc sql noprint;
select count(*) into :charvars from dictionary.columns where libname eq 'WORK' and memname eq 'TEMP' and type eq 'char'; quit;

data two;
set temp;
miss=cmiss(of _character_);
nonmiss=&charvars-miss;
run;

--

Mike Zdeb

So, here's the challenge: First, in the comments section, tell me if you know of other ways to shorten the steps in the SAS SAMPLE (already quite simple); second, send me your blog post (no more than 500 words) highlighting your super cool SAS tips & techniques.


These code examples are provided as is, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.

Share

About Author

Waynette Tubbs

Editor, Marketing Editorial

Waynette Tubbs is a seasoned technology journalist specializing in interviewing and writing about how leaders leverage advanced and emerging analytical technologies to transform their B2B and B2C organizations. In her current role, she works closely with global marketing organizations to generate content about artificial intelligence (AI), generative AI, intelligent automation, cybersecurity, data management, and marketing automation. Waynette has a master’s degree in journalism and mass communications from UNC Chapel Hill.

4 Comments

  1. proc sort data=sashelp.vcolumn out=temp;
    by libname;
    where libname eq 'WORK' and memname eq 'TEMP' and type eq 'char';
    run;
    
    proc sql noprint;
       select count(libname) into: N from temp;
    quit;
    %put &N;

  2. Collin Elliot on

    This array solution skips the need to query the metadata, no?

    data two;
        set temp;
        array charvars{*} $ _character_;
        miss=cmiss(of _character_);
        altmiss = cmiss(of charvars(*)); /* <-- Alt approach for counting missing */
        nonmiss=dim(charvars) - miss;
    run;

  3. I can make the datastep only part slightly smaller. sashelp.vcolumn I think is the same table that dictionary columns accesses. and sysnobs is an automatic variable new in 9.3

    data vars;
    set sashelp.vcolumn;
    where libname eq 'WORK' and memname eq 'TEMP' and type eq 'char';
    run; 
    
    data two;
      set temp;
      obs=&sysnobs.;
      miss=cmiss(of disp1-disp8);
      nonmiss=obs-miss;
    run;

Leave A Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Back to Top