How to mimic the N function for character variables using Data Step, PROC SQL

4

Many of you know Mike Zdeb. He's a long-time SAS user and frequent presenter at SAS conferences. Zdeb is also a reviewer of many SAS Press books and author of his own SAS book, Maps Made Easy Using SAS. Zdeb contacted me after he read the May SAS Tech Report to say that he had an idea for reducing the number of steps in one of the SAS Samples in the Tips & Techniques section.

Before I give you Zdeb's emailed tip, I'll preface it with a slight caution: While this tip may be a great step saver, not everyone with Data Step experience has experience with PROC SQL – and vice versa.

If you have tips for reducing the steps to solve a problem OR you have a new way to solve an old problem, send them to me. We'll tag them tips & techniques so that everyone can find them.

With no further ado, here is Zdeb's cool tip:

Hi ... I saw this "How to Mimic the N Function for Character Variables" and looked at the SAS code ...

data temp;
  input (disp1-disp8) ($);
datalines;
1 2 . 4 5 6 7 8
. 2 3 4 5 . 7 8
1 2 3 4 5 6 7 8
1 2 3 . . 6 . 8
1 2 3 4 5 6 7 8
1 . 3 . 5 . 7 .
;
run;                                

proc contents data=temp out=mytemp;
run;                                

data _null_;
  set mytemp nobs=total;
  call symputx('totvar',total);
run;                                

data two;
  set temp;
  obs=&totvar;
  miss=cmiss(of disp1-disp8);
  nonmiss=&totvar-miss;
run;
proc print;
run;

So how about a shorter, more general version that gets right at the number of character variables in a dataset with one use of PROC SQL rather than with PROC CONTENTS and a data step?

It counts [only]the number of character variables since the above will not work if there are any numeric variables in the dataset [&totvar would be a count of all variables, not just character and CMISS isn't limited to character data if you tried some like CMISS (OF _ALL_)]

proc sql noprint;
select count(*) into :charvars from dictionary.columns where libname eq 'WORK' and memname eq 'TEMP' and type eq 'char'; quit;

data two;
set temp;
miss=cmiss(of _character_);
nonmiss=&charvars-miss;
run;

--

Mike Zdeb

So, here's the challenge: First, in the comments section, tell me if you know of other ways to shorten the steps in the SAS SAMPLE (already quite simple); second, send me your blog post (no more than 500 words) highlighting your super cool SAS tips & techniques.


These code examples are provided as is, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.

Share

About Author

Waynette Tubbs

Editor, Marketing Editorial

+Waynette Tubbs is the Editor of the Risk Management Knowledge Exchange at SAS, Managing Editor of sascom Magazine and Editor of the SAS Tech Report. Tubbs has developed a comprehensive portfolio of strategic business and marketing communications during her career spanning 15 years of magazine, marketing and agency work.

Related Posts

4 Comments

  1. I can make the datastep only part slightly smaller. sashelp.vcolumn I think is the same table that dictionary columns accesses. and sysnobs is an automatic variable new in 9.3

    data vars;
    set sashelp.vcolumn;
    where libname eq 'WORK' and memname eq 'TEMP' and type eq 'char';
    run; 
    
    data two;
      set temp;
      obs=&sysnobs.;
      miss=cmiss(of disp1-disp8);
      nonmiss=obs-miss;
    run;

  2. Collin Elliot on

    This array solution skips the need to query the metadata, no?

    data two;
        set temp;
        array charvars{*} $ _character_;
        miss=cmiss(of _character_);
        altmiss = cmiss(of charvars(*)); /* <-- Alt approach for counting missing */
        nonmiss=dim(charvars) - miss;
    run;

  3. proc sort data=sashelp.vcolumn out=temp;
    by libname;
    where libname eq 'WORK' and memname eq 'TEMP' and type eq 'char';
    run;
    
    proc sql noprint;
       select count(libname) into: N from temp;
    quit;
    %put &N;

Leave A Reply

Back to Top