A ghoulish Halloween Boo to all my readers! Hope my costume freaks you out, but even if it doesn't, I’m positive PROC FREQ will in a few amazing ways!
Today’s Programming 2: Data Manipulation Techniques class asked about the power of PROC FREQ. Since I stopped to explain some of its benefits to the class, I thought I'd share some of this knowledge with you as well. So, here are three freaky ways to get this proc to work in your favor.
1) Data validation freak - PROC FREQ helps with data validation
If I don’t know my data and need to filter rows based on some data values, I can use proc freq like this.
proc freq data=sashelp.cars;
  tables make;
run;
*once I see data values in the output window I can easily copy & paste into my where clause.;
*it helps me type data in my where clause correctly;
proc print data=sashelp.cars;
  where make='Acura';
run;
2) Unique values freak - PROC FREQ helps write out duplicate values
Typically I would rely on the output window, trying to spot the duplicates with my eyes and hold the counts in my head. Isn’t that a waste of time?
Instead take this code and let proc freq create a table of your duplicate values that you can further analyze whenever you want. No more memory recall, no more eye glazing.
proc freq data=sashelp.cars;
  tables make / out=dups(where=(count > 1));
run;
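If you also want to see the full rows behind those duplicate values, one possible follow-up is to join the dups table back to the source data. This is only a sketch: the table name dup_rows and the NOPRINT option are my own additions, not part of the class example.

```sas
/* Count each make and keep only the makes that occur more than once. */
/* NOPRINT suppresses the printed table; we only want the data set.   */
proc freq data=sashelp.cars noprint;
  tables make / out=dups(where=(count > 1));
run;

/* Join the duplicate makes back to the source rows.        */
/* dup_rows is a hypothetical name chosen for this sketch.  */
proc sql;
  create table dup_rows as
  select c.*, d.count
  from sashelp.cars as c
       inner join dups as d
       on c.make = d.make
  order by c.make;
quit;
```

Each row in dup_rows carries the count for its make, so you can sort or filter on it later.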
3) Efficiency freak - PROC FREQ helps during conditional processing
This is when things get really freaky!
You know it's more efficient to check values in order of decreasing frequency. Here is code to help you figure that out.
proc freq data=sashelp.cars order=freq;
  tables make;
run;
Now it's super easy to stack Toyota first in your conditional processing algorithm. You didn't guess that this was the most frequent value; you actually got PROC FREQ to list the values in order of decreasing frequency with the order=freq option.
data makes;
  keep make maker;
  set sashelp.cars;
  if make='Toyota' then maker='Japanese';
  else if make='Hummer' then maker='American';
run;
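To confirm that the conditional logic did what you expected, one option is to cross-tabulate the new variable against the original. The sketch below repeats the data step so it runs on its own. The explicit LENGTH statement and the MISSING and LIST options are my additions. Every make not covered by the two conditions should show up with a missing maker.

```sas
/* Rebuild the MAKES table, most frequent make checked first. */
data makes;
  keep make maker;
  set sashelp.cars;
  length maker $ 8;   /* explicit length (my addition) so longer labels added later are a deliberate choice */
  if make='Toyota' then maker='Japanese';
  else if make='Hummer' then maker='American';
run;

/* List every maker/make combination, including rows where maker is missing. */
proc freq data=makes;
  tables maker*make / missing list;
run;
```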
So, there you go. Three ways, lots of efficiency learning and a very happy Halloween from this ghoul to you! How else do you use PROC FREQ? Let me know in the comments below.
P.S. Did I scare you with the spider web on my cheek? If you are not scared, what else do I need to do? Until next time!
6 Comments
Thanks for posting
I especially like this one on the dups
thanks Anesh, glad you appreciated the tip.
I also use PROC FREQ for data validation in some different ways. For example, if I am creating a binary for regression analysis, I like to see exactly the scenario(s) that lead to that binary being 1 versus 0 - and whether there are any instances that might lead to unintentional missing values in my binary. In the case of an originating variable that is character, case may also matter in the creation of a binary.
Suppose a data set named incoming has a variable gender with six values: "M", "m", "F", "f", " " and "U".
A simple freq could inform your binary variable creation, but for the purposes of demonstrating my point I've outlined a few scenarios.
data test;
set incoming;
b_female=(gender="F");
label b_female="Binary: Gender is Female";
if gender ne "U" then b_male=(gender="M");
label b_male="Binary: Gender is Male";
if gender not in("U"," ") then b_male2=(upcase(gender)="M");
label b_male2="Binary: Gender is Male";
run;
proc freq data=test;
tables gender b_female b_male b_male2
b_female*gender
b_male*gender
b_male2*gender
/ missing list;
title2 "check binary creation";
run;
PROC FREQ crosstabs are wonderful tools to inform variable creation!
OH!! i love your fantastic way of combining cross-tabs with Binary variables Louise. Hope to see you soon again
Awesome tips, Charu. I find the ORDER=FREQ option useful in many situations. Two of my own FREQ-y tips include
Create a "Top 10" or "Top 20" list of the most frequent categories.
Use the PLOTS= option to get PROC FREQ to automatically display graphs of the one-way or two-way analyses. For example, PROC FREQ can create graphs of two-way tables.
thanks Rick, Love your "Top 10 list" tip.