Suppose you wish to select a random sample from a large SAS dataset. No problem. The PROC SURVEYSELECT step below randomly selects a 2 percent sample: proc surveyselect data=large out=sample method=srs /* simple random sample */ n=1000000; /* sample size */ run; Do you have a SAS/STAT license? If not,
Tag: tips and tricks
I think everyone can agree that being able to debug programs is an important skill for SAS programmers. That’s why Susan Slaughter and I devoted a whole chapter to it in The Little SAS® Book. I don’t know about you, but I think figuring out what’s wrong with my program
This post is the third and final in a series that illustrates three different solutions to "flattening" hierarchical data. Don't forget to catch up with Part 1 and Part 2. Solution 2, from my previous post, created one observation per header record, with detail data in a wide format, like
This post is the second in a series that illustrates three different solutions to "flattening" hierarchical data. Solution 1, from my previous post, created one observation per header record, summarizing the detail data with a COUNT variable, like this: Summary Approach: One observation per header record Obs Family Count
A family and its members represent a simple hierarchy. For example, the Jones family has four members: A text file might represent this hierarchy with family records followed by family members' records, like this: The PROC FORMAT step below defines the codes in Column 1: proc format; value $type
I'm gearing up to teach the next "DS2 Programming Essentials with Hadoop" class, and thinking about Warp Speed DATA Steps with DS2 where I first demonstrated parallel processing using threads in base SAS. But how about DATA step processing at maximum warp? For that, we'll need a massively parallel processing
Have you ever waited a bit for SAS Enterprise Guide to display the Output Data tab when submitting a SAS program that generates multiple output tables? Or, perhaps your program only generates one big output table but it takes a little while for it to surface on the Output Data
When reading a text file (common extensions: TXT, DAT; or, for the adventurous: HTML) with the DATA step, you should always view several lines from the text file and compare them to the record layout before completing the INPUT statement. There are many ways to view a text file. I use
Default PROC FREQ output looks like this: Suppose you don't want the two cumulative statistic columns above. No problem. Those can be suppressed with the NOCUM option on the TABLE statement, like this: proc freq data=sashelp.shoes; table product / nocum; run;
I recently taught a SAS training course where the students were very engaged. They had so many questions that I could have spent the next month writing helpful blog posts that came from that one class. However, I picked this one question that the class begged me to share. The
SAS software is used around the world in some of the most sophisticated ways, like ATM fraud detection and cancer research. But recently, I used it for a practical, and much needed, task -- replacing our break room coffee machine. Now, this is no ordinary coffee machine. It also makes
Dataset too big for PROC PRINT? One weird trick solves your problem! proc print data=bigdata (obs=10); run; The OBS= dataset option specifies the last observation to process from an input dataset. In the above example, regardless of dataset size, only the first 10 observations are printed; an easy way to
I remember the first time I was faced with the challenge of parallelizing a DATA step process. It was 2001 and SAS V8.1 was shiny and new. We were processing very large data sets, and the computations performed on each record were quite complex. The processing was crawling along on
With any software program, there are always new tips and tricks to learn, and nobody can know them all. Sometimes I even pick up tips or techniques from my students while they’re learning broader programming tips from me. Like fine wine, instructors only get better with age. Every customer interaction
With Pi Day coming up on 3/14, I wanted to make sure all you SAS programmers know how to use the pi constant in your SAS code... All you have to do is use constant("pi") in a DATA step, and you've got the value of pi out to a good many decimal places
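As a minimal sketch of that tip (the variable names radius and area are hypothetical, chosen only for illustration), a DATA step can call the CONSTANT function and use the result like any other numeric value:

```sas
/* Sketch only: radius and area are illustrative names, not from the original post */
data _null_;
    pi = constant('pi');        /* CONSTANT function returns pi as a numeric value */
    put pi= best32.;            /* write the value to the SAS log */

    radius = 2;                 /* hypothetical input */
    area = pi * radius**2;      /* area of a circle with that radius */
    put area=;
run;
```

Using DATA _NULL_ keeps the example from creating an output data set; the values are written to the log only.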
While perusing the SAS 9.4 DS2 documentation, I ran across the section on the HTTP package. This intrigued me because, as DS2 has no text file handling statements, I assumed all hope of leveraging Internet-based APIs was lost. But even a Jedi is wrong now and then! And what better
A student in a SAS class recently asked whether there was a way to eliminate data error notes from the SAS log and, instead, write them to a separate file. Of course there's a way! Here's a simple DATA step. Notice the missing dollar sign to indicate the variable GENDER (M,
A student brought in this coding problem after her manager had been struggling with it for a while. They played guessing games, but to no avail. Here’s what happened when they submitted DATA step and PROC SQL code using a WHERE clause with an INPUT function: data aileen; length hcn
When teaching statistics, it is often useful to produce a normal density plot with shading under the curve. For example, consider a one-sided hypothesis test. An alpha value of .05 would correspond to a Z-score cutoff of 1.645. This means that 95% of a standard normal curve falls below a
This SAS tutorial video will show you how to generate plots for two continuous numeric variables with Base SAS. Basic scatter plots, linear or curvilinear regression lines, confidence intervals or ellipses, and multiple plot overlays are demonstrated. To learn more about this topic, check out our SAS Programming 1: Essentials
In this tutorial video, you will learn to print a simple listing with Base SAS. You see how to write a PRINT procedure step to display a SAS data set. You also see how to use statements and options to subset observations and variables and enhance the report. Learn
These two tutorial videos will show you how to filter and sort data in Base SAS. In this first video, you will learn to use a WHERE statement in Base SAS to filter or subset SAS data. Data sets can be very large and filtering data enables you to select
Michele Ensor recently posted a wonderful blog with a graph of the 2014 Winter Olympics medal count. I'm going to further refine that graph, making it an Olympic graph ... on steroids! :) Here is Michele's graph: First, let's give it a few simple cosmetic changes. I always like to have
Sure, you have a great looking table and you produce it with PROC TABULATE. And then, bam! Your boss comes along and decides that, since your output looks so good in Word, he’d like that boilerplate paragraph inserted automatically. Currently, you produce the tables and then pass the RTF
“Dear Cat, In a repeated measures drug study, I am unsure what to do with the baseline measurement. Since it is one of the time points in my study, I feel like I should use it as one of the dependent variable measurements. But I have seen analyses where baseline
SAS 9.4 allows you to create HTML5 output with your graph inline (as part of the HTML), providing a great way to email your SAS/GRAPH output! Previously, if you used ODS HTML and dev=png to create graphs, you had to deal with two files -- a PNG file (containing the graph)
ODS graph styles provide users with an easy way to control things such as the colors and fonts used in a graph, freeing the user from having to specify these properties in their code. A lot of thought was given to picking colors that work well together, and look good. The
To say that I'm excited about the SAS 9.4 release is an understatement! For example, did you know that in SAS 9.4, you can write SAS/GRAPH output directly to a PowerPoint slide?!? This is definitely an item that was on my "wish list," and will no doubt make life a
Over the holidays I was having a discussion with my cat, Ms. Trixie Lou. A question that often arises during the first programming class is the following: how do I find the variables that are in common to these two or three data sets? As it turns out, Ms. Trixie
When working with "big data" you usually have too many points to view in a plot, and end up subsetting or summarizing the data. But now, in SAS 9.3, you have an alternative! For example, the following scatter plot of 10,000+ points is just a visual "blob": But using a new