Suppose you wish to select a random sample from a large SAS dataset. No problem. The PROC SURVEYSELECT step below randomly selects a 2 percent sample: proc surveyselect data=large out=sample method=srs /* simple random sample */ n=1000000; /* sample size */ run; Do you have a SAS/STAT license? If not,
Tag: tips and tricks
I think everyone can agree that being able to debug programs is an important skill for SAS programmers. That’s why Susan Slaughter and I devoted a whole chapter to it in The Little SAS® Book. I don’t know about you, but I think figuring out what’s wrong with my program
This post is the third and final in a series that illustrates three different solutions to "flattening" hierarchical data. Don't forget to catch up with Part 1 and Part 2. Solution 2, from my previous post, created one observation per header record, with detail data in a wide format, like
This post is the second in a series that illustrates three different solutions to "flattening" hierarchical data. Solution 1, from my previous post, created one observation per header record, summarizing the detail data with a COUNT variable, like this: Summary Approach: One observation per header record Obs Family Count
A family and its members represent a simple hierarchy. For example, the Jones family has four members: A text file might represent this hierarchy with family records followed by family members' records, like this: The PROC FORMAT step below defines the codes in Column 1: proc format; value $type
I'm gearing up to teach the next "DS2 Programming Essentials with Hadoop" class, and thinking about Warp Speed DATA Steps with DS2 where I first demonstrated parallel processing using threads in base SAS. But how about DATA step processing at maximum warp? For that, we'll need a massively parallel processing
Have you ever waited a bit for SAS Enterprise Guide to display the Output Data tab when submitting a SAS program that generates multiple output tables? Or, perhaps your program only generates one big output table but it takes a little while for it to surface on the Output Data
When reading a text file (common extensions: TXT, DAT; or, for the adventurous: HTML) with the DATA step, you should always view several lines from the text file, and compare them to the record layout, before completing the INPUT statement. There are many ways to view a text file. I use
Default PROC FREQ output looks like this: Suppose you don't want the two cumulative statistic columns above. No problem. Those can be suppressed with the NOCUM option on the TABLE statement, like this: proc freq data=sashelp.shoes; table product / nocum; run;
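A minimal sketch along the same lines, assuming you also want to drop the Percent column: NOPERCENT is another TABLES statement option and can be combined with NOCUM, leaving only the frequency counts. The data set and variable simply mirror the example above.

```sas
proc freq data=sashelp.shoes;
   /* NOCUM drops the two cumulative columns; NOPERCENT drops the Percent column */
   tables product / nocum nopercent;
run;
```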
I recently taught a SAS training course where the students were very engaged. They had so many questions, I could have spent the next month writing helpful blog posts that came from that one class. However, I picked this one question that the class begged for me to share. The
SAS software is used around the world in some of the most sophisticated ways, like ATM fraud detection and cancer research. But recently, I used it for a practical, and much needed, task -- replacing our break room coffee machine. Now, this is no ordinary coffee machine. It also makes
Dataset too big for PROC PRINT? One weird trick solves your problem! proc print data=bigdata (obs=10); run; The OBS= dataset option specifies the last observation to process from an input dataset. In the above example, regardless of dataset size, only the first 10 observations are printed; an easy way to
I remember the first time I was faced with the challenge of parallelizing a DATA step process. It was 2001 and SAS V8.1 was shiny and new. We were processing very large data sets, and the computations performed on each record were quite complex. The processing was crawling along on
With any software program, there are always new tips and tricks to learn, and nobody can know them all. Sometimes I even pick up tips or techniques from my students while they’re learning broader programming tips from me. Like fine wine, instructors only get better with age. Every customer interaction
With Pi Day coming up on 3/14, I wanted to make sure all you SAS programmers know how to use the pi constant in your SAS code... All you have to do is use constant("pi") in a data step, and you've got the value of pi out to a good many decimal places
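As a rough sketch of that call, the step below writes pi and a circle's area to the log. The radius value, the variable names, and the formats are illustrative assumptions, not taken from the original post.

```sas
data _null_;                    /* _NULL_: run the step without creating a data set */
   pi = constant("pi");         /* built-in constant, no hand-typed digits needed */
   radius = 2;                  /* hypothetical radius, for illustration only */
   area = pi * radius**2;       /* area of a circle */
   put pi= 18.16 area= 10.4;    /* write both values to the SAS log */
run;
```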
While perusing the SAS 9.4 DS2 documentation, I ran across the section on the HTTP package. This intrigued me because, as DS2 has no text file handling statements, I assumed all hope of leveraging Internet-based APIs was lost. But even a Jedi is wrong now and then! And what better
December is all about traditions. Some of mine include holiday shopping, baking (I really mean eating) Christmas cookies and putting together my annual list of most read blogs on the SAS Training Post. So as traditions go… here’s my list of the top 10 most read blogs in 2014. How
A student in a SAS class recently asked if there were a way to eliminate data error notes from the SAS log and, instead, write them to a separate file. Of course there's a way! Here's a simple DATA step. Notice the missing dollar sign to indicate the variable GENDER (M,
A student brought in this coding problem after her manager had been struggling with this issue for a while. They played guessing games, but to no avail. Here’s what happened when they submitted DATA step and PROC SQL code using a WHERE clause with an INPUT function: data aileen; length hcn
In the first Star Wars movie, Obi-wan uses Jedi mind tricks to convince the stormtroopers that the droids they see are not the droids they're looking for. A colleague at SAS passed along a question from a SAS user where the column labels they were seeing were NOT the labels
When teaching statistics, it is often useful to produce a normal density plot with shading under the curve. For example, consider a one-sided hypothesis test. An alpha value of .05 would correspond to a Z-score cutoff of 1.645. This means that 95% of a standard normal curve falls below a
This SAS tutorial video will show you how to generate plots for two continuous numeric variables with Base SAS. Basic scatter plots, linear or curvilinear regression lines, confidence intervals or ellipses, and multiple plot overlays are demonstrated. To learn more about this topic, check out our SAS Programming 1: Essentials
In this tutorial video, you will learn to print a simple listing with Base SAS. You see how to write a PRINT procedure step to display a SAS data set. You also see how to use statements and options to subset observations and variables and enhance the report. Learn
These two tutorial videos will show you how to filter and sort data in Base SAS. In this first video, you will learn to use a WHERE statement in Base SAS to filter or subset SAS data. Data sets can be very large and filtering data enables you to select
In this video, you learn how to use a SET statement in the DATA step to read a SAS data set using Base SAS. The DATA step is a very powerful tool for data manipulation. You can watch more video tutorials by visiting: support.sas.com/training/tutorial/. To learn more about the SAS
In my younger years, I enjoyed celebrating St. Patrick’s Day in Savannah, GA. Did you know the city dyes the fountain water green and has a parade that attracts over 400,000 people? It is now time for me to find additional cities to celebrate the holiday. Based on the DriveTheNation
In this tutorial video, you will learn how to read comma-separated-value (CSV) files with Base SAS using a DATA step. This enables you to create a SAS data set copy that transforms each field in the raw data file into a variable in the data set. Watch and learn how...
Michele Ensor recently posted a wonderful blog with a graph of the 2014 Winter Olympics medal count. I'm going to further refine that graph, making it an Olympic graph ... on steroids! :) Here is Michele's graph: First, let's give it a few simple cosmetic changes. I always like to have
Here’s my latest tip on how to apply conditional highlighting to a SAS Enterprise Guide report using the Summary Tables task. The Summary Tables task is a great way to point and click your way to creating simple or complex reports. Conditional highlighting is just one additional feature you can
Sure, you have a great-looking table and you produce it with PROC TABULATE. And then, bam! Your boss comes along and decides that, since your output looks so good in Word, he’d like that boilerplate paragraph inserted automatically. Currently, you produce the tables and then pass the RTF