Solving the analyst’s toughest problems with SAS

If you’re an analyst, you know discovery in a complicated data set is one of the toughest problems to solve.

But did you know the Business Knowledge Series course, Exploratory Analysis for Large and Complex Problems Using SAS Enterprise Miner, can help you solve those issues by tackling real-world problems?

I interviewed instructor Jeff Zeanah at last year’s Analytics Conference in Orlando about the benefits of taking this course.

 


Why learn SAS instead of Excel? ... Medicare data

Would you like to analyze the 9 million lines of Medicare payment data that was recently made public? You'll need lots of luck if you're planning to use Excel ... whereas this is the kind of thing SAS was built for!

Their data download page warns that trying to import the raw data into Excel "will result in an incomplete loading of data" and they provide the data split into 12 separate Excel spreadsheets. By comparison, I downloaded the raw text data, easily imported it into a single SAS data set, and analyzed it ... using a simple laptop PC. It's good to be a SAS user! :)


Now for some analytics...

I'm a big fan of using analytics to help detect (or better yet, to deter) waste and fraud, and that's why I was excited to hear that the Medicare payment data was now available. Our local news mentioned that "344 out of more than 825,000 doctors, received $3 million or more apiece — a threshold that raises eyebrows for the government's own investigators." This was an interesting number, therefore I decided to try to crunch & summarize the raw data, and see if I could come up with the same numbers.

I won't bore you with all the coding details (click here if you'd like to see the code). Basically, I used the code they provided to import the data, a data step and a SQL query to summarize it, and then plotted the results with Proc Gplot.
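For readers who want a feel for the summarizing step, here is a minimal sketch, not the actual code behind the graph. The data set name, the variable names (npi, line_srvc_cnt, avg_payment_amt), the toy values, and the assumption that a line's payment is the service count times the average payment per service are all made up for illustration:

```sas
/* Toy stand-in for the imported Medicare data.                */
/* Data set name, variable names, and values are hypothetical. */
data work.medicare_lines;
  input npi :$10. line_srvc_cnt avg_payment_amt;
  datalines;
1000000001 20000 120.50
1000000001 15000 95.00
1000000002 300 80.25
1000000003 40000 60.00
1000000003 10000 75.00
;
run;

/* Data step: assumed payment per line = services x average payment */
data work.line_totals;
  set work.medicare_lines;
  line_payment = line_srvc_cnt * avg_payment_amt;
run;

proc sql;
  /* Total payments per provider, largest first */
  create table work.provider_totals as
  select npi,
         sum(line_payment) as total_payment format=dollar16.2
  from work.line_totals
  group by npi
  order by total_payment desc;

  /* Count the providers at or above the $3 million threshold */
  select count(*) as providers_over_3m
  from work.provider_totals
  where total_payment >= 3000000;
quit;
```

With the real data, only the first step changes: the toy data step is replaced by the import code supplied with the download.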

Here's a snapshot image of the graph, with the 344 red markers showing the individual providers receiving $3 million or more in Medicare payments. Click here to see the interactive plot, with hover-text on the red markers so you can see their names (the interactive plot is followed by a table with details about the 344 names):

medicare_payments graph, with hover-text

 Now that the basics are out of the way, what analytics would you like to see performed on this data, to look for suspicious payments and such? (Let us know in a comment!)


Cool SAS infographic, with custom 3-D effect

Here's an example that shows SAS is more than just a tool to create analytic graphics - it can also be used to create 'cute' infographics! :)

An infographic that recently caught my eye had a fleet of airplanes stretching from overhead out to the horizon, along with some interesting info about empty seats on airplanes. I decided to create my own version, using SAS.

I created a radiant color-gradient sky/background by annotating concentric pies of slightly different sizes and slightly different shades of blue. The clouds are globs of solid white pies of random sizes, with the text annotated on them. I display these annotated things on a blank Proc Gslide.

Then the tricky bit was the 3-D fleet of airplanes. I created a rectangular grid of airplanes by looping through the desired locations on the screen, and annotating a text label containing a font character shaped like an airplane. I mathematically size the airplanes so that the ones near the bottom of the screen are smaller (so they'll look farther away). And to get the 3-D perspective angle, I created a custom (non-rectangular) greplay template that "squeezed" the airplanes in the grid closer together at the bottom.
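The looping-and-sizing idea can be sketched roughly like this. It is a simplified illustration, not the code used for the infographic: the grid dimensions, spacing, size formula, and color are arbitrary assumptions, and it assumes the custom 'figures' font (with the airplane stored as character 'A') described in this post is available through the gfont0 libref. The greplay template step is not shown.

```sas
/* Build an annotate data set with one text label per airplane. */
/* Grid size, spacing, and the size formula are assumptions.    */
data work.anno_planes;
  length function $8 style $20 color $12 text $1;
  xsys = '3'; ysys = '3'; hsys = '3';  /* percent of the graphics area */
  when = 'a';
  function = 'label';
  position = '5';                      /* center the character on (x,y) */
  style = 'figures';                   /* custom font (assumed to exist) */
  text = 'A';                          /* the airplane character */
  color = 'gray33';
  do row = 1 to 8;
    y = 10 + (row - 1) * 10;           /* row 1 is nearest the bottom */
    size = 2 + (row - 1) * 0.6;        /* smaller near the bottom */
    do col = 1 to 12;
      x = 6 + (col - 1) * 8;
      output;
    end;
  end;
run;

/* Draw the annotated airplanes on a blank slide */
proc gslide anno=work.anno_planes;
run;
quit;
```

If the font cannot be found, SAS/GRAPH substitutes a default font, so you would see a grid of letter A's rather than airplanes.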

Here's how it turned out:

empty_plane_seats

Note that the non-rectangular greplay template trick does not work with non-SAS/Graph font characters (such as the airplane character in the Webdings font), so I had to create my own custom SAS/Graph software font containing an airplane. To do that, I worked out the coordinates for the desired airplane shape on graph paper:

empty_plane_font

And then I turned those coordinates into a SAS/Graph software font using the following code:

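/* User-created fonts are stored in (and found through) the GFONT0 libref */
LIBNAME gfont0 ".";

/* One row per point of the character outline */
data figures;
input char $ ptype $ x y segment lp $;
datalines;
A W 100 200 0 P
A V 85 75 1 P
A V 85 85 1 P
(and so on...)
;
run;

/* Build the filled font and display its characters */
proc gfont data=figures name=figures
filled height=.75in showroman romht=.5in resol=4;
run;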

I think the results turned out pretty nice. Does this give you any ideas about custom font characters, or trick/3-D effects you might want to use in your infographic? Feel free to share your ideas in a comment...


SAS knowledge that's music to your ears

We are all busy. Between demands from our careers, families and communities, I am willing to bet we easily fill the 24 hours we get each day. But sometimes we're in such a rush or so focused on getting our to-do list checked off that we miss out on some wonderful opportunities.

That's exactly what happened one January morning to more than a thousand people in a Washington DC metro station. That morning, the Washington Post conducted an experiment with Joshua Bell, one of the world's best classical violinists. Bell sat by the entrance of the station and for the next 45 minutes played a half-dozen classical pieces. During that time 1,097 people passed by. Typically Bell's concerts sell out in minutes and average more than $100 a seat. So, how many people stopped to appreciate what many would regard as the most elegantly played music in the world? A grand total of six.

Now, granted, most of these folks were on the way to work, which probably meant they were in a hurry, but that's the point. Sometimes we get so caught up in our routines that we miss opportunities that could change or enhance our lives.

As a SAS professional, you are a valuable asset to your company. To succeed and advance in your chosen profession, however, takes hard work, initiative and a dedication to improve your skills. At SAS, we provide a number of opportunities for you to tend to your professional growth by acquiring new SAS skills. And we provide learning opportunities that address a multitude of different learning styles.

If you like the traditional classroom, we offer face-to-face training in more than 30 training centers across the country and a full slate of classes online through our Live Web Classroom. If you learn best through self-study, we offer a host of self-paced e-learning courses and hundreds of books covering every area of SAS technology. And buying SAS books just got easier. In March, we launched the SAS store, where you can now purchase SAS books in both print and e-book formats online (U.S. only at this time). Ground shipping is absolutely free and you can find example code and data for books purchased from the store.

And unlike a world-renowned classical musician, SAS Training and Books are always available, so you don't have to miss out on a great opportunity to advance your SAS learning.


SAS tutorial: Print a simple listing with SAS

In this tutorial video, you will learn to print a simple listing with Base SAS.

You see how to write a PRINT procedure step to display a SAS data set. You also see how to use statements and options to subset observations and variables and enhance the report.
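As a quick taste of what such a step looks like, here is a small sketch using the sashelp.class sample data set that ships with SAS. The choice of variables, the filter, the title, and the label are just illustrative assumptions, not taken from the video:

```sas
title 'Students Aged 13 and Older';

/* NOOBS drops the Obs column; LABEL prints labels as column headings */
proc print data=sashelp.class noobs label;
  var name sex age height;          /* subset and order the variables */
  where age >= 13;                  /* subset the observations */
  label height = 'Height (in)';     /* enhance the report */
run;

title;  /* clear the title */
```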

 

Learn more about the topic in this video in the SAS Programming 1: Essentials course.

For more free video tips on programming and analytics functions, visit SAS Tutorials.


SAS tutorials: Create basic summary and frequency reports

In this SAS tutorial video, learn how to create basic summary reports with descriptive statistics using the MEANS procedure. You will learn how to control the statistics that appear in your reports, and also how to do grouping using classification variables.
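A minimal sketch of that kind of report, using the sashelp.class sample data (the statistics and variables chosen here are only an example, not the ones from the video):

```sas
/* Request specific statistics and round the output to one decimal */
proc means data=sashelp.class n mean min max maxdec=1;
  class sex;               /* one group of statistics per value of sex */
  var height weight;       /* analysis variables */
run;
```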

 

In this SAS tutorial video, learn how to create one-way and two-way frequency reports using PROC FREQ. You will also learn how to suppress statistics that you don't want to appear in your tables and how to make other modifications to the tables.
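For illustration, here is a short sketch with the sashelp.cars sample data. The variables and the particular statistics being suppressed are my own choices for the example:

```sas
proc freq data=sashelp.cars;
  /* One-way table, without the cumulative columns */
  tables origin / nocum;

  /* Two-way table, keeping only the cell counts */
  tables origin*type / nopercent norow nocol;
run;
```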

 

To learn more about this topic, check out our SAS Programming 1: Essentials training course.

For additional, free SAS tutorials, visit our Video Portal.


SAS tutorial: Creating a new variable in SAS

This SAS tutorial video will show you how to create a new variable with Base SAS.

During the step-by-step video, you will see exactly how to create and modify numeric and character variables using assignment statements in a DATA step.
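Here is a small sketch of the idea using the sashelp.class sample data. The new variable names and formulas are invented for the example, and the BMI formula assumes height in inches and weight in pounds:

```sas
data work.class_new;
  set sashelp.class;
  length initial $1;                    /* declare the new character variable */
  bmi = (weight / height**2) * 703;     /* create a numeric variable */
  initial = substr(name, 1, 1);         /* create a character variable */
  name = upcase(name);                  /* modify an existing variable */
run;

proc print data=work.class_new noobs;
run;
```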

Watch and learn…


To learn more about the steps in this video, check out our SAS Programming 1: Essentials course.

We have many more tutorial videos on SAS programming and analytics functions, as well as learning SAS Studio and SAS Visual Analytics. Go to SAS Tutorials for more.


How do your favorite TV episodes stack up ... in a graph?

Have you ever wanted to visualize the IMDb ratings for your favorite TV show? Here's how you can do that with SAS software!

The Internet Movie Database (IMDb) has a huge amount of data about movies, TV shows, etc. One of the things they track is the user-ratings for individual episodes of TV shows. I recently saw a great Web site (graphtv.kevinformatics.com) that lets you enter a TV show name, and it plots the ratings of every episode of that show. Try it out, and I'm sure you'll agree it's pretty neat!

But being SAS users, I'm sure you'd like to see the IMDb data plotted using SAS software! So I've created three examples that demonstrate exactly that, each plotting several shows in a category. The three categories are: famous shows, oldies, and Star Trek.

Below are snapshot images from each of the three categories. You'll want to click the snapshots to see the interactive version, where the plot markers have hover-text and drill-down links.

Famous:

Oldies:

Star Trek:

Do you agree with the ratings in the IMDb data? Were your favorite (and least favorite) episodes of these shows rated accordingly? Any 'surprises' in the data?

Hopefully I picked some of your favorite TV shows to include in my graphs. And if you've got other shows you'd like to see, download the ratings.list plain text data file from IMDb, and use my code to create your own plots of any show you'd like!


SAS tutorial: How to take a random sample of data using SAS

For today's analytics SAS tutorial video, Marc Huber (of Stat Wars fame) will teach you how to take a random sample using PROC SURVEYSELECT in SAS. You will learn syntax for taking both a random sample and a stratified random sample.
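To give a rough idea of the syntax, here is a sketch using the sashelp.cars sample data. The sample sizes, seed, and stratification variable are arbitrary choices for the example, not the ones used in the video:

```sas
/* Simple random sample of 50 cars; SEED= makes the sample repeatable */
proc surveyselect data=sashelp.cars out=work.cars_srs
                  method=srs n=50 seed=12345;
run;

/* The STRATA statement expects the input sorted by the strata variable */
proc sort data=sashelp.cars out=work.cars_sorted;
  by origin;
run;

/* Stratified random sample: 10 cars from each region of origin */
proc surveyselect data=work.cars_sorted out=work.cars_strat
                  method=srs n=10 seed=12345;
  strata origin;
run;
```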

 

You can watch more SAS tutorials like these by visiting our SAS Tutorial Video Portal.

To learn more about this topic, check out our training course: Probability Surveys 1: Design, Descriptive Statistics and Analysis


How could SAS be used to help find missing planes?

I'm sure that anyone in the world would do whatever they could to help find the missing Flight 370 ... and this is my attempt to do what I can.

In my previous blog post, I showed how SAS could be used to visualize the locations where planes have disappeared. This blog takes it a step further, and shows how the specific data used in the search for missing Flight 370 could be analyzed by SAS and plotted on a map, all in a data-driven way.

I don't have the actual data, so I have estimated it by looking at various maps in the news for this proof-of-concept. The code is all data-driven, based on latitude/longitude coordinates, so it could easily be re-run with the real data. If you're a SAS user out there helping with this search, I'll be happy to provide you with the code!

Here is a snapshot of my map - you can click on it to see the interactive version with hover-text and drill down capability. Following the map, I explain what SAS techniques I used to create the map:

I create my base map by starting with the world map that ships with SAS/GRAPH, projecting it using a cylindrical projection, and then clipping out just the area of interest. I add hover-text to each country, so you can hover your mouse over them to see the country names. I then use annotate to overlay a grid of dashed lines representing the latitudes and longitudes, and add a label to the end of each grid line.

For the 'data' part of the map, I start with the known data. I plot markers at the airport where the plane took off and the airport where it was supposed to land. I add hover-text with the airport names, and you can click them to drill down and see a Google satellite map of the airports. Similarly, I draw a blue line following the known path that the airplane took, and plot a marker at the last known location. I add hover-text to that marker, and when you click on it, it drills down to a diagram showing details about the known portion of the flight.

I then plot the location of the Inmarsat satellite in geosynchronous orbit above longitude 64-East, which collected the final 'ping' data from the plane. I use character '6b'x of the Webdings font for the satellite image, and add hover-text and drill down to a page with more information about the satellite. I draw a line from the satellite to the last known location of the plane, and then symmetrical arcs showing the possible location of the plane when it made its final ping.

Next I plot data about the unknowns. I plot some possible flight paths, using latitude/longitude coordinates along the path, and add a marker with hover-text at the end of each path. I use the annotate poly/polycont function to draw the polygons for the possible debris fields. I use alpha-transparent color for these, so they will not obscure any map details that might be below them. I add hover-text with some information about the debris fields, and when you click on them you see a visualization of the pieces of debris (remember that this is just a proof-of-concept, and I don't have the coordinates of real debris!). Here's an image of my simulated debris field:

In addition to visualizing the debris, we could also use SAS to analyze it in various ways. For example, we could calculate the "optimal tour" to visit all the pieces of debris using the TSP statement in SAS/OR's Proc OptNet, as described in one of my previous blog posts. This would hypothetically tell us what order to visit all the pieces of debris, while traveling the shortest distance (which would hopefully also be the shortest time).
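As a rough sketch of what that might look like (assuming SAS/OR is licensed), the steps are: list the debris locations, compute the distance between every pair, and hand those links to Proc OptNet. The debris coordinates below are invented purely for illustration, and the data set and variable names are my own assumptions:

```sas
/* Made-up debris locations (id, latitude, longitude in degrees) */
data work.debris;
  input id lat long;
  datalines;
1 -44.1 90.2
2 -44.6 91.0
3 -43.8 90.7
4 -45.0 90.4
5 -44.3 91.5
;
run;

/* One link per pair of pieces, weighted by distance in kilometers */
proc sql;
  create table work.links as
  select a.id as from_id,
         b.id as to_id,
         geodist(a.lat, a.long, b.lat, b.long, 'K') as dist_km
  from work.debris as a, work.debris as b
  where a.id < b.id;
quit;

/* Solve the traveling salesman problem over the debris pieces */
proc optnet data_links=work.links;
  data_links_var from=from_id to=to_id weight=dist_km;
  tsp out=work.tour;
run;

/* The links that make up the tour */
proc print data=work.tour noobs;
run;
```

Great-circle distance is used here as a stand-in for travel effort; real search planning would have to account for currents, fuel, and the starting port.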

 Now it's your turn - perhaps you can think of something in your area of expertise where SAS software could be used to help analyze the data. If you have an idea you'd like to share, feel free to leave it in a comment!

 
