Tracking Ebola: Layering customized SAS maps

There are many ways to use SAS in Health & Life Sciences, and one of my favorites is using it to track the spread of diseases. This post demonstrates how to layer several customized maps to track the recent Ebola outbreak in Africa.

For those of you who are impatient and want to "cut to the chase" here's my final map. It is a SAS version of a map that appeared in a recent article on the website. Click the thumbnail below to see the full size interactive version of my map with html hover-text.



And here are the technical details of how I created this map...

I used the new mapsgfk maps that we started shipping in SAS 9.4 for this example. I started with the mapsgfk.Africa continent map, and used Proc Gproject to chop out just the rectangular area, based on latitude/longitude values, that was in the original article. (This is a technique I learned from Mike Zdeb's wonderful book Maps Made Easy using SAS.)

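/* Clip the continent map to the lat/long rectangle used in the original article */
proc gproject data=mapsgfk.africa out=my_map latlong eastlong degrees
 latmax=14.0 latmin=0
 longmin=-20 longmax=-3.2;
id id;
run;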

I created some 'map data' so that the affected countries mapped to the light yellow color, and left the other countries gray (using 'cdefault=grayE1').
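Here's a minimal sketch of what that 'map data' step could look like. It assumes my_map is the clipped map from the Proc Gproject step, and that its id variable holds two-letter country codes; the three codes listed and the hex color are placeholders I picked for illustration, not values taken from the original program.

```sas
/* Hypothetical response data: one row per affected country.          */
/* Building it from my_map keeps the id type and length consistent.   */
/* The country codes below are assumptions, not from the original.    */
proc sql;
  create table my_data as
  select unique id, 1 as affected
  from my_map
  where id in ('GN', 'LR', 'SL');
quit;

/* Light yellow fill for the single response level (assumed color) */
pattern1 value=solid color=cxFFFFB2;

/* ALL draws every map area, even those with no row in my_data;    */
/* CDEFAULT= sets the fill color of those unmatched areas.         */
proc gmap data=my_data map=my_map all;
  id id;
  choro affected / discrete nolegend cdefault=grayE1 coutline=gray77;
run;
quit;
```

Countries without a row in my_data fall through to the cdefault color, which is why only the affected ones need to be listed.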


That was the easy/straightforward part :) Now I need to add the individual country maps for Guinea and Liberia. I was able to use mapsgfk.Guinea as-is, but mapsgfk.Liberia's areas were one level more granular than I needed (like having US counties when you want states), so I had to use Proc Gremove to group those lower-level areas into the desired areas. Once I had created these two maps, I combined them with the previous map, layering them such that these two new maps come last in the dataset, so they will show up 'on top'.
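The grouping step might look something like the sketch below. It assumes mapsgfk.Liberia carries the finer areas in id and the coarser areas in id1 (check the dataset before relying on that), and the output dataset names are made up for this example.

```sas
/* Assumption: id = lower-level areas, id1 = the higher-level areas wanted */
/* Proc Gremove expects the data grouped by the BY variable.               */
proc sort data=mapsgfk.liberia out=liberia_sorted;
  by id1;
run;

/* Remove the internal borders between the id areas inside each id1 area */
proc gremove data=liberia_sorted out=liberia_grouped;
  by id1;   /* the new, larger areas               */
  id id;    /* the current, smaller areas to merge */
run;

/* To layer maps, stack the datasets with the detailed maps last, */
/* since areas that appear later in the dataset draw on top.      */
```

Before stacking, all of the layers need to be in the same coordinate system and share an id variable of the same type and length; that part is left out of this sketch.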


We've now got a map with the countries and areas together, but it's difficult to determine exactly which country the smaller areas are a part of. Therefore I take the country borders, and create an annotate dataset to draw a dark outline around each country using the poly/polycont annotate commands. This is an improvement over the original map, by the way!
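As a rough sketch, an annotate dataset for the outlines can be built straight from the border coordinates. This assumes the country borders are in my_map with the usual id, segment, x and y variables; the color and line width are arbitrary choices for illustration.

```sas
/* Turn each country border into an annotate polygon outline.    */
/* Assumes my_map has id, segment, x and y (projected) columns.  */
data anno_outline;
  set my_map;
  by id segment notsorted;
  length function $8 color $8 style $8;
  xsys='2'; ysys='2';   /* coordinates are in the map's data space */
  when='a';             /* draw after (on top of) the map          */
  color='gray33';       /* dark outline color (assumed)            */
  style='empty';        /* no fill, outline only                   */
  size=1.75;            /* line thickness (assumed)                */
  if first.segment then function='poly';   /* start a new polygon  */
  else function='polycont';                /* continue the polygon */
run;

/* Then pass it to the map with:  proc gmap ... anno=anno_outline; */
```

Each polygon starts with one 'poly' row and continues with 'polycont' rows, so the by-group on id and segment is what keeps separate land masses from being joined together.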


And to finish it off, I annotate country & city names at specific latitude/longitude coordinates, and annotate a blue rectangle behind the map to represent the ocean. I annotate the 'Capitals' marker to the legend, and add the date to the bottom/right. I add html hover-text to all the countries and areas within the countries, using the Gmap html= option, so you can mouse over them and see what the names are (which is another enhancement over the original map). Scroll back to the top of the blog to see the finished map.
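For the hover-text, here's a minimal sketch of the html= technique. It assumes my_map is the clipped map from earlier; hover_data and my_html are names invented for this example, and showing the id value is a stand-in for the real country and area names.

```sas
/* Hypothetical response data: one row per map area, plus a variable */
/* holding the HTML attribute text for that area.                    */
proc sql;
  create table hover_data as
  select unique id, 1 as affected
  from my_map;
quit;

data hover_data;
  set hover_data;
  length my_html $300;
  /* title= is the attribute browsers show as a mouse-over tooltip */
  my_html = 'title=' || quote(trim(id));
run;

/* html= names the variable holding the hover-text; the tooltips only */
/* appear when the map is written through ODS HTML.                   */
proc gmap data=hover_data map=my_map all;
  id id;
  choro affected / discrete nolegend html=my_html cdefault=grayE1;
run;
quit;
```

The same variable can also carry an href= attribute, which is how a map area becomes a drilldown link.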

As you can see, you can do a lot of customizations in SAS maps, and you can layer maps (and also annotations) to create some really detailed maps to help visualize and analyze your data. Now that you know what's possible, what data do you think it would be interesting to analyze on a map?



A tale of two administrators

You are the new SAS Administrator. After the initial shock or excitement, you sit back and wonder, “What does that MEAN???” In an enterprise environment there are often divisions of duties. The SAS Intelligence Platform is no exception. Just take a look at the architecture.

SAS Intelligence Platform

Just looking at this picture, you can probably tell there are potential turf wars here. As the SAS Administrator you may have to interact with database administrators, server or system administrators, middle tier administrators and even desktop support. In other words, the SAS Administrator will have to know enough to be dangerous about various aspects of the environment. Our SAS Platform Administration: Fast Track course is designed to cover the entire platform in a short amount of time.

But what if you are not the administrator of the entire domain? How do you understand the platform for SAS Business Analytics without drinking from the fire hose that is the SAS Platform Administration: Fast Track? The answer is: It depends. It depends on what type of administrator you are. The good thing about SAS is that it is flexible and configurable, but in the wrong hands, what is good can be bad. So let’s get you started in becoming the best SAS Platform Administrator you can be.

I have to begin by asking you some questions.

  • Do you administer the SAS Platform for Business Analytics, but you do not have administrator rights to the server?
  • Do you use SAS Management Console and support SAS applications, but you rely on someone else for your operating system administration?

If your answer to these questions is, ‘Yes, that sounds like me!’, then the new SAS Platform Administration: Metadata Administration course is for you! You would be what we call a SAS Metadata Administrator. Your focus is on metadata security, adding users, and business intelligence content, such as SAS Stored Processes or SAS Reports. This course teaches you the terminology and skills to manage the platform for SAS Business Analytics through use of SAS Management Console.

If the last paragraph bored you to tears and you have no desire to manage metadata, BUT you do manage server machines and the processes that run on them, you are not left in the dark. You are the type of administrator we call a SAS System Administrator. Your concern for SAS is from the operating system perspective. The course you should take is SAS Platform Administration: System Administration 9.4. With the knowledge gained in this course you will be able to:

  • administer and back up the SAS configuration and metadata
  • administer, monitor, log and troubleshoot the SAS Metadata Server and other SAS processing servers
  • utilize the SAS Environment Manager.

The courses are independent of each other, so you take the course relevant to the type of SAS Administrator you are. If you need both, the fact that they are independent from each other helps because you can take them in any order. Now that you know the type of SAS Administrator you are, we look forward to seeing you in class soon!

If you are completely new to the SAS platform, you may want to take the Getting Started with the platform for SAS Business Analytics course which provides an overview of the platform and the various client applications that it supports. The course is valid regardless of which platform administration course(s) you take.

You can view all of the platform administration courses on the administration curriculum path.


Oh buoy! It's time for some Shark Week graphs!

With Discovery Channel's Shark Week starting on August 10, I decided to sink my teeth into some shark-attack data - I even found there were some shark attacks in the Midwestern US! Read on to learn the details...

To get you into the shark mindset, here is a photo of an almost 7-inch fossilized tooth from a prehistoric megalodon shark. My friend Rochelle found it while diving off the North Carolina coast. This is about as big as they get, and would have belonged to a shark that was over 50 feet long!


Most people have a morbid curiosity when it comes to sharks attacking humans (especially after the movie Jaws). So I did a bit of searching, and found the website that maintains an impressive list of shark attacks ... but I noticed they didn't have a very good interface to help explore and analyze the data. Therefore I downloaded their data, imported it into SAS, and set up a little proof-of-concept showing how SAS can provide a visual interface to help you quickly 'see' more about the data.

Click the map snapshot below to see the full-size interactive version, where the states have hover-text, and drill down to a table listing all the shark attacks in each state. The table then has links to the detailed pages for each individual shark attack.


Looking at the data plotted on a map, the first thing that jumps out at me is that several of the inland states in the middle of the US have had shark attacks! How in the world does that happen?!? I clicked those states to see the individual incidents in the table, and then clicked the link in the table to see the details. Sure enough, humans had been 'attacked' by sharks in those states! I'll let you investigate (as I did above) to find out what those details are :)

The map is color-coded by gradient shades of (blood) red. You can tell that Florida has the most shark attacks (probably due to having more shoreline and more days warm enough to go to the beach, etc), but it's difficult to tell exactly how the values vary from state to state. Therefore I also created a bar chart of the same data. Looks like North Carolina is in the "top 5" - yikes!


So tell us your "shark story"! Did you see the Jaws movie when it first came out? Have you ever had a close encounter with a shark? Feel free to leave a comment and tell us your shark tale! (... or is that 'shark tail'?)  ;-)




Thanks a Million

In July, we trained our 1 millionth user. It’s a significant milestone for SAS.

In celebration, we want to recognize our customers and say “thank you” for making a commitment to us for your learning.

Now through Aug. 31, we are offering a special Buy One, Get One 50-percent off promotion for public classroom or Live Web courses.

To receive the discount, you must register for both courses by Aug. 31. You can choose from any of the hundreds of courses we offer between now and the end of the year.

When registering online, include both courses in the same shopping cart and type MILLION into the promo box for both courses. The 50-percent discount will be applied to the lower-priced course.

You may also register and receive the discount by phone at 1-800-333-7660. Just mention the promotion when registering.

Visit our special “Thanks a million!” web page for all of the details and instructions on the promotion.

It’s a rewarding time for us here at SAS Training, and we take great pride in the fact that our customers consistently rank our training as excellent. But we’re not resting on our laurels. We remain committed to helping you learn SAS by offering a number of programs and services designed to get you the training you need, when you need it.

Thank you again for continuing your SAS education with us. We hope to train a million more.


There’s no ‘I’ in analytics

A few years ago I discussed the idea of analytic resources as ‘all-stars’ rather than ‘rock stars.’

While this previous blog certainly touched on the team aspect of analytic work, recently I’ve been thinking about just how much teamwork is required to make an analytics project successful.

From extracting data, transforming data, loading data (or loading then transforming for the ELT crowd), analyzing data, examining the results, sharing results, taking action upon those results and feeding those results back into data - a large network of individuals working as a team is required for a project to be a success.

Similar to a baseball team, and the motto "there is no ‘I’ in team", if the teammates on an analytics project don’t work together, the results will most likely be disappointing. Wanting to work as a team doesn’t always guarantee success. I’m sure you’ve seen your favorite athlete drop a ball or make a bad pass. But the willingness to be open and accept the idea that you are a member of a team on an analytic project will go a long way toward success.

The data experts need the input of the analytic experts, just as the analytic experts need the input and feedback from business. Let’s follow the chain – if the data person doesn’t know or understand the data requirements, the analytic resources might be left with good old ‘garbage in, garbage out.’ If the analytic resources don’t understand the business needs, they may get data in great shape and come up with the most excellent of models that don’t tell the business *anything*, and the results are never acted upon. Dollars are wasted by the business. It is like a baseball team whose roster includes the highest paid, most fit and strongest athletes: if they don’t work well together, they will not achieve successful results.

If you would like to learn more about building your analytics team and analytic teamwork, you can attend the Analytics 2014 conference in Las Vegas Oct. 20-21. Many speakers, including myself, will be presenting on ways to maximize your analytics talent. Also, pre-conference training is offered on October 19, and post conference training October 22-24, that will help your analytic teams excel!


SQL Joins in SAS University Edition

Probably the most important thing you can learn in the free SAS University Edition is how to work with data. And one of the most powerful tools for working with data is Proc SQL ...

I've used Proc SQL in some of my previous blog posts for simple tasks (such as subsetting data), but this time we'll go a bit deeper and use it for something a bit more powerful - joining tables.

It is often the case that we maintain a data table with all the information about people (students, employees, customers, etc), and then in our daily transaction data we simply refer to them by some id number. That way we only have to maintain one copy of the people-data (name, address, age, gender, etc), we don't have to enter the same data multiple times (just the id number), and we don't have to store all that information for each transaction (only the id number).

In this example, I'm keeping the data very short and simple. We'll have a school class with 5 students, and the only data for each student is their name. Copy-n-paste the code into SAS University Edition and run it:

/* Student lookup table: one row per student */
data students;
input idnum $ 1-5 name $ 7-50;
datalines;
id001 John Doe
id002 Jane Doe
id003 Raj Patel
id004 Tran Park
id005 Jet Lee
;
run;

Do you remember taking tests on Opscan sheets, with #2 pencils? For no particular reason, here's a visualization of one I created with SAS/Graph. This has nothing to do with the example, and is just here to jazz-things-up with a bit of color :)


Now, let's assume we have a table of grades. For this very simple example, we'll say the students have only had one test so far. Notice in this table we only store the student id number (not the full name).

/* Grades table: refers to each student only by id number */
data grades;
input idnum $ 1-5 test1;
datalines;
id001 88
id002 95
id003 93
id004 99
id005 95
;
run;

If we want to see a bar chart of the grades, we can use the following simple code ... but it is difficult to tell which student is which, with only the student id numbers labeling each bar:

proc sgplot data=grades;
hbar idnum / response=test1;
run;


And this is where the SQL join comes into play... You can use the following code to add the student name to the grades table. And while we're at it, let's order the data by the test1 score, so we can have the bars in ascending order:

proc sql;
create table plotdata as
/* keep every grades column, and pull in the matching student name */
select unique grades.*, students.name
from grades left join students
on grades.idnum=students.idnum
order by test1;
quit;

Now when we plot the data, we can label each bar with the student name, and order the bars by the data-order:

proc sgplot data=plotdata;
hbar name / response=test1;
/* keep the bars in the order the rows appear in plotdata */
yaxis discreteorder=data;
run;


Remember - this is a simplified example, just to demonstrate the technique of SQL joins. Now, use your imagination and come up with ways to apply this technique to other data you might have, and you will soon become a highly paid SQL expert! :)
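One variation worth trying is to see what the choice of join type does when an id has no match. The sketch below assumes the students and grades tables from above already exist; the extra 'id006' row and the table names grades_extra, left_result and inner_result are made up purely for illustration.

```sas
/* Add one hypothetical grade whose id is not in the students table */
data grades_extra;
  set grades end=last;
  output;
  if last then do;
    idnum='id006'; test1=70;
    output;
  end;
run;

proc sql;
  /* LEFT JOIN keeps every grades row; id006 gets a missing name */
  create table left_result as
  select g.idnum, s.name, g.test1
  from grades_extra as g left join students as s
  on g.idnum = s.idnum;

  /* INNER JOIN keeps only rows with a match, so id006 is dropped */
  create table inner_result as
  select g.idnum, s.name, g.test1
  from grades_extra as g inner join students as s
  on g.idnum = s.idnum;
quit;
```

A left join is a reasonable default for a plot like this one, since a grade with no matching student still shows up (with a blank label) instead of silently disappearing.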


When did 'your music' become 'classic rock'

In this blog post, I put some classic rock song data under the SAS Analytics microscope, to see if I could get a better picture of exactly what is considered 'classic rock' these days...

Michael Raithel recently pointed me to an interesting article/study about 'classic rock' music, and invited (or is that challenged?) me to see what I could do with this data using SAS graphics. Being a graph guy *and* a DJ, how could I turn down such an opportunity!?!

Here's a picture of my DJ setup. I've played quite a bit of classic rock, so hopefully I qualify as a subject matter expert (SME) in this area, LOL!


The first question that popped into my head was "where are these 25 radio stations located?" I used Proc Geocode to determine the latitude/longitude centroid of each city, and plotted them on a map. It looks like the stations are pretty well spread out across the US, but not too many from what I consider "the deep south" - therefore the results might not have as much 'southern rock' as I would have liked. While I was creating this map, I decided to add html hover-text, so you can see the "top 10" most frequently played songs for each station (click the snapshot image below to see the interactive map with the hover-text):
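Here's a rough sketch of city-level geocoding with Proc Geocode. The station call signs and cities below are placeholders rather than the actual 25 stations, and the sketch relies on the default US city lookup data that ships with SAS 9.4.

```sas
/* Hypothetical station list (placeholder values, not the real data) */
data stations;
  length call_sign $8 city $30 state $2;
  infile datalines dlm=',';
  input call_sign $ city $ state $;
  datalines;
STN1,Raleigh,NC
STN2,Denver,CO
;
run;

/* method=city matches on city + state and returns a city centroid; */
/* the output gets longitude in X and latitude in Y.                */
proc geocode method=city data=stations out=stations_geo
             addresscityvar=city addressstatevar=state;
run;
```

The resulting X/Y values are unprojected, so they need to be projected together with the map (for example with Proc Gproject) before being drawn as markers on it.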

Classic Rock Stations map

Since the data had a timestamp of when the songs were played, I thought it might be interesting to see if certain songs were played at certain times, etc. But after plotting the data on a timeline, I found that the timestamps were not consistent enough for such a study. Some stations had the song timestamps down to the hour or minute, while others appeared to just have a daily summary (one timestamp per day).


The original article had a nice histogram, showing the distribution of the songs by their release year. I decided to create a similar histogram, but in mine the height of the bars represents how many times the songs were played, I show visible dividers between each song (so you can 'see' which songs were played more than others), and I add html hover-text so you can see the names of the songs (click the snapshot image below to see the interactive graph with the hover-text).


And for my final visualization, I decided to come up with a totally different chart (not in the original article). I calculated what were the 20 most-played artists overall, and then created a bar chart showing how often (% of time) each of those artists was played at each of the 25 stations. I wanted to see if a small number of artists was played a 'majority' of the time (which is what it seems like, when I listen to classic rock stations). And, sure enough, one of the stations actually did play the top 20 artists over 50% of the time! Click the snapshot image below to see the interactive graph with the hover-text and drilldown links (be sure to try the drilldowns - on the bar segments, and the bar labels!)
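The calculation behind that chart could be sketched with Proc SQL along these lines. The tiny play log is made-up placeholder data (not the real stations or artists), and it keeps the top 2 artists instead of the top 20 so the example stays small.

```sas
/* Hypothetical play log: one row per time a station played a song */
data plays;
  length station $8 artist $12;
  input station $ artist $;
  datalines;
KAAA ArtistA
KAAA ArtistB
KAAA ArtistA
KBBB ArtistC
KBBB ArtistA
;
run;

/* Rank artists by total plays and keep the top N (20 in the post, 2 here) */
proc sql outobs=2;
  create table top_artists as
  select artist, count(*) as plays
  from plays
  group by artist
  order by plays desc;
quit;

/* For each station, the share of its plays that went to each top artist */
proc sql;
  create table station_pct as
  select p.station, p.artist,
         count(*) / t.total_plays as pct_of_plays format=percent8.1
  from plays as p
       inner join
       (select station, count(*) as total_plays
        from plays
        group by station) as t
       on p.station = t.station
  where p.artist in (select artist from top_artists)
  group by p.station, p.artist, t.total_plays;
quit;
```

Summing pct_of_plays within a station then tells you whether the top artists account for a 'majority' of that station's plays.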


Did you 'discover' anything interesting in these graphs? What's your favorite "classic rock" song?


I see spots ... sunspots!

The sun has gone eerily quiet, in the middle of what should be the height of the 11-year sunspot cycle...

Here's a superb photo of some sunspots that Stephen A. Carr posted to the Telescope Addicts Facebook group - a group which I follow with great interest. (Thanks for allowing me to use the photo Stephen!)

Stephen A Carr's sunspot photo

But Stephen would not have been able to take such a picture yesterday ... An article pointed out that NASA's Solar Dynamics Observatory had just recorded an All Quiet Event (no sunspot activity). And being in the middle of what should be the height of the 11-year sunspot cycle (ie, the solar maximum), that does seem a bit odd. It also seems like something I could see better with some SAS graphs!

When it comes to visualizing the 11-year sunspot cycle, the graph shown on Wikipedia is probably the most famous. But the data in that graph only goes a few years past year 2000 (up through the previous 11-year cycle's minimum). I wanted to see the current maximum, in context of all the other data on the graph.

I located a source for the monthly mean sunspot numbers, and set up some SAS code to import them directly from the Web. I then transposed them so that the monthly columns became one long series that I could plot across a time axis (red and blue portions of graph). I used Proc Expand to calculate the moving average and overlaid it on the same plot (black line). And with the addition of a few annotated labels, I now have a graph almost exactly like the original, but also including the latest/greatest monthly sunspot numbers :)
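The transpose-then-smooth steps might look roughly like this. The two rows of numbers are made-up placeholders (not real sunspot counts), the m1-m12 column layout is an assumption about how the imported data was shaped, and the 13-month centered window is my own choice for the sketch. Proc Expand requires SAS/ETS.

```sas
/* Placeholder wide data: one row per year, one column per month. */
/* These values are invented for illustration only.               */
data wide;
  input year m1-m12;
  datalines;
2001 10 12 14 16 18 20 22 24 26 28 30 32
2002 34 36 38 40 42 44 46 48 50 52 54 56
;
run;

/* Turn the 12 monthly columns into one long series */
proc transpose data=wide out=long(rename=(col1=sunspots)) name=month_var;
  by year;
  var m1-m12;
run;

/* Build a real SAS date from the year and the month column name */
data long;
  set long;
  month = input(substr(month_var, 2), 2.);
  date = mdy(month, 1, year);
  format date yymmn6.;
run;

/* Centered moving average over a 13-month window (assumed window size) */
proc expand data=long out=smoothed method=none;
  id date;
  convert sunspots=moving_avg / transformout=(cmovave 13);
run;
```

The smoothed dataset keeps the original sunspots column next to moving_avg, so both can be overlaid on the same time axis.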

Graph of 400 Years of Sunspot Observations

Plotting this much data on 1 screen makes it a little difficult to see exactly what the most recent cycle is doing. Therefore I created an additional plot, just showing the most recent ~11-year cycle (plotted against the same y-axis scale). It does indeed show a very wimpy number of sunspots, and also seems to indicate that we have perhaps crossed the peak of the cycle, and are on our way back down.

Most recent sunspot cycle

So, are any of my blog readers astronomers? What are your theories or observations on sunspots and solar cycles?


Hot hot heat map

Although I’m not particularly excited about football (I admit, I don’t completely understand what offside means), I did follow the last World Cup with more than average attention. Not only for the handsome players, but especially for all the fascinating statistics that appeared. It struck me that heat maps popped up everywhere: on Twitter, in newspaper articles, in talk shows, … so I tried to find out why heat maps have become so popular.

What are the origins of a heat map?

A heat map is any data visualization which uses color to represent data values in a two-dimensional image.

The term "heat map" was originally coined and trademarked by software designer Cormac Kinney in 1991, to describe a 2D display depicting real-time financial market information. Heat maps themselves date back to the seventies, as 2D displays of the values in a data matrix. Larger values were represented by small dark gray or black squares (pixels) and smaller values by lighter squares. Nowadays we’re no longer stuck with those fifty shades of grey, as there are many different color schemes that can be used to illustrate the heat map.

Heat maps have gained importance in the new era of big data. While in the past scatter plots were used on smaller datasets to discover trends and outliers that remain hidden on traditional charts and spreadsheets, heat maps have taken over. By applying color, numerous observations can be visualized together as the color is indicating the frequency of the pattern.
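To make the idea concrete, here is a minimal sketch of a frequency heat map in SAS code. It uses the sashelp.heart sample dataset and the HEATMAP statement of Proc Sgplot, which is only available in more recent SAS 9.4 maintenance releases; the choice of variables and colors is arbitrary and only for illustration.

```sas
/* Bin two numeric variables into a grid and color each cell by    */
/* how many observations fall into it (the default color response) */
proc sgplot data=sashelp.heart;
  heatmap x=weight y=cholesterol / colormodel=(blue yellow red);
run;
```

With thousands of observations, a scatter plot of the same two variables would be one dense blob, while the colored cells show where the observations actually concentrate.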

Different heat maps for different purposes

There are many different types of heat maps used in different disciplines, each referred to by the term “heat map”, even though they use different visualization techniques.

In PC gaming, heat maps can be used for testing game maps: to measure the feasibility of a map, to see which areas players use most, and to find areas that are overused or underused. This helps the developers create a spatially optimized map.

In Web analytics, heat maps are used to see where the users of a website are actually pointing their cursors and which parts of the site they spend most of their time on.

In Biology, heat maps are typically used in molecular biology to represent the level of expression of many genes across a number of comparable samples (e.g. cells in different states, samples from different patients) as they are obtained from DNA microarrays.

Heat maps in football or any other sport are used to identify the frequency of events spread in a given particular area. Specifically for football, heat maps are an indicator of effectiveness of a player in different parts of the pitch.

Heat maps of the World Cup

A very appealing heat map from the last World Cup was the one of the match between Germany and Brazil. The visualization can help in explaining the 7-1 result. This view visualizes the on-ball actions of the teams. The color indicates the volume of events, with few events being highlighted by blue, and lots of events by red.

The map heats up in areas where the player has had more control of the ball and does most of his work; it turns redder as the player's presence in a particular area increases.


What can business users learn from this?

Heat maps can be produced very easily with SAS Visual Analytics. SAS customer Orlando Magic, an NBA team, is using heat maps for price and packaging planning of their ticket sales. Heat maps are also very popular in the banking sector. According to Michael Bryan, CIO of Bank of North Carolina, senior executives and board members particularly like heat maps when analyzing loan data. "You can draw 300 fields, and the heat map can tell you in different colors what components of that data are strong and which are weak," he says. "If you wanted to look at a pool of delinquent loans, you could see what they have in common."

If you want to learn more about heat maps, have a look at these posts by SAS colleague Rick Wicklin, statistical graph expert.

Visualize a matrix in SAS by using a discrete heat map

A Christmas tree matrix


Use SAS to help plan your next vacation!

SAS is great at helping make important business decisions - how about helping decide where to take your next vacation?...

Here's a picture from one of my favorite vacations with my buddy Joe. As you can see, I like "nature vacations." Can you guess where this one was? (leave a comment with your best guess!)


I was looking around for potential vacation spots to visit this summer, and came across a cool article on the CNN website that listed one great natural wonder in each of the 50 US states. The article was laid out on a single page, and it took a lot of scrolling to find the states I was interested in. So I thought to myself, "Self - why don't you plot this data on a SAS map, and set up drilldowns in the map that jump to the desired state in the article?"

And that's exactly what I did! Click the thumbnail below to see the full-size interactive SAS map. Hover over the markers to see summary info, and click the markers (or the states) to jump to that state's section of the CNN article. It's a nice example of visual (and geographical) analytics, to show the power of SAS!


