If US state borders were redrawn -- which new state would you live in?

Find out which state you'll live in, if the US state borders are redrawn so we have 50 states with equal population! (Don't worry! - This is just a fun/hypothetical "what if" blog!)

To get you in the mood for this topic, here's a picture of one of the many vintage globes my good friend (and antique dealer) Reggie has in his personal collection. I always like looking at old globes to see if I can find borders & country names that have changed, and then try to guess the age of the globe based on that information. ... Which brings us to the topic at hand - changing the borders and names of the 50 US states!


For those of you not familiar with the US, it is divided into 50 states (represented by the 50 stars on our flag), and the population of some states is huge (such as California, with 38 million people), while the population of other states is small (such as Wyoming, with 600,000 people). This population disparity makes it a bit awkward when trying to determine how much influence each state has in the US government, etc.

I recently saw a map that Neil Freeman created, which proposed dividing the US into 50 new states with an equal population in each area. Being a "map guy" myself, this caught my attention. I read the details in his article, and I liked the factors that he had used in coming up with the new groupings - "The map began with an algorithm that grouped counties based on proximity, urban area, and commuting patterns. The algorithm was seeded with the fifty largest cities. After that, manual changes took into account compact shapes, equal populations, metro areas divided by state lines, and drainage basins."


But, in scrutinizing his map, I found that I couldn't easily determine which of the current states & counties were included in the new proposed states. Therefore I created my own version of the map with SAS, which could easily answer those questions!

In my SAS version, I use the counties as my basic building blocks, and I add html hover-text so you can hover over any county and see the current state & county, and the proposed new state name. I also annotate the state borders (in white) so you can easily see how the current state borders compare to the new state borders. Therefore, while Neil's map makes a better static poster (which was his goal, by the way), the SAS map provides more analytic capabilities and insight. ... And having both maps gives you the best of both worlds!
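
If you'd like to experiment with the hover-text idea, here is a minimal sketch (not the actual code behind my map). It assumes the traditional maps.uscounty map is available, it only draws North Carolina to keep things small, and the proposed state names are made-up placeholders rather than names from Neil's map. The output file name and path are also just assumptions, so adjust them for your system.

```sas
/* Hypothetical lookup table: FIPS state/county -> proposed new state. */
/* The new_state values are placeholders, not names from the real map. */
data county_data;
 infile datalines dlm=',';
 length new_state $30 my_html $200;
 input state county new_state $;
 /* The title= text becomes the hover-text in the html output */
 my_html='title='||quote(catx(' ',
   'Current state:', fipnamel(state),
   '/ County FIPS:', put(county, z3.),
   '/ Proposed state:', new_state));
 datalines;
37,183,Placeholder State A
37,63,Placeholder State A
37,119,Placeholder State B
;
run;

goptions device=png;
ods html path='.' (url=none) body='new_states.htm';

/* 'all' also draws the counties that have no row in county_data */
proc gmap data=county_data map=maps.uscounty (where=(state=37)) all;
 id state county;
 choro new_state / discrete coutline=gray html=my_html;
run;
quit;

ods html close;
```

The key piece is the html= option, which points Gmap at a variable holding the title= text for each county.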

Here is my SAS map. Click the static thumbnail below, to see the full-size interactive map with html hover-text.


If this change were to happen, what new state would your current residence be in, and what new state would you prefer to live in (and why)?


Your SAS Visual Analytics expert is here

Wouldn’t it be nice to have an expert to answer your SAS Visual Analytics questions?

Now you do, and the best part is that it’s free!

Beginning September 8, SAS will be hosting a one-hour Ask the Expert session each Monday through November. Each interactive session will focus on a specific Visual Analytics topic area as indicated below:

SAS VA schedule

After each session, a recap will be posted to the SAS Visual Analytics Community. The recap will include a link to the session recording as well as a list of questions and answers. For each question we will include links to additional details and more information. Make sure to check out the session recap whether you can attend the live session or not.

Please start gathering your questions and plan to attend one or all twelve sessions. Use this link to register for each session that you would like to attend.

We look forward to answering your questions and helping you to get the most out of SAS Visual Analytics!


Not the same ol’ middle tier

The SAS Middle Tier is all new for SAS 9.4, and you might not recognize it. Gone are the third-party web application servers. Gone is the third-party Java development kit. Missing major components like those may cause you to ask yourself, “Does SAS still have a middle tier?”  With the introduction of the all new SAS Environment Manager the answer is an emphatic, “YES!” The third-party products are not needed because every component is 100 percent SAS homegrown.

The new SAS Middle Tier Architecture looks like this:


No need to worry. We understand that even though you see some familiar applications, much of this diagram is likely new to you. To help you navigate the new SAS 9.4 Middle Tier we introduce to you the new SAS Platform Administration: Middle Tier Administration course.

A highlight of the course is the SAS Environment Manager, a web-based administration tool for the SAS environment. The following screen shot shows how the SAS Environment Manager displays the status of your entire environment in one dashboard:



The availability column is refreshed every minute indicating the health of the resource in question.

While more functionality will be added to the SAS Environment Manager over the SAS 9.4 lifecycle, it is not yet a replacement for its older brother, SAS Management Console. SAS Management Console still rules when it comes to managing metadata access.

You can learn more by checking out the SAS 9.4 Intelligence Platform: Middle-Tier Administration Guide. Or view the administration curriculum path to see what else is offered.


Tracking Ebola: Layering customized SAS maps

There are many ways to use SAS in Health & Life Sciences, and one of my favorites is using it to track the spread of diseases. This post demonstrates how to layer several customized maps to track the recent Ebola outbreak in Africa.

For those of you who are impatient and want to "cut to the chase" here's my final map. It is a SAS version of a map that appeared in a recent article on the bigmedicine.ca website. Click the thumbnail below to see the full size interactive version of my map with html hover-text.



And here are the technical details of how I created this map...

I used the new mapsgfk maps that we started shipping in SAS 9.4 for this example. I started with the mapsgfk.Africa continent map, and used Proc Gproject to chop out just the rectangular area, based on latitude/longitude values, that was in the original article. (This is a technique I learned from Mike Zdeb's wonderful book Maps Made Easy using SAS.)

/* Clip the Africa map to the lat/long window from the article, and project it */
proc gproject data=mapsgfk.africa out=my_map latlong eastlong degrees
 latmax=14.0 latmin=0
 longmin=-20 longmax=-3.2;
id id;
run;

I created some 'map data' so that the affected countries mapped to the light yellow color, and left the other countries gray (using 'cdefault=grayE1').
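
Here is a rough sketch of that step, not the exact code I used. It assumes the mapsgfk.africa_attr attribute dataset has id and idname columns (check your copy), and it only flags the two countries discussed later in this post. The light yellow color value is also just my guess at a reasonable shade.

```sas
/* Clip and project the continent map (same step as shown earlier) */
proc gproject data=mapsgfk.africa out=my_map latlong eastlong degrees
 latmax=14.0 latmin=0 longmin=-20 longmax=-3.2;
 id id;
run;

/* Response data: one row per affected country.                    */
/* Assumption: mapsgfk.africa_attr contains id and idname columns. */
proc sql;
 create table my_data as
 select unique id, idname, 1 as affected
 from mapsgfk.africa_attr
 where idname in ('Guinea', 'Liberia');
quit;

/* Countries in my_data get the pattern color; the rest get cdefault */
pattern1 value=solid color=cxffffb3;

proc gmap data=my_data map=my_map all;
 id id;
 choro affected / discrete cdefault=grayE1 coutline=gray77 nolegend;
run;
quit;
```

The 'all' option matters here: without it, Gmap only draws the map areas that have a matching row in the response data.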


That was the easy/straightforward part :) Now I need to add the individual country maps for Guinea and Liberia. I was able to use mapsgfk.Guinea as-is, but mapsgfk.Liberia's areas were one level more granular than I needed (like having US counties when you want states), so I had to use Proc Gremove to group those lower-level areas into the desired areas. Once I had created these two maps, I combined them with the previous map, layering them such that these two new maps come last in the dataset, so they will show up 'on top'.
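
To illustrate both techniques without guessing at the Liberia variable names, here is a minimal sketch that uses the counties-to-states analogy instead. It assumes the traditional maps.uscounty map, and my_id is a hypothetical id variable I made up so the two layers can share one map dataset.

```sas
/* Make sure each state's counties are contiguous for the BY statement */
proc sort data=maps.uscounty out=counties;
 by state;
run;

/* Remove the internal county borders, leaving state outlines */
proc gremove data=counties out=states;
 by state;    /* the new, larger areas */
 id county;   /* the old, smaller areas being merged */
run;

/* Layer the maps: states first, then one state's counties last, */
/* so the counties are drawn 'on top' of that state.             */
data layered;
 length my_id $12;
 set states (in=base) counties (where=(state=37));
 if base then my_id=cats('S', state);
 else my_id=cats('S', state, 'C', county);
run;

/* Response data: flag which areas belong to the detail layer */
proc sql;
 create table my_data as
 select unique my_id, (index(my_id, 'C')>0) as is_detail
 from layered;
quit;

proc gmap data=my_data map=layered;
 id my_id;
 choro is_detail / discrete nolegend coutline=gray77;
run;
quit;
```

The same pattern should apply to the Ebola map, with the continent map as the base layer and the country-level maps stacked after it.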


We've now got a map with the countries and areas together, but it's difficult to determine exactly which country the smaller areas are a part of. Therefore I take the country borders, and create an annotate dataset to draw a dark outline around each country using the poly/polycont annotate commands. This is an improvement over the original map, by the way!
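
The outline technique can be sketched as follows. This is not my Ebola code. It is a small stand-in that outlines the US states using the traditional maps.us map, which I'm assuming is available, and the color and line thickness are arbitrary choices.

```sas
/* Turn each map polygon into annotate commands:              */
/* 'poly' starts a polygon, 'polycont' continues its outline. */
data outline;
 length function color style $8;
 retain xsys ysys '2' when 'a' color 'gray33' style 'empty' size 2;
 set maps.us;
 by state segment notsorted;
 if first.segment then function='poly';
 else function='polycont';
run;

/* Simple response data: one row per state */
proc sql;
 create table my_data as
 select unique state
 from maps.us;
quit;

/* anno= draws the dark outlines after (on top of) the map */
proc gmap data=my_data map=maps.us anno=outline;
 id state;
 choro state / levels=1 nolegend coutline=grayE1;
run;
quit;
```

Because xsys and ysys are '2', the annotate coordinates are in the same coordinate system as the map, so the outline data must come from the same projected map that Gmap draws.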


And to finish it off, I annotate country & city names at specific latitude/longitude coordinates, and annotate a blue rectangle behind the map to represent the ocean. I annotate the 'Capitals' marker to the legend, and add the date to the bottom/right. I add html hover-text to all the countries and areas within the countries, using the Gmap html= option, so you can mouse over them and see what the names are (which is another enhancement over the original map). Scroll back to the top of the blog to see the finished map.
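
Annotating text at latitude/longitude positions takes one extra trick: the label points have to be projected together with the map. Here is a hedged sketch of that pattern, not my exact code. The city coordinates are approximate values I typed in for illustration, and anno_flag, cities, and labels are names I made up.

```sas
/* Hypothetical label data: approximate capital-city coordinates */
data cities;
 length city $20;
 input lat long city $;
 anno_flag=1;
 datalines;
9.51 -13.71 Conakry
6.30 -10.80 Monrovia
;
run;

/* Combine the map and the label points so they are projected together */
data combined;
 set mapsgfk.africa cities;
run;

proc gproject data=combined out=projected latlong eastlong degrees
 latmax=14.0 latmin=0 longmin=-20 longmax=-3.2;
 id id;
run;

/* Split them apart again */
data my_map labels;
 set projected;
 if anno_flag=1 then output labels;
 else output my_map;
run;

/* Turn the projected points into annotate 'label' commands */
data labels;
 set labels;
 length function $8 text $20;
 xsys='2'; ysys='2'; hsys='3'; when='a';
 function='label'; text=city; position='2'; size=2.5;
run;

/* Dummy response data, just so there is something to plot */
proc sql;
 create table my_data as
 select unique id, 1 as dummy
 from my_map;
quit;

proc gmap data=my_data map=my_map anno=labels;
 id id;
 choro dummy / discrete nolegend coutline=gray77;
run;
quit;
```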

As you can see, you can do a lot of customizations in SAS maps, and you can layer maps (and also annotations) to create some really detailed maps to help visualize and analyze your data. Now that you know what's possible, what data do you think it would be interesting to analyze on a map?



A tale of two administrators

You are the new SAS Administrator. After the initial shock or excitement, you sit back and wonder, “What does that MEAN???” In an enterprise environment there are often divisions of duties. The SAS Intelligence Platform is no exception. Just take a look at the architecture.

SAS Intelligence Platform

Just looking at this picture, you can probably tell there are potential turf wars here. As the SAS Administrator you may have to interact with database administrators, server or system administrators, middle tier administrators and even desktop support. In other words, the SAS Administrator will have to know enough to be dangerous about various aspects of the environment. Our SAS Platform Administration: Fast Track course is designed to cover the entire platform in a short amount of time.

But what if you are not the administrator of the entire domain? How do you understand the platform for SAS Business Analytics without drinking from the fire hose that is the SAS Platform Administration: Fast Track? The answer is: It depends. It depends on what type of administrator you are. The good thing about SAS is that it is flexible and configurable, but in the wrong hands, what is good can be bad. So let’s get you started in becoming the best SAS Platform Administrator you can be.

I have to begin by asking you some questions.

  • Do you administer the SAS Platform for Business Analytics, but you do not have administrator rights to the server?
  • Do you use SAS Management Console and support SAS applications, but you rely on someone else for your operating system administration?

If your answer to these questions is, ‘Yes, that sounds like me!’, then the new SAS Platform Administration: Metadata Administration course is for you! You would be what we call a SAS Metadata Administrator. Your focus is on metadata security, adding users, and business intelligence content, such as SAS Stored Processes or SAS Reports. This course teaches you the terminology and skills to manage the platform for SAS Business Analytics through use of SAS Management Console.

If the last paragraph bored you to tears and you have no desire to manage metadata, BUT you do manage server machines and the processes that run on them, you are not left in the dark. You are the type of administrator we call a SAS System Administrator, and your concern for SAS is from the operating system perspective. The course you should take is SAS Platform Administration: System Administration 9.4. With the knowledge gained in this course you will be able to:

  • administer and back up the SAS configuration and metadata
  • administer, monitor, log and troubleshoot the SAS Metadata Server and other SAS processing servers
  • utilize the SAS Environment Manager.

The courses are independent of each other, so you can take just the one relevant to the type of SAS Administrator you are. If you need both, you can take them in any order. Now that you know the type of SAS Administrator you are, we look forward to seeing you in class soon!

If you are completely new to the SAS platform, you may want to take the Getting Started with the platform for SAS Business Analytics course which provides an overview of the platform and the various client applications that it supports. The course is valid regardless of which platform administration course(s) you take.

You can view all of the platform administration courses on the administration curriculum path.


Oh buoy! It's time for some Shark Week graphs!

With Discovery Channel's Shark Week starting on August 10, I decided to sink my teeth into some shark-attack data - I even found there were some shark attacks in the Midwestern US! Read on to learn the details...

To get you into the shark mindset, here is a photo of an almost 7-inch fossilized tooth from a prehistoric megalodon shark. My friend Rochelle found it while diving off the North Carolina coast. This is about as big as they get, and would have belonged to a shark that was over 50 feet long!


Most people have a morbid curiosity when it comes to sharks attacking humans (especially after the movie Jaws). So I did a bit of searching, and found the sharkattackfile.net website that maintains an impressive list of shark attacks ... but I noticed they didn't have a very good interface to help explore and analyze the data. Therefore I downloaded their data, imported it into SAS, and set up a little proof-of-concept showing how SAS can provide a visual interface to help you quickly 'see' more about the data.

Click the map snapshot below to see the full-size interactive version, where the states have hover-text, and drill down to a table listing all the shark attacks in each state. The table then has links to the detailed pages for each individual shark attack.


Looking at the data plotted on a map, the first thing that jumps out at me is that several of the inland states in the middle of the US have had shark attacks! How in the world does that happen?!? I clicked those states to see the individual incidents in the table, and then clicked the link in the table to see the details. Sure enough, humans had been 'attacked' by sharks in those states! I'll let you investigate (as I did above) to find out what those details are :)

The map is color-coded by gradient shades of (blood) red. You can tell that Florida has the most shark attacks (probably due to having more shoreline and more days warm enough to go to the beach, etc), but it's difficult to tell exactly how the values vary from state to state. Therefore I also created a bar chart of the same data. Looks like North Carolina is in the "top 5" - yikes!


So tell us your "shark story"! Did you see the Jaws movie when it first came out? Have you ever had a close encounter with a shark? Feel free to leave a comment and tell us your shark tale! (... or is that 'shark tail'?)  ;-)




Thanks a Million

In July, we trained our 1 millionth user. It’s a significant milestone for SAS.

In celebration, we want to recognize our customers and say “thank you” for making a commitment to us for your learning.

Now through Aug. 31, we are offering a special Buy One, Get One 50-percent off promotion for public classroom or Live Web courses.

To receive the discount, you must register for both courses by Aug. 31. You can choose from any of the hundreds of courses we offer between now and the end of the year.

When registering online, include both courses in the same shopping cart and type MILLION into the promo box for both courses. The 50-percent discount will be applied to the lower-priced course.

You may also register and receive the discount by phone at 1-800-333-7660. Just mention the promotion when registering.

Visit our special “Thanks a million!” web page for all of the details and instructions on the promotion.

It’s a rewarding time for us here at SAS Training, and we take great pride in the fact that our customers consistently rank our training as excellent. But we’re not resting on our laurels. We remain committed to helping you learn SAS by offering a number of programs and services designed to get you the training you need, when you need it.

Thank you again for continuing your SAS education with us. We hope to train a million more.


There’s no ‘I’ in analytics

A few years ago I discussed the idea of analytic resources as ‘all-stars’ rather than ‘rock stars.’

While this previous blog certainly touched on the team aspect of analytic work, recently I’ve been thinking about just how much teamwork is required to make an analytics project successful.

From extracting data, transforming data, loading data (or loading then transforming for the ELT crowd), analyzing data, examining the results, sharing results, taking action upon those results and feeding those results back into data - a large network of individuals working as a team is required for a project to be a success.

Similar to a baseball team, and the motto "there is no ‘I’ in team", if the teammates on an analytics project don’t work together the results will most likely be disappointing. Wanting to work as a team doesn’t always guarantee success. I’m sure you’ve seen your favorite athlete drop a ball or make a bad pass. But the willingness to be open and accept the idea that you are a member of a team on an analytic project will go a long way toward success.

The data experts need the input of the analytic experts, just as the analytic experts need the input and feedback from business. Let’s follow the chain - if the data person doesn’t know or understand the data requirements, the analytic resources might be left with good old ‘garbage in, garbage out.’ If the analytic resources don’t understand the business needs, they may get data in great shape and come up with the most excellent of models that don’t tell the business *anything*, and the results are never acted upon. Dollars are wasted by the business. It’s similar to a baseball team whose roster may include the highest paid, most fit and strongest athletes: if they don’t work well together, they will not achieve successful results.

If you would like to learn more about building your analytics team and analytic teamwork, you can attend the Analytics 2014 conference in Las Vegas Oct. 20-21. Many speakers, including myself, will be presenting on ways to maximize your analytics talent. Also, pre-conference training is offered on October 19, and post-conference training October 22-24, that will help your analytic teams excel!


SQL Joins in SAS University Edition

Probably the most important thing you can learn in the free SAS University Edition is how to work with data. And one of the most powerful tools for working with data is Proc SQL ...

I've used Proc SQL in some of my previous blog posts for simple tasks (such as subsetting data), but this time we'll go a bit deeper and use it for something a bit more powerful - joining tables.

It is often the case that we maintain a data table with all the information about people (students, employees, customers, etc), and then in our daily transaction data we simply refer to them by some id number. That way we only have to maintain one copy of the people-data (name, address, age, gender, etc), we don't have to enter the same data multiple times (just the id number), and we don't have to store all that information for each transaction (only the id number).

In this example, I'm keeping the data very short and simple. We'll have a school class with 5 students, and the only data for each student is their name. Copy-n-paste the code into SAS University Edition and run it:

/* Lookup table: one row per student */
data students;
input idnum $ 1-5 name $ 7-50;
datalines;
id001 John Doe
id002 Jane Doe
id003 Raj Patel
id004 Tran Park
id005 Jet Lee
;
run;

Do you remember taking tests on Opscan sheets, with #2 pencils? For no particular reason, here's a visualization of one I created with SAS/Graph. This has nothing to do with the example, and is just here to jazz-things-up with a bit of color :)


Now, let's assume we have a table of grades. For this very simple example, we'll say the students have only had one test so far. Notice in this table we only store the student id number (not the full name).

/* Grades table: only the student id is stored with each score */
data grades;
input idnum $ 1-5 test1;
datalines;
id001 88
id002 95
id003 93
id004 99
id005 95
;
run;

If we want to see a bar chart of the grades, we can use the following simple code ... but it is difficult to tell which student is which, with only the student id numbers labeling each bar:

proc sgplot data=grades;
hbar idnum / response=test1;
run;


And this is where the SQL join comes into play... You can use the following code to add the student name to the grades table. And while we're at it, let's order the data by the test1 score, so we can have the bars in ascending order:

/* The left join keeps every row of grades, and adds the matching name */
proc sql;
create table plotdata as
select unique grades.*, students.name
from grades left join students
on grades.idnum=students.idnum
order by test1;
quit;

Now when we plot the data, we can label each bar with the student name, and order the bars by the data-order:

proc sgplot data=plotdata;
hbar name / response=test1;
/* keep the bars in the order the rows appear in plotdata */
yaxis discreteorder=data;
run;


Remember - this is a simplified example, just to demonstrate the technique of SQL joins. Now, use your imagination and come up with ways to apply this technique to other data you might have, and you will soon become a highly paid SQL expert! :)
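
As one example of applying the technique elsewhere, here is a small sketch using customers and sales instead of students and grades. Everything in it is hypothetical: the table names, the variable names, and the made-up values are only there to show the join.

```sas
/* Hypothetical lookup table: one row per customer */
data customers;
 input custid $ 1-4 name $ 6-30;
 datalines;
c001 Customer One
c002 Customer Two
c003 Customer Three
;
run;

/* Hypothetical transactions: only the id is stored with each sale */
data sales;
 input custid $ 1-4 amount;
 datalines;
c001 25.00
c002 10.50
c001 14.50
c002 4.50
;
run;

/* Join on the id, then total the sales for each customer. */
/* The left join keeps customers that have no sales at all */
/* (their total_sales is missing).                         */
proc sql;
 create table sales_by_customer as
 select customers.custid, customers.name,
        sum(sales.amount) as total_sales
 from customers left join sales
 on customers.custid=sales.custid
 group by customers.custid, customers.name
 order by total_sales;
quit;

proc print data=sales_by_customer;
run;
```

Putting customers on the left side of the join is a deliberate choice here: it lets you spot the customers who haven't bought anything yet.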


When did 'your music' become 'classic rock'

In this blog post, I put some classic rock song data under the SAS Analytics microscope, to see if I could get a better picture of exactly what is considered 'classic rock' these days...

Michael Raithel recently pointed me to an interesting article/study about 'classic rock' music, and invited (or is that challenged?) me to see what I could do with this data using SAS graphics. Being a graph guy *and* a DJ, how could I turn down such an opportunity!?!

Here's a picture of my DJ setup. I've played quite a bit of classic rock, so hopefully I qualify as a subject matter expert (SME) in this area, LOL!


The first question that popped into my head was "where are these 25 radio stations located?" I used Proc Geocode to determine the latitude/longitude centroid of each city, and plotted them on a map. It looks like the stations are pretty well spread out across the US, but not too many from what I consider "the deep south" - therefore the results might not have as much 'southern rock' as I would have liked. While I was creating this map, I decided to add html hover-text, so you can see the "top 10" most frequently played songs for each station (click the snapshot image below to see the interactive map with the hover-text):
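
For anyone who hasn't used Proc Geocode, a city-level lookup can be sketched roughly like this. The station names and cities below are placeholders (not the 25 stations from the study), and I'm assuming the default city lookup data that ships with SAS is installed and that the input variables are named city and state, which is what the city method looks for by default.

```sas
/* Hypothetical input: one row per station, with city and state */
data stations;
 infile datalines dlm=',';
 length station $10 city $30 state $2;
 input station $ city $ state $;
 datalines;
Station A,Raleigh,NC
Station B,Denver,CO
;
run;

/* Look up a centroid for each city; the output gets */
/* x (longitude) and y (latitude) variables added.   */
proc geocode method=city data=stations out=stations_geo;
run;

proc print data=stations_geo;
 var station city state y x _matched_;
run;
```

The _matched_ column is worth checking before plotting, since it shows which rows actually found a city match.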

Classic Rock Stations map

Since the data had a timestamp of when the songs were played, I thought it might be interesting to see if certain songs were played at certain times, etc. But after plotting the data on a timeline, I found that the timestamps were not consistent enough for such a study. Some stations had the song timestamps down to the hour or minute, while others appeared to just have a daily summary (one timestamp per day).


The original article had a nice histogram, showing the distribution of the songs by their release year. I decided to create a similar histogram, but in mine the height of the bars represents how many times the songs were played, and I show visible dividers between each song (so you can 'see' which songs were played more than others), and I add html hover-text so you can see the names of the songs (click the snapshot image below to see the interactive graph with the hover-text).


And for my final visualization, I decided to come up with a totally different chart (not in the original article). I calculated what were the 20 most-played artists overall, and then created a bar chart showing how often (% of time) each of those artists was played at each of the 25 stations. I wanted to see if a small number of artists was played a 'majority' of the time (which is what it seems like, when I listen to classic rock stations). And, sure enough, one of the stations actually did play the top 20 artists over 50% of the time! Click the snapshot image below to see the interactive graph with the hover-text and drilldown links (be sure to try the drilldowns - on the bar segments, and the bar labels!)
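
The calculation behind that chart can be sketched with Proc SQL. This is a simplified stand-in rather than my actual code: the play log is made up, the station and artist names are placeholders, and it keeps the top 2 artists instead of the top 20 so the example stays tiny.

```sas
/* Hypothetical play log: one row per song played (made-up values) */
data plays;
 infile datalines dlm=',';
 length station $10 artist $20;
 input station $ artist $;
 datalines;
Station A,Artist 1
Station A,Artist 1
Station A,Artist 1
Station A,Artist 2
Station B,Artist 1
Station B,Artist 3
Station B,Artist 3
Station B,Artist 4
Station B,Artist 5
;
run;

/* Most-played artists overall (outobs= keeps only the first rows) */
proc sql outobs=2;
 create table top_artists as
 select artist, count(*) as n_plays
 from plays
 group by artist
 order by n_plays desc;
quit;

/* For each station: what share of its plays were top artists? */
proc sql;
 create table station_pct as
 select p.station,
        mean(case when t.artist is null then 0 else 1 end)
          as pct_top format=percent8.1
 from plays as p left join top_artists as t
 on p.artist=t.artist
 group by p.station;
quit;

proc sgplot data=station_pct;
 hbar station / response=pct_top;
run;
```

Expect a warning in the log from the outobs= step saying the statement ended early. That is just SAS telling you it stopped after the requested number of rows.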


Did you 'discover' anything interesting in these graphs? What's your favorite "classic rock" song?
