Thinking about retiring in another country?

Have you ever thought about retiring in another country, where your money might go further? Well, here's some quantitative data to help you make an informed decision!

First, to get you in the mood, here's a picture of my friend Erik checking out the prices at a pedal-powered food cart in Thailand. Erik and his wife Joy have done more world traveling than any of my other friends, so they probably have good insight into what it might be like to retire in another country.


I recently ran across some interesting information on a cost-of-living comparison website. They had combined data from several different sources to come up with several indices that can be used to compare prices in different countries: Consumer Price Index, Rent Index, Groceries Index, Restaurant Index, and Local Purchasing Power Index. They let you select the data and plot a map such as the following:


I'm glad they mapped the data (it's much easier to analyze than just a table), but I guess you could say I'm a little picky about my maps. I'm not a big fan of continuous color gradients (it's just too difficult to look at a continuous shade and determine what value it represents, or compare it to other countries). I'm also not a fan of the projection they used (much of the available space is consumed by Greenland and northern Canada, which aren't really important in this analysis). So, of course, I decided to try creating my own maps using SAS.

I decided to go with 5 colors, and assign an equal number of countries to each - this way each color represents 1/5 of the countries (quantile binning). I also used a projection that de-emphasizes the extreme northern areas, and allows the other (more populous) areas to make use of more space. Here's the Rent Price Index map, for example (click the image to see all 5 maps):


Technical Details:

I copied the data from the page and pasted it into an Excel spreadsheet, then used Proc Import to read the data into a SAS data set. I used Proc Gmap to draw the map, with the levels=5 option to perform the quantile binning. You can see the complete SAS code here.
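The quantile binning that the levels=5 option performs can be sketched outside of SAS as well. Here's a minimal Python sketch of the idea (not the author's SAS code; the index values below are made up), assigning each country to one of five equal-count bins:

```python
def quantile_bins(values, n_bins=5):
    """Assign each value to one of n_bins equal-count bins (quantile binning),
    mirroring what PROC GMAP's levels= option does for choropleth colors."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)  # bin numbers 0 .. n_bins-1
    return bins

# Made-up Rent Index values for ten hypothetical countries
rent_index = [12.1, 45.3, 7.8, 30.0, 22.5, 60.2, 15.4, 9.9, 33.3, 41.0]
print(quantile_bins(rent_index))  # [1, 4, 0, 2, 2, 4, 1, 0, 3, 3]
```

With ten values, each bin gets exactly two members, so each of the five map colors represents 1/5 of the countries.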

So, after reviewing the data, what country would you like to retire to? What are some other factors to consider, in addition to these indices?


What's the solar power potential in your area?

Have you ever wondered whether the area where you live is a good location for producing solar power?  Let's create a SAS map to help find out!

To get you in the right frame of mind, here is an awesome picture of some Arizona sunshine that my good friend Eva took on one of her recent trips:


There's been a lot of buzz lately about solar power - especially now that the price of solar panels has come down. SAS has a solar farm here at the Cary headquarters, and I've seen several other solar farms popping up around our state lately.

But I got to wondering - are certain parts of the country better than others for producing solar power? Intuitively, it seems like areas that don't get much cloud cover and rain would be better than areas that are generally cloudy and overcast. But how can I quantify that?

After a few Web searches, I found some data at NASA's Atmospheric Science Data Center. They let you enter a latitude/longitude pair and return an HTML table containing the "Daily solar radiation - horizontal, kWh/m2/d". So I wrote some SAS code that looped through a grid of all the latitudes/longitudes I wanted to plot on a map, parsed the desired value out of each of those HTML pages, and appended the results to a SAS data set (the code is pretty neat, if you want to have a look at it!)
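The screen-scraping step can be illustrated with a short Python sketch. This is illustrative only - the real work was done in SAS, and the HTML snippet below is a hypothetical stand-in for the NASA page's actual markup:

```python
import re

def parse_radiation(html):
    """Pull the daily solar radiation value out of an HTML table row.
    The row layout assumed here is a hypothetical stand-in for the real page."""
    m = re.search(r"Daily solar radiation - horizontal.*?<td>\s*([\d.]+)\s*</td>",
                  html, re.DOTALL)
    return float(m.group(1)) if m else None

sample = """<tr><td>Daily solar radiation - horizontal, kWh/m2/d</td>
<td> 5.42 </td></tr>"""
print(parse_radiation(sample))  # 5.42

# Looping over a lat/long grid would then look something like:
# for lat in range(25, 50):
#     for lon in range(-125, -66):
#         value = parse_radiation(fetch_page(lat, lon))  # fetch_page is hypothetical
```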

I then used Proc Ginside to determine which points in my lat/long grid were 'inside' the US, and then used annotate to plot color-coded dots on the map to represent the solar data. I think the map came out pretty cool:
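Under the hood, Proc Ginside is performing a point-in-polygon test for each grid point. A minimal Python sketch of the classic ray-casting version of that test (an illustration of the idea, not the SAS implementation):

```python
def point_in_polygon(x, y, poly):
    """Ray-casting point-in-polygon test: cast a ray to the right and count
    how many polygon edges it crosses; an odd count means the point is inside."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y-level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(point_in_polygon(5, 5, square))   # True
print(point_in_polygon(15, 5, square))  # False
```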



While I was grabbing the solar data, it was also easy to grab the wind data - so I went ahead and created a wind map also. This map might indicate which areas of the country have more wind, and might be better for windmills and wind turbines:



And now for something fun - here's a video clip of me on one of my adventures, in a very windy location (hopefully you can view a .wmv file). Can you guess this windy location?!?



Panning for corporate gold requires gold standard skills

We now live in the era of ‘big data’, where data and its analysis have become crucial to the modern economy.   In fact, "big data is the new 'corporate gold'," according to Mark Wilkinson, managing director of SAS UK & Ireland.

A recent study by Cebr found that companies in the UK are increasingly assigning a financial value to their data; with nearly three-quarters of business leaders seeing real benefits from using analytics to increase revenue, reduce costs and make decisions faster.

But ‘panning’ for that corporate gold requires gold standard skills. Research published by SAS UK & Ireland and the Tech Partnership revealed that by 2020, there will be 56,000 job opportunities a year for big data analysts. However, serious skills shortages are emerging, with recruitment companies reporting that 77 percent of positions were either “very” or “fairly” difficult to fill. Tech Partnership Director Karen Price said, "Investment in education and training opportunities is vital to securing a strong talent pipeline for the digital economy."

To address this, we’ve launched a new “SAS Data Scientist Curriculum” to give both students and experienced data scientists the knowledge and skills to better prepare, analyse and extract value from big data.

What’s more, we’ve also been awarded a Gold Accreditation from the Learning and Performance Institute (LPI) for our education programme, with more than 95 percent of customers rating our instructors, courses and course administration as “excellent” or “good”, and saying they would recommend these courses to others.

Solid gold proof that SAS Education can help you to take full advantage of the opportunities offered in the age of big data.




Millennials will outnumber Baby Boomers in 2015

To get into the mood for this blog post, you should first watch the music video of The Who singing My Generation...

I guess everybody has 'their generation' and here in the U.S. the most famous generation has been the Baby Boomers. Many companies have tried to design products they think the Baby Boomers would like (such as the 1964 Ford Mustang), to capitalize on the similar interests and buying power of the boomers. But this year, for the first time (according to this article from Pew Research) another generation will become the most populous in the U.S. - the Millennials!

The Pew article had a nice graph that shows which years people from each generation were born in, and how many people were born each year (note that there's not 100% agreement on when each generation starts & stops, but we'll go with these numbers for now).  Here is the graph from their article:



Jedi SAS Tricks: DS2 & APIs - GET the data you are looking for

While perusing the SAS 9.4 DS2 documentation, I ran across the section on the HTTP package. This intrigued me because, since DS2 has no text-file-handling statements, I had assumed all hope of leveraging Internet-based APIs was lost. But even a Jedi is wrong now and then! And what better API to test my API-wielding skills on than the Star Wars API (SWAPI)?



California or bust


Outside, the Cary, NC sky is gray and winds are blowing freezing rain, but a group of statisticians at SAS are channeling warm green hills and the soft, gold light of a California evening. Team conversations alternate between distributed processing, PROC IMSTAT and how many pairs of shorts to pack.

For the past several months, the Advanced Analytics training team here in Cary has been hard at work developing a course especially for the Strata + Hadoop World conference, titled Machine Learning and Exploratory Modeling with SAS® and Hadoop. I’m very excited about this unique course. It blends many topics and focuses exclusively on enhancing and refining students’ analytic skills in Hadoop.

The course will be held in San Jose, CA Feb 17-18 and was created for analytic professionals who want to make the most of their big data with Hadoop and SAS by incorporating high-performance, machine learning algorithms with predictive modeling best practices.

On the first day, we’ll primarily spend time using SAS Visual Analytics and Visual Statistics to perform analyses using the point-and-click interface. Because there will always be a need to do more than you see in the GUI, the second day is devoted to using PROC IMSTAT and High-Performance procedures for predictive modeling and text analytics, and the RECOMMEND procedure to build a recommendation system.

Featuring this course at the Strata conference is a perfect fit and a great value for analytic professionals. Your course registration fee includes a two-day Expo Hall pass, giving you the opportunity to network with data science professionals from around the world who are experienced in many different technologies. Good news! We are offering a special 20 percent discount to SAS customers. To take advantage of the discount, register using the promo code SASML. I’d love to see you there.


Using SAS analytics to monitor blog posts

As a blogger, I often wonder whether my blog posts are 'successful' - and being a graph guy, I like to visually analyze the data, to try to answer that question.

The most common measure of a blog post is probably the number of times it was viewed, so I guess the simplest approach would be to rank your posts by the number of views and look at the top 'n'. Here's a list of the Top 10 most-viewed blog posts I published in 2014 (you can click the image below to see the interactive list, with drilldown links):


Such a list seems like a good metric, but it doesn't factor in time. The longer a blog post has been out on the Web, the more views it's likely to accumulate, which means it's not really fair to compare posts I made in January to posts I made in December. Therefore I prefer to graph the data and then look for the 'outliers'. Click the snapshot below to see the interactive version, with hover-text and drilldown for the plot markers (each marker represents a blog post). This plot shows that the two posts that stood out most in 2014 were the ones on Disappearing Airplanes and Free SAS Software.
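One simple way to factor in time is to normalize each post's view count by the number of days it has been online. A hypothetical Python sketch (the post names, dates, and view counts below are made up):

```python
from datetime import date

def views_per_day(posts, as_of=date(2014, 12, 31)):
    """Normalize raw view counts by days since publication, so a January
    post isn't unfairly compared with a December post."""
    return {title: views / max((as_of - published).days, 1)
            for title, (published, views) in posts.items()}

posts = {
    "Post A": (date(2014, 1, 15), 10000),  # made-up numbers
    "Post B": (date(2014, 12, 1), 3000),
}
rates = views_per_day(posts)
print(max(rates, key=rates.get))  # Post B
```

Note that Post A has more raw views, but Post B accumulated its views much faster, so it stands out once time is accounted for.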


What about other metrics? Well, there's the number of times a blog post has been tweeted about. Hopefully the people who tweet about your blog are promoting/sharing it (as opposed to making fun of it, or pointing out how bad it is, LOL), so let's assume that "more is better" when it comes to tweets. It looks like my blog posts on Shark Week and Santa's Dashboard got the most tweets.


Another metric is the number of Facebook 'Likes' a blog post receives. I pretty much take this at face value - the reader has a Facebook account, and they actually liked the blog post enough that they felt compelled to click the 'Like' button. My most-Liked posts in 2014 were Disappearing Airplanes, Mead, and Santa's Dashboard:


And one final metric - the number of comments a blog post receives. This might be considered a measure of how well the post has 'engaged' the reader (on the other hand, sometimes comments are questions about a confusing post, or corrections to an error in a post). My blog post receiving by far the most comments was the one announcing the Free SAS Software.


So, did the blog post(s) you found most memorable/useful stand out in any of these graphs?

What kind of blog posts would you like to see more of in 2015?


Don't let difficult graph legends get your goat

Is this blog post about techniques to use on difficult graph legends, or is it about goats? The answer is both!

But first, to get you into the proper mood, here is a picture my friend Mark took of some cute goats, and some links to YouTube videos of goats standing on things and balancing on a fun/wobbly roof.


OK - now for the technical part of the blog!

My friend Julianna recently forwarded me an interesting article from the Washington Post about goats in the U.S. Apparently there are over 2.5 million goats (who knew?!?), and the article had a map showing where all these goats are located.

Their map was pretty good, but I wanted to see if I could create an improved version using SAS. I tracked down the data on the USDA National Agricultural Statistics Service website, and set up a query to download the data in CSV format. I imported the data into SAS using Proc Import, and then started graphing.


Solving Sudoku with SAS/IML – Part 1

Sudoku solvers have been written in SAS using a variety of methods (e.g., the DATA step, PROC SQL, and PROC CLP). Surprisingly, SAS/IML appears to have been overlooked for this purpose. On a challenge from a coworker, I wrote this blog post to demonstrate the flexibility of SAS/IML in the context of solving Sudoku puzzles. This topic is split into two parts. Part 1 (this post) describes how Sudoku puzzles can be treated as exact cover problems and, in many cases, solved with simple logic. Part 2 describes how advanced Sudoku puzzles can be solved through a combination of the simple solver presented in part 1 and an efficient backtracking algorithm developed by Donald Knuth (which he labeled Algorithm X). Part 2 will be released in a separate post in the next two weeks.


A Sudoku puzzle consists of a partially filled-in 9x9 grid. The grid contains 9 rows, 9 columns, and nine 3x3 boxes. Each cell must contain a single integer between 1 and 9, and each row, column, and box must contain every integer between 1 and 9 exactly once.
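Those three constraints are easy to state in code. Here's a small Python sketch (illustrative only - the actual solver in this series is written in SAS/IML) that checks whether a partially filled grid violates any row, column, or box constraint:

```python
def is_valid_unit(unit):
    """A row, column, or box is valid when its filled cells have no repeats."""
    filled = [v for v in unit if v != 0]  # 0 marks an empty cell
    return len(filled) == len(set(filled))

def is_valid_grid(grid):
    """Check every row, column, and 3x3 box of a 9x9 grid (0 = empty)."""
    cols = [[grid[r][c] for r in range(9)] for c in range(9)]
    boxes = [[grid[3*br + r][3*bc + c] for r in range(3) for c in range(3)]
             for br in range(3) for bc in range(3)]
    return all(is_valid_unit(u) for u in grid + cols + boxes)

empty = [[0] * 9 for _ in range(9)]
print(is_valid_grid(empty))  # True
```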

Exact Cover

Before examining how Sudoku can be represented as an exact cover problem, let’s discuss exact cover problems in general. Consider a set containing the integers 1 through 9: X = {1,2,3,4,5,6,7,8,9}. Now consider a collection of subsets of X, labeled A through F:

A = {1,3,9}
B = {1,2,3}
C = {2,8,9}
D = {4,5,6}
E = {2,7,8}
F = {1,5,8}

Each subset contains some, but not all, of the numbers in X. A solution to an exact cover problem is a subcollection of subsets that together represent every number in X exactly once. Because subsets A, D, and E represent every value in X once and only once, the subcollection {A,D,E} is said to be an exact cover of X:

A = {1,3,9}
D = {4,5,6}
E = {2,7,8}
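For a collection this small, an exact cover can be found by brute force: try every subcollection and keep the ones whose sets are pairwise disjoint and together cover all of X. A Python sketch (for illustration only; the SAS/IML solver in this series takes a more structured approach):

```python
from itertools import combinations

X = set(range(1, 10))
subsets = {
    "A": {1, 3, 9}, "B": {1, 2, 3}, "C": {2, 8, 9},
    "D": {4, 5, 6}, "E": {2, 7, 8}, "F": {1, 5, 8},
}

def exact_covers(universe, subsets):
    """Brute-force search: a subcollection is an exact cover when its sets
    are pairwise disjoint (sizes sum to |universe|) and their union is the
    whole universe."""
    covers = []
    names = sorted(subsets)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            sets = [subsets[n] for n in combo]
            if (sum(len(s) for s in sets) == len(universe)
                    and set().union(*sets) == universe):
                covers.append(set(combo))
    return covers

print(exact_covers(X, subsets))  # the only exact cover is {A, D, E}
```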



Using SAS to help locate the 100 best restaurants in the US

I recently read an article that listed the 100 best restaurants in the US - but the article didn't have a map. I decided to use my SAS skills to change that!

When it comes to restaurants, I eat out a lot (and by 'a lot' I mean I never eat at home, lol). My personal favorites are Bojangles, Ole Time Barbecue, and D&S Cafeteria - you can probably tell I don't have the most sophisticated dining tastes, but I do like good food! Every food blog needs a photo of a great meal to get you in the mood, and here's a picture from my foodie friend Claudia - now that's some good eatin'! ...


And now, back to the Top 100 list...

When I recently saw the article, I thought it would be interesting to check the list and see how many of the restaurants were close to where I live, and whether I had perhaps eaten at any of them. When I tried to check the list (which was sorted alphabetically by restaurant name), it was a daunting task! It contained pictures of all the restaurants, along with their names & addresses - but there was no map, and no easy way to narrow it down to a certain geographic area (such as the area where I live, or the areas I've visited). Here's a screen capture of how they present the info:


What would a SAS Graph guy do in a situation like this?...

You guessed it! I saved the HTML code from the web page into a text file and wrote some SAS code to parse the restaurant names & addresses out of the HTML. I then used Proc Geocode to estimate the latitude & longitude of each restaurant, and programmatically annotated markers at those locations on a SAS/Graph gmap. You can hover over the markers to see the restaurant names, and click the markers/states to see a text list of all the winning restaurants in that state. And in the text list, you can click a restaurant name to see information about that restaurant.
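The HTML-parsing step can be sketched in a few lines of Python. This is illustrative only - the author's parsing was done in SAS, and the tag names, class names, and addresses below are hypothetical stand-ins, not the real page's markup:

```python
import re

def parse_restaurants(html):
    """Extract (name, address) pairs from listing HTML.
    The tag and class names assumed here are hypothetical."""
    pattern = re.compile(
        r'<h3 class="name">(.*?)</h3>\s*<p class="address">(.*?)</p>',
        re.DOTALL)
    return pattern.findall(html)

sample = """
<h3 class="name">Herons</h3><p class="address">123 Example St, Cary, NC</p>
<h3 class="name">Sample Bistro</h3><p class="address">1 Main St, Raleigh, NC</p>
"""
for name, address in parse_restaurants(sample):
    print(name, "-", address)
```

Each extracted address would then be fed to Proc Geocode to get a latitude/longitude for the map annotation.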

Here's a snapshot of my SAS map - click it to see the interactive version with html hover-text and drilldowns:


With my SAS map & drilldown table, I was able to quickly see that 6 North Carolina restaurants made it into the Top 100, and 5 of those 6 are very close to Cary (where I live). One is actually right beside SAS headquarters - Herons - and I have eaten there!

How many of the Top 100 have you eaten at?  If you ever come to SAS headquarters, perhaps you can eat at Herons and add it to your list :-)
