Is Google Fiber coming to your city?

Google recently announced that they will be adding Google Fiber high speed network and TV to my area. This was great news, because it will give us more choices ... and a little competition among providers tends to make them all 'try harder' to please the customer. :-)

I was curious which other areas have Google Fiber, so I did a few web searches and came up with a couple of maps. But neither of them let me quickly and intuitively 'see' what I wanted to see.

Here's the map from the Washington Post:


And here's the map from Google itself:


All I can really tell by glancing at those maps is the city locations. I have to study them intently, and read the color legend, and match up the (non-intuitive) markers to the color legend, to determine which are current/planned/potential Google Fiber cities.

So, of course, I set out to try to create a better map with SAS ... In my map, I wanted to make it easy to identify the cities that currently have Google Fiber, so I made their markers the brightest and added a check-mark to each. I made the 'planned' cities slightly lighter, with no check-mark. And the potential cities are just a light gray. Also, if you click the snapshot below and view the interactive version of my map, it has HTML hover-text, and you can click on the markers to launch a Google search for more information about Google Fiber in that city.



What do you think of my new version of the map? What other changes & enhancements would you recommend?


Eating liver and prepping data: How are they similar?

This week, I finally ate some liver, for the first time in over 20 years - and I realized it's a lot like prepping data (which I'll explain in this blog post). Here are a few of the similarities:

  • They're both good for you.
  • Thinking about them makes you go Eiwww!!!!
  • You might dread doing them, but find "it's not so bad" once you start.

And to give you a mental image, here's a picture of my friend Becky's daughter with the "Eiwww Face" most kids make when thinking about eating liver (or prepping data) ...


And now, back to this liver that I ate ... I knew it was "good for me" but being a data person I wanted to quantify "how good". So I did a few random Web searches, and found a page with this table of data comparing liver to several other foods.


Thinking about retiring in another country?

Have you ever thought about retiring in another country, where your money might go further? Well here's some quantitative data to help you make an informed decision! ...

First, to get you in the mood, here's a picture of my friend Erik checking out the prices at a pedal-powered food cart in Thailand. Erik and his wife Joy have done more world traveling than any of my other friends, so they probably have good insight into what it might be like to retire in another country.


I recently ran across some interesting information on They had combined data from several different sources to come up with several indices that can be used to compare the prices of various things in different countries: Consumer Price Index, Rent Index, Groceries Index, Restaurant Index, and Local Purchasing Power Index. They let you select the data, and plot a map such as the following:


I'm glad they mapped the data (it's much easier to analyze than just a table), but I guess you could say I'm a little picky about my maps. I'm not a big fan of continuous color gradients (it's just too difficult to look at a continuous shade, and determine what value it represents, compare it to other countries, etc). I'm also not a fan of the projection they used (much of the available space is consumed by Greenland and Northern Canada ... which aren't really important in this analysis). So, of course, I decided to try creating my own maps using SAS.

I decided to go with 5 colors, and assign an equal number of countries to each - this way each color represents 1/5 of the countries (quantile binning). I also used a projection that de-emphasizes the extreme northern areas, and allows the other (more populous) areas to make use of more space. Here's the Rent Price Index map, for example (click the image to see all 5 maps):


Technical Details:

I copied the data from the page and pasted it into an Excel spreadsheet, and then used Proc Import to get the data into a SAS dataset. I used Proc Gmap to draw the map, and the levels=5 option to perform the quantile binning. You can see the complete SAS code here.
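For readers who don't use SAS, here is a minimal sketch of the quantile binning idea in Python. It is not the SAS code linked above, and it does not claim to reproduce exactly how Proc Gmap assigns levels (ties, for example, may be handled differently). The function name `quantile_bins` and the country names and index values are made up for illustration.

```python
def quantile_bins(values, levels=5):
    """Assign each value a bin number from 1 to `levels` so that
    every bin holds (as nearly as possible) the same number of items."""
    # Indices of the values, sorted from smallest value to largest
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        # The rank within the sorted list decides the bin, not the value itself
        bins[idx] = rank * levels // len(values) + 1
    return bins


# Hypothetical rent index values (illustration only, not the real data)
rent_index = {
    "Country A": 12.0, "Country B": 55.5, "Country C": 31.2,
    "Country D": 8.4, "Country E": 77.9, "Country F": 43.0,
    "Country G": 19.6, "Country H": 64.1, "Country I": 25.3,
    "Country J": 90.2,
}

names = list(rent_index)
bin_numbers = quantile_bins([rent_index[name] for name in names], levels=5)
country_bin = dict(zip(names, bin_numbers))

for name in names:
    print(name, rent_index[name], "-> color", country_bin[name])
```

With 10 countries and 5 levels, each color ends up with exactly 2 countries, which is the "equal number of countries per color" property described above.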

So, after reviewing the data, what country would you like to retire to? What are some other factors to consider, in addition to these indices?


What's the solar power potential in your area?

Have you ever wondered whether the area where you live is a good location for producing solar power?  Let's create a SAS map to help find out!

To get you in the right frame of mind, here is an awesome picture of some Arizona sunshine, that my good friend Eva took on one of her recent trips:


There's been a lot of buzz lately about solar power - especially now that the price of solar panels has come down. SAS has a solar farm here at the Cary headquarters, and I've seen several other solar farms popping up around our state lately.

But I got to wondering - are certain parts of the country better than others for producing solar power? Intuitively, it seems like areas that don't get many clouds or much rain would be better than areas that are generally cloudy & overcast. But how can I quantify that?

After a few Web searches, I found some data at NASA's Atmospheric Science Data Center. They let you enter a latitude/longitude, and provide an HTML table which contains the "Daily solar radiation - horizontal, kWh/m2/d". So I wrote some SAS code that looped through a grid of all the latitudes/longitudes I wanted to plot on a map, parsed the desired data out of each of those HTML pages, and appended it to a SAS data set (the code is pretty neat, if you want to have a look at it!)
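To give a rough feel for the loop-and-parse approach, here is a small Python sketch (my SAS code is a separate thing, and this is not a translation of it). Everything specific here is an assumption: the `SAMPLE_HTML` fragment is a made-up page layout, `fetch_page` is a stand-in that returns that sample instead of making a real web request, and the grid bounds are arbitrary.

```python
import re


def lat_lon_grid(lat_min, lat_max, lon_min, lon_max, step):
    """Yield (lat, lon) pairs covering a box at the given spacing in degrees."""
    n_lat = int(round((lat_max - lat_min) / step)) + 1
    n_lon = int(round((lon_max - lon_min) / step)) + 1
    for i in range(n_lat):
        for j in range(n_lon):
            yield (lat_min + i * step, lon_min + j * step)


# Hypothetical page fragment; the real page layout is not shown in this post
SAMPLE_HTML = """
<table>
<tr><td>Daily solar radiation - horizontal, kWh/m2/d</td><td>4.87</td></tr>
</table>
"""


def fetch_page(lat, lon):
    """Stand-in for the web request (assumption): always returns the sample."""
    return SAMPLE_HTML


def parse_radiation(html):
    """Return the first numeric table cell after the radiation label, or None."""
    match = re.search(
        r"Daily solar radiation - horizontal.*?<td>\s*([0-9.]+)\s*</td>",
        html,
        re.S,
    )
    return float(match.group(1)) if match else None


# Loop over the grid, parse each page, and append one row per grid point
rows = []
for lat, lon in lat_lon_grid(25, 27, -100, -98, 1):
    rows.append((lat, lon, parse_radiation(fetch_page(lat, lon))))

print(len(rows), "grid points; first row:", rows[0])
```

A regular expression is fine for pulling one number out of a predictable page, but a real HTML parser would be the safer choice if the page layout were more complicated.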

I then used Proc Ginside to determine which points in my lat/long grid were 'inside' the US, and then used annotate to plot color-coded dots on the map to represent the solar data. I think the map came out pretty cool:
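If you're curious what an 'inside' test involves, one common approach is ray casting: count how many polygon edges a horizontal ray from the point crosses, and an odd count means the point is inside. The Python sketch below illustrates that general idea only. I'm not claiming it is how Proc Ginside works internally, and the rectangle standing in for a map boundary is made up.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting test: return True if (x, y) lies inside the polygon,
    given as a list of (x, y) vertices in order."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical boundary: a simple box in (longitude, latitude) coordinates
boundary = [(-105, 30), (-95, 30), (-95, 40), (-105, 40)]

# A few grid points to classify
grid_points = [(-100, 35), (-90, 35), (-100, 45)]
kept = [p for p in grid_points if point_in_polygon(p[0], p[1], boundary)]
print(kept)
```

Only the grid points that pass the test would then get a colored dot on the map.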



While I was grabbing the solar data, it was also easy to grab the wind data - so I went ahead and created a wind map also. This map might indicate which areas of the country have more wind, and might be better for windmills and wind turbines:



And now for something fun - here's a video clip of me on one of my adventures, in a very windy location (hopefully you can view a .wmv file). Can you guess this windy location?!?



Panning for corporate gold requires gold standard skills

We now live in the era of ‘big data’, where data and its analysis have become crucial to the modern economy.   In fact, "big data is the new 'corporate gold'," according to Mark Wilkinson, managing director of SAS UK & Ireland.

A recent study by Cebr found that companies in the UK are increasingly assigning a financial value to their data, with nearly three-quarters of business leaders seeing real benefits from using analytics to increase revenue, reduce costs and make decisions faster.

But ‘panning’ for that corporate gold requires gold standard skills. Research published by SAS UK & Ireland and the Tech Partnership revealed that by 2020, there will be 56,000 job opportunities a year for big data analysts. However, serious skills shortages are emerging, with recruitment companies reporting that 77 percent of positions were either “very” or “fairly” difficult to fill. Tech Partnership Director, Karen Price, said, "Investment in education and training opportunities is vital to securing a strong talent pipeline for the digital economy."

To address this, we’ve launched a new “SAS Data Scientist Curriculum” to give both students and experienced data scientists the knowledge and skills to better prepare, analyse and extract value from big data.

What’s more, we’ve also been awarded a Gold Accreditation from the Learning and Performance Institute (LPI) for our education programme, with more than 95 percent of customers rating our instructors, the courses and course administration as “excellent” or “good” and willing to recommend these courses to others.

Solid gold proof that SAS Education can help you to take full advantage of the opportunities offered in the age of big data.




Millennials will outnumber Baby Boomers in 2015

To get into the mood for this blog post, you should first listen to the music video of The Who singing My Generation...

I guess everybody has 'their generation' and here in the U.S. the most famous generation has been the Baby Boomers. Many companies have tried to design products they think the Baby Boomers would like (such as the 1964 Ford Mustang), to capitalize on the similar interests and buying power of the boomers. But this year, for the first time (according to this article from Pew Research) another generation will become the most populous in the U.S. - the Millennials!

The Pew article had a nice graph that shows which years people from each generation were born in, and how many people were born each year (note that there's not 100% agreement on when each generation starts & stops, but we'll go with these numbers for now).  Here is the graph from their article:



Jedi SAS Tricks: DS2 & APIs - GET the data you are looking for

While perusing the SAS 9.4 DS2 documentation, I ran across the section on the HTTP package. This intrigued me because, as DS2 has no text file handling statements, I assumed all hope of leveraging Internet-based APIs was lost. But even a Jedi is wrong now and then! And what better API to test my API-wielding skills than the Star Wars API (SWAPI)?


Learn how to maximize your data with SAS and Hadoop

California or bust


Outside, the Cary, NC sky is gray and winds are blowing freezing rain, but a group of statisticians at SAS are channeling warm green hills and the soft, gold light of a California evening. Team conversations alternate between distributed processing, PROC IMSTAT and how many pairs of shorts to pack.

For the past several months, the Advanced Analytics training team here in Cary has been hard at work developing a course especially for the Strata + Hadoop World conference entitled Machine Learning and Exploratory Modeling with SAS® and Hadoop. I’m very excited about this unique course. It blends many topics, and focuses exclusively on enhancing and refining students’ analytic skills in Hadoop.

The course will be held in San Jose, CA Feb 17-18 and was created for analytic professionals who want to make the most of their big data with Hadoop and SAS by incorporating high-performance, machine learning algorithms with predictive modeling best practices.

On the first day, we’ll primarily spend time using SAS Visual Analytics and Visual Statistics to perform analyses using the point-and-click interface. Because there will always be a need to do more than you see in the GUI, the second day is devoted to using PROC IMSTAT and High-Performance procedures for predictive modeling and text analytics, and the RECOMMEND procedure to build a recommendation system.

Featuring this course at the Strata conference is the perfect fit and a great value for analytic professionals. Your course registration fee includes a 2-day Expo Hall pass. This gives you the opportunity to network with data science professionals from around the world, who are experienced in many different technologies.  Good news! We are offering a special 30 percent discount to SAS customers. To take advantage of the discount, register using the promo code SASML. I’d love to see you there.


Using SAS analytics to monitor blog posts

As a blogger, I often wonder whether my blog posts are 'successful' - and being a graph guy, I like to visually analyze the data, to try to answer that question.

The most common measure of a blog post is probably the number of times it was viewed, so I guess the simplest approach would be to rank your blogs by the number of views and look at the top 'n'. Here's a list of the Top 10 most viewed blogs that I posted in 2014 (you can click the image below to see the interactive list, with drilldown links):


Such a list seems like a good metric, but it doesn't factor in time. The longer a blog post is out on the Web, the more views it's likely to accumulate ... which means it's not really fair to compare posts I made in January to posts I made in December. Therefore, I prefer to graph the data, and then look for the 'outliers'. Click the snapshot below, and you can see the interactive version, with hover-text and drilldown for the plot markers (each marker represents a blog post). This plot shows that the two posts that stood out most in 2014 were the ones on Disappearing Airplanes and Free SAS Software.
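If you'd rather have a number than a graph, one simple way to factor in time is to divide each post's views by the number of days it has been online. To be clear, that's a different technique from the plot I used, and the post titles, dates, and view counts in this Python sketch are made up for illustration.

```python
from datetime import date


def views_per_day(posts, as_of):
    """Return {title: average views per day online} for each post.

    `posts` is a list of (title, publish_date, total_views) tuples.
    """
    rates = {}
    for title, published, views in posts:
        # Guard against a zero-day span for posts published on the as-of date
        days_online = max((as_of - published).days, 1)
        rates[title] = views / days_online
    return rates


# Hypothetical posts (illustration only, not my real numbers)
posts = [
    ("January post", date(2014, 1, 15), 7000),
    ("December post", date(2014, 12, 1), 1500),
]

rates = views_per_day(posts, as_of=date(2014, 12, 31))
ranked = sorted(rates, key=rates.get, reverse=True)
print(ranked, rates)
```

In this made-up example the January post has far more total views, but the December post is actually accumulating views faster, which is the kind of thing a simple Top 10 list hides.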


What about other metrics? - Well, there is the number of times a blog post has been tweeted about. Hopefully the people who tweet about your blog are promoting/sharing it (as opposed to making fun of it, or pointing out how bad it is, LOL), so let's assume that "more is better" when it comes to tweets. Looks like my blog posts on Shark Week and Santa's Dashboard got the most tweets.


Another metric is the number of Facebook 'Likes' a blog post receives. I pretty much take this at face value - the reader has a Facebook account, and they actually liked the blog post enough that they felt compelled to click the 'Like' button. My most-Liked posts in 2014 were Disappearing Airplanes, Mead, and Santa's Dashboard:


And one final metric - the number of comments a blog post receives. This might be considered a measure of how well the post has 'engaged' the reader (on the other hand, sometimes comments are questions about a confusing post, or corrections to an error in a post). My blog post receiving by far the most comments was the one announcing the Free SAS Software.


So, did the blog post(s) you found most memorable/useful stand out in any of these graphs?

What kind of blog posts would you like to see more of in 2015?


Don't let difficult graph legends get your goat

Is this blog post about techniques to use on difficult graph legends, or is it about goats? The answer is both!

But first, to get you into the proper mood, here is a picture my friend, Mark, took of some cute goats. I've also included some links to YouTube videos about goats standing on things, and balancing on a fun/wobbly roof.


OK - now for the technical part of the blog!

My friend Julianna recently forwarded me an interesting article from the Washington Post about goats in the U.S. Apparently there are over 2.5 million goats (who knew?!?), and the article had a map showing where all these goats are located.

Their map was pretty good, but I wanted to see if I could create an improved version using SAS. I tracked down the data on the USDA National Agricultural Statistics Service website, and set up a query to download the data in CSV format. I imported the data into SAS using Proc Import, and then started graphing.
