Solar and wind power in the United States

Where is solar and wind power generated in the US? Let's visualize this data on a map...

I recently saw the following map on the website. It caught my attention because it looks like North Carolina has a lot of solar power plants, whereas our neighboring states have very few. This seemed odd at first glance, and made me wonder if their map might be incorrect ...


I had recently been mapping other data from the Energy Information Administration (EIA), and since I already had their data imported into SAS, I thought I'd try re-creating the wind/solar map.

I only made a few changes to their map. I decided to use the 2014 final data rather than the 2015 early release data. I made the title a little bigger, and I used circular markers for both solar and wind, rather than using two lines for wind (with circular markers, it is easier to see the mouse-over text when the markers are densely packed).


Is the US still a coal-powered nation?

I ran across a map recently that seemed to show a lot of US states are primarily coal-powered. The map was a little difficult to read, so I decided to give it a SAS makeover ...

Before we get started, here's a picture my friend David took of the Shearon Harris Nuclear Power Plant, here in NC. We have 3 nuclear plants in our state, and this is the one that provides power for the SAS headquarters.


I recently found a web site called that shows maps of many interesting things. While perusing the site, I found the following interesting 'energy source' map.


At first glance, I assumed this map showed the primary energy source of the power consumed in each county, but as I looked closely I noticed that the counties with nuclear power as the primary source were only the counties that actually contained the nuclear power plants. I finally 'got it' that this map showed the primary source of energy produced in each county, not the power consumed in each county.

In addition to the counties being colored by source, each state also had a label showing the primary source of energy produced in the whole state. I think the state data is actually more important than the individual counties, but their state labels were very difficult to read.

So I set about creating my own map. I decided to split it into two separate maps - one for state, and one for county. With the state map, you can easily tell with a single glance that many of the states still use coal as their primary source of power (see the coal-black states below).
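The difference between the county view and the state view described above comes down to where you take the maximum. Here is a minimal Python sketch of that idea (not the SAS code behind the maps). The records, region names, and MWh values are made up for illustration, and `primary_source` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up generation records for illustration only:
# (state, county, source, MWh produced)
records = [
    ("State A", "County 1", "nuclear", 900),
    ("State A", "County 2", "coal", 700),
    ("State A", "County 3", "coal", 400),
    ("State B", "County 4", "coal", 1200),
    ("State B", "County 5", "wind", 150),
]

def primary_source(records, key_index):
    """Sum MWh per (region, source), then keep the largest source per region."""
    totals = defaultdict(lambda: defaultdict(float))
    for rec in records:
        totals[rec[key_index]][rec[2]] += rec[3]
    return {region: max(by_src, key=by_src.get)
            for region, by_src in totals.items()}

by_state = primary_source(records, key_index=0)   # group by state
by_county = primary_source(records, key_index=1)  # group by county

print(by_state)
print(by_county)
```

In this toy data, the county containing the nuclear plant shows up as nuclear, while its state as a whole is still primarily coal, which is the same effect noticed in the original map.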


What makes a data scientist?

The term Data Scientist is in vogue right now. Let’s explore Data Science, the key skills that make a data scientist, and how SAS’ Academy of Data Science may help advance your career.

The term "data science" (originally used interchangeably with "datalogy") has existed for decades; Peter Naur used it in 1960 as a substitute for “computer science.”

Data Science as we know it is a relatively new and perhaps ever-evolving discipline. Its definitions vary, and there is no standard approach to the role, unlike traditional professions such as accountancy and sales. A commonly cited definition comes from Wikipedia: “Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics, similar to Knowledge Discovery in Databases (KDD).”

Spicier definitions can be found across the internet, including one by Malcolm Chisholm: “There’s a joke running around on Twitter that the definition of a data scientist is ‘a data analyst who lives in California.’” Websites like Big Data Made Simple have compiled more than 10 definitions of Data Science.


91% of the US didn't vote for Hillary or Trump!

How is it that 91% of the US didn't vote for either Hillary or Trump in the primary, yet they're still the final two candidates in the presidential election? Let's break it down with a simple graph!

I recently saw a really cool slideshow on the nytimes website that answered this question very nicely. They used an 18x18 grid of squares, with each square representing 1 million people, to represent the 324 million people in the US (18x18=324). Each slide labeled a subset of the grid, representing various groups of people who voted (or didn't vote). In the end, the red and blue squares showed that just 9% of the population voted for Hillary and Trump in the presidential primaries (i.e., 91% didn't vote for them ... most didn't vote for anyone). Here is one of the final slides.
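To make the grid arithmetic concrete, here is a small Python sketch (not the SAS code used for the graph) that converts a population share into a number of squares. The only inputs are the figures quoted above: an 18x18 grid, 1 million people per square, and the 9% share. The helper name `squares_for_share` is made up for illustration.

```python
# Each square in the 18x18 grid stands for 1 million people.
GRID_SIDE = 18
PEOPLE_PER_SQUARE = 1_000_000

total_squares = GRID_SIDE * GRID_SIDE            # 324 squares
population = total_squares * PEOPLE_PER_SQUARE   # 324 million people

def squares_for_share(share_percent):
    """Return how many whole squares (rounded) represent a population share."""
    return round(total_squares * share_percent / 100)

voted = squares_for_share(9)      # voted for Hillary or Trump in the primaries
did_not = total_squares - voted   # everyone else in the grid

print(total_squares, voted, did_not)
```

Because 9% of 324 is 29.16, the share rounds to 29 of the 324 squares, leaving 295 squares for everyone else.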


I really liked their slideshow, but it was missing a graph that showed all the groups labeled at the same time. So I decided to create one myself using SAS software.

This might surprise you, but I created my voter grid graph by reusing some code I had recently written to draw the following Pokémon Pikachu picture:


This Pikachu is a grid of data-driven colored squares, so it required only minimal changes to repurpose the SAS code to create the voter grid graphic. I changed the dimensions of the grid, added some code to annotate borders & labels around certain groups, and in almost no time I had my own graph. Try clicking the image of my graph below to see the interactive version, with mouse-over text that provides more details about each group:



Do you think the grid graph is appropriate for analyzing this type of data? What are the advantages and disadvantages? What other kinds of graphs might be better, for answering specific questions?



Walking the line, between politics and religion

I usually try to avoid political or religious debates ... but as an impartial data analyst, it is possible to analyze data about something, without entering into the debate. In this blog post, I try to walk that fine line, and analyze data about the political leanings of religious groups in the US.

To get you in the mood for this blog post, here's a photo showing a few items from my friend Reggie's extensive collection of religious antiques. Over the years, I'm sure he's collected enough religious items to fill a warehouse!



I was recently perusing a Pew Research study about U.S. religious groups and their political leanings, and saw the following graph. The graph caught my attention, and I thought I "got it" ... but later I realized I hadn't really gotten what they were trying to show.


At first glance, I thought the graph showed which religion leaned most Republican and most Democrat (with an emphasis on Republican, since it seemed to be sorted that way). Upon further examination, I noticed that the graph was actually sorted by the difference between the Republican & Democrat leanings (the column of numbers to the right of the bars). I suppose that's an interesting number, but it seems like that's only important in situations like when you're analyzing a whole state and trying to determine which way it will go in a presidential election.

So I started working on my own graph, hoping to show what is actually important about the data (at least in my opinion). Here are some of the changes I made:

  • I get rid of the difference column, and sort the bars by the percent political affiliation values.
  • I generate three separate graphs - one for each political affiliation. By comparison, the Pew graph seems to be biased towards showing the Republican data well, but makes it more difficult to analyze the data for the other political parties.
  • I right-justify the religion names, so it's easier to see which bars they go with.
  • I normalize the values for each bar, so the bars visually go to 100 and form a nice, straight, non-distracting edge (whereas they used integer values that sometimes don't add up to 100%, which makes the bars look a little jagged along the right).
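As a rough illustration of the normalizing, sorting, and right-justifying steps listed above, here is a minimal Python sketch (not the SAS code used for the charts). The group names and percentages are invented placeholders, not the Pew figures, and `normalize` and `chart_order` are hypothetical helper names.

```python
# Made-up survey rows for illustration only (not the Pew figures):
# (group, % Republican, % Democrat, % no lean)
rows = [
    ("Group A", 70, 19, 11),
    ("Group B", 37, 44, 18),   # integer percentages sum to 99, not 100
    ("Group C", 10, 82, 9),    # integer percentages sum to 101
]

def normalize(values):
    """Rescale values so they sum to exactly 100."""
    total = sum(values)
    return [100 * v / total for v in values]

def chart_order(rows, party_index):
    """Normalize each row, then sort descending by one party's share."""
    normalized = [(name, normalize(vals)) for name, *vals in rows]
    return sorted(normalized, key=lambda r: r[1][party_index], reverse=True)

# One chart per affiliation: 0 = Republican, 1 = Democrat, 2 = no lean
for name, shares in chart_order(rows, party_index=1):
    # Right-justify the group name so it lines up against its bar
    print(f"{name:>10}  " + "  ".join(f"{s:5.1f}" for s in shares))
```

Calling `chart_order` once per `party_index` gives the row order for each of the three separate charts.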

Here are my bar charts - if you want to see more details, click the images below to see the interactive graphs, with mouse-over text for each bar:






How do you like the new graphs? Do you think religious affiliation of voters will play a major role in the upcoming political election? What other information could perhaps be combined with the data in the graphs to make them more useful in helping predict the impact on the election? So many questions!

I'll wrap it up with a random photo of a church that my friend Joy saw on one of her trips. Can you guess the religion and probable political leanings of the people who might attend this church? (Leave your guess in a comment!)




Beyond the Credential: Testing test questions

Any subject matter expert (SME) who has written test questions will tell you it is a lot harder and much more time consuming than they initially expected. Many believe they can simply write down any topic-related question that pops in their mind. Easy, right? Not so fast. Questions should be relevant to a defined job role, correctly map to a specific test objective, be written at the appropriate level of difficulty, and ensure the right answer is always right and the wrong answers are plausible, but wrong. Why go through all this work? Because each question should do its job in helping to determine if a candidate is qualified to pass the test.

Overall passing scores do not provide insight into how each question is performing, but individual question statistics can. Below are two sets of statistics I use to see how well a specific test question is performing.


How can you get students and kids excited about learning SAS?

Our big data problems have only just begun; new studies have confirmed that SAS is the most valuable career skill in the increasingly competitive job market, and the analytics skills gap is a big issue that education at a grassroots level could help solve.

It’s more important now than ever to give kids the tools they need to succeed in our fast-paced world. Everyone wants to give their kid a leg up before (and during) college, especially if learning can be easy. And… who says learning can’t be fun?

The key is to make learning relatable. Analytics can often be seen as a complicated, abstract topic that might intimidate students, but A Recipe for Success Using SAS® University Edition: How to Plan Your First Analytics Project by Sharon Jones, Ed.D., maps SAS to a relevant metaphor (cooking) and provides fun case studies that students can re-create on their own terms and tailor to their own interests. With this book, students of all ages can get up and running in no time with a free download of SAS® University Edition. A Recipe for Success Using SAS® University Edition walks readers through the basics of SAS® Studio, then shows them how to use basic programming and data collection to re-create projects from real students who made a difference in their schools and communities by using SAS.


Analytical problem solving with Carlos


Carlos Pinheiro, PhD, Principal Analytical Training Consultant, SAS

Carlos Pinheiro is talented. One afternoon Carlos stood in front of a group of marketers, including myself, and shared social network analysis using our data. Yes, the analysis took talent, but the real brilliance was that Carlos presented the information in such a way that we all left the meeting with an understanding of his analysis and of the actions we could take to improve our business. I don’t understand a lot of things…the movie Inception, the French language or how ships are made in a bottle, but I now have a basic understanding of social network analysis thanks to Carlos.

Carlos has a new course, Analytical Approaches to Solving Problems in Communications and Media, that will be making its debut at Analytics Experience 2016 in Las Vegas. I recently caught up with Carlos to ask a few questions about his insight into trends in the industry and his new course.

Q: As a data scientist, what kinds of business problems have you seen in your work in communications and media?
A: While there are various problems we can face in the industry, there are also plenty of opportunities. Traditional business problems like preventing churn and boosting cross-sell and up-sell opportunities continue to be points of focus. Companies have ongoing processes in place for these actions, with fraud as a recurrent topic. Businesses can save tens of millions of dollars by detecting fraud early in both usage and subscription scenarios. Ultimately, it’s a highly competitive market and companies in this industry need to understand the complexity of their customers’ behavior.


How can I learn SAS?

What would come to mind if you were told that you have to attend a SAS training course? Perhaps you have a vision of an instructor sitting at the front of a classroom, endlessly reading PowerPoint slides verbatim in a monotonous voice while you sit there baffled, wondering what is going on and counting the minutes to the next break? For some, particularly SAS newbies, being told to learn code or a technical interface isn’t exciting, and it can be daunting to learn a new skill.

When I started to use SAS (prior to joining SAS), I learned, like most, “on the job.” My colleagues shared their knowledge, I studied their programs, I searched the internet and I referenced SAS books. My skills grew, but so did my adoption of their interpretation of how the syntax worked and their coding habits. (Looking back, their interpretation wasn’t always accurate and their habits not always good.) Still, back then I thought I was the next best SAS programmer.

Then I joined SAS and on my second day attended SAS Programming 1: Essentials (now a free e-Learning course). When I was told that I would be attending the course, I actually thought to myself, “Why do I need to attend? I know all this already.” True, I did recognize the majority of the syntax, but the explanations of how SAS works, the syntax rules and how code is processed gave me numerous epiphany moments: “so that’s why that would always go wrong,” “I wish I had known this before now,” and so on.


Most efficient way to find rare Pokémon

Were you the kid who sat there analyzing the amusement park map before entering the park, planning out how you could visit the most rides in the least amount of time? If so, then this blog's for you, my data analyst kindred spirit!

And to get you in the mood, here's one of those amusement park maps. This is a map of Carowinds, the park I went to often when I was a kid:



Some might call it overkill, but if you are serious about your Pokémoning then you will want to use every tool at your disposal - including the most powerful analytics software in the world! In this blog post, I show you how to use SAS software to optimize your search for rare Pokémon.

Let's say you've been playing for a while, and only have 15 of the rarer Pokémon left to catch. While you could try hatching them from eggs, you've decided to try a more analytical approach. You have perused all the online forums and compiled a list of possible sightings of these rare Pokémon, and then determined the closest location of each one. You can now plot these on a map using SAS' GMap Procedure.
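Once the sighting locations are known, one simple way to think about visiting them efficiently is a greedy "nearest neighbor" walk. The Python sketch below is only an illustration of that idea and is not the SAS approach used in the post. The coordinates and sighting labels are invented, and the greedy heuristic is a swapped-in technique that does not guarantee the shortest possible route.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_neighbor_route(start, sightings):
    """Greedy route: always travel to the closest unvisited sighting."""
    route, here = [], start
    remaining = dict(sightings)
    while remaining:
        name = min(remaining, key=lambda n: haversine_km(here, remaining[n]))
        here = remaining.pop(name)
        route.append(name)
    return route

# Hypothetical coordinates, invented for illustration only
start = (35.80, -78.80)
sightings = {
    "sighting_1": (35.90, -78.80),
    "sighting_2": (35.81, -78.80),
    "sighting_3": (35.85, -78.80),
}

route = nearest_neighbor_route(start, sightings)
print(route)
```

A greedy walk like this is quick to compute for 15 locations, but a proper optimization routine would be needed to find a truly shortest tour.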
