People have always been fascinated by sports statistics, and with the recent popularity of fantasy sports there is an increased demand for custom analyses of sports data. With those folks in mind, I have created a simple example that SAS programmers can use as a starting point for analyzing NBA data.
Before we get into the nitty-gritty details, here's a picture of my friend Simone's son playing basketball (in the bright shirt, with the ball). He's tall, in shape, and smart, so I think he'll do well in basketball. Who knows, maybe one of these days we'll all be plotting his data in our NBA graphs!
I had recently read about some examples that demonstrate how to use the Python and R programming languages to analyze NBA data, and decided to try my hand at using SAS to do something similar. With a bit of digging, I found the magic URL that can be used to download the data for a specified player and season. I then wrote some SAS code to import the data directly from the Web page into a SAS dataset.
After scrutinizing the data a bit, I determined that the (0,0) origin was in the middle of the basket, and that all of the shots were shown in relation to one end of the court. I looked up the dimensions of an NBA basketball court, determined the coordinates of the 4 corners, and created a map polygon I could use to represent the court in Proc Gmap. I then converted the shot data into an annotate dataset that would plot the missed shots as red x's and the made shots as blue o's. Here's what things looked like so far:
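If you'd like to experiment with the court geometry before writing any SAS, here is a minimal sketch in Python (it is not the SAS code behind the graphs in this post). It assumes coordinates in feet, a half court that is 50 ft wide and 47 ft long, and a basket centered 5.25 ft in from the baseline. The function names and the `(x, y, made)` tuple layout are hypothetical, and the downloaded data may well use different units or column names.

```python
# Assumed dimensions, in feet (standard NBA court: 94 ft x 50 ft).
COURT_WIDTH_FT = 50.0
HALF_COURT_LENGTH_FT = 47.0
BASKET_FROM_BASELINE_FT = 5.25  # center of the rim, measured from the baseline


def half_court_corners():
    """Return the 4 corners of one half court, with (0, 0) at the basket."""
    left = -COURT_WIDTH_FT / 2
    right = COURT_WIDTH_FT / 2
    bottom = -BASKET_FROM_BASELINE_FT                       # baseline
    top = HALF_COURT_LENGTH_FT - BASKET_FROM_BASELINE_FT    # midcourt line
    return [(left, bottom), (right, bottom), (right, top), (left, top)]


def shot_marker(made):
    """Made shots become blue o's, missed shots become red x's."""
    if made:
        return {"symbol": "o", "color": "blue"}
    return {"symbol": "x", "color": "red"}


def to_markers(shots):
    """Convert hypothetical (x, y, made) tuples into plot-ready marker records."""
    return [{"x": x, "y": y, **shot_marker(made)} for x, y, made in shots]


if __name__ == "__main__":
    print(half_court_corners())
    print(to_markers([(0.0, 1.5, True), (-22.0, 3.0, False)]))
```

In SAS terms, the corner list plays the role of the map polygon, and each marker record corresponds to one row of the annotate dataset.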
The above graph is nice, but it would be even better with some points-of-reference so we can see 'where' the player was when he made the shot. Therefore I worked out the coordinates of all the markings on the court, and created a special annotate dataset to draw them on the map (using annotate draw and polygon functions). Wow - what a difference that makes!
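To give a feel for what 'working out the coordinates' of a court marking involves, here is a rough Python sketch for just one of them, the three-point line. It is an illustration under my own assumptions rather than the annotate code used above: coordinates are in feet with (0,0) at the basket, the arc radius is 23.75 ft, the straight corner segments sit 22 ft either side of the basket, and the baseline is at y = -5.25. The function name is hypothetical.

```python
import math

# Assumed NBA three-point line dimensions, in feet.
ARC_RADIUS_FT = 23.75   # distance from the basket to the arc
CORNER_X_FT = 22.0      # straight corner segments sit 22 ft from the basket's centerline
BASELINE_Y_FT = -5.25   # baseline, with the basket at y = 0


def three_point_line(n_arc_points=50):
    """Return (x, y) points tracing the three-point line from right corner to left."""
    # Height at which the straight corner segment meets the arc.
    y_join = math.sqrt(ARC_RADIUS_FT ** 2 - CORNER_X_FT ** 2)
    start = math.atan2(y_join, CORNER_X_FT)  # angle of the right-hand join
    end = math.pi - start                    # angle of the left-hand join
    arc = []
    for i in range(n_arc_points):
        theta = start + (end - start) * i / (n_arc_points - 1)
        arc.append((ARC_RADIUS_FT * math.cos(theta),
                    ARC_RADIUS_FT * math.sin(theta)))
    # Baseline endpoints plus the arc: connecting consecutive points draws the line.
    return [(CORNER_X_FT, BASELINE_Y_FT)] + arc + [(-CORNER_X_FT, BASELINE_Y_FT)]


if __name__ == "__main__":
    points = three_point_line()
    print(len(points), points[0], points[-1])
```

Each consecutive pair of points would become one draw segment in an annotate dataset; the other markings (lane, free-throw circle, midcourt line) can be built the same way from straight segments and arcs.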
The Proc Gmap approach is a good starting place for a spatial analysis, but how about analyzing the data over time? It was a simple matter to feed the data into Proc Gplot, and generate the following. Do you notice any trends in Stephen's shot data? Can you explain the outliers?
Just for Fun:
Here's a little quiz to test your NBA knowledge, combined with your visual analytics perception skills. Below are three graphs. Can you tell which one goes with Kevin Durant, which with LeBron James, and which with Marc Gasol?
(Once you've made your guess, you can 'cheat' and look at the filenames of the images for a hint!)