What happens when your data isn't arranged in rows & columns the way you need it? ... Well if you haven't guessed by now - SAS Software is the king of restructuring data!
Sunspot and solar flare activity are cyclical, and peak about every 11 years. Currently we are near the high point of the cycle, as shown in this SAS graph I created:
Would you like to know how to download the data, and create your own plot? ... I knew you would! :)
This example is a little more complex than the previous ones, but it teaches some really useful techniques you'll use over and over in your data munging career. Just take it one step at a time, and you should be fine! :)
First you'll need to download the raw data file table_international-sunspot-numbers_monthly.txt from the noaa.gov web site, and save it into C:\\SASUniversityEdition\\myfolders\\, so that you can get to it via the path /folders/myfolders/ in your SAS program. (See my previous blog post with details about how to set up this shared folder, if you haven't already done so.) While you're saving it, I recommend shortening the name to sunspots.txt.
The text file is structured with each year on a single line, and a column for each month's sunspot frequency count:
You can import the text file into SAS by running the following code:
data sunspots_raw; infile "/folders/myfolders/sunspots.txt" dlm=' ' firstobs=5; input year jan feb mar apr may jun jul aug sep oct nov dec; if (year ge 1749) then output; run;
You've now got the data in a SAS data set, but each month is in a separate column (or variable). You could plot 12 overlapping lines against year, but we really want them all together as 1 long time-series line. So let's transpose the data, and make the months become values, rather than variables.
proc transpose data=sunspots_raw out=sunspots_tran; by year; run;
Here's what your SAS dataset looks like now - we're almost there!
One last little detail - we need to combine the year & month into a single value. I chose to use an actual date value at the 15th of each month. To do this, I put together a string value on-the-fly (such as '15jan1749') and input that string using the date9. SAS format.
data sunspots_tran; set sunspots_tran; format date date9.; date=input('15'||trim(left(_name_))||trim(left(year)),date9.); run;
Your data now has an official date column, and looks like this:
Your data is now ready, and all you have to do is run the following code to plot it:
proc sgplot data=sunspots_tran; label date='Year' col1='Monthly Sunspots'; series x=date y=col1; run;
What other data can you find that might need to be transposed in order to plot it? If you find some, try importing and transposing it, and leave a comment letting everyone know how it turns out!