Tales from SAS author, Ron Cody

It was about 30 years ago that I wrote my first book, Applied Statistics and the SAS® Programming Language.  It was written on a PC with two floppy disk drives (one for the operating system, the other for my document) using a word processing program called WordStar.  It was all written in a monospaced font and I had to supply the publisher (then Elsevier, later bought by Prentice Hall) with camera-ready copy.  It was typed at the amazing rate of 30 characters per second on an IBM electric typewriter.

Believe it or not, this book is now in its fifth edition and is still being used in a number of colleges.  Here I am, 30 years later, having just completed my tenth book, An Introduction to SAS® University Edition, which is due out September 22nd.  I wrote it on a laptop computer with a 440-gigabyte solid-state drive and a CPU that has many times the processing power of the IBM 370/168 mainframe that I used back then to run SAS.

Have you thought of writing a book about SAS of your own?  Perhaps I can tell you something about the satisfaction in writing a book and also, to be honest, about the hard parts.  I have been a teacher for most of my adult life.  I love teaching.  Writing a book is a type of teaching also.  There is tremendous satisfaction in having someone come up to me, usually at a SAS conference, and tell me that they learned how to program in SAS from one of my books. Read More »

Reading Hierarchical Data - Part 3

This post is the third and final in a series that illustrates three different solutions to "flattening" hierarchical data.  Don't forget to catch up with Part 1 and Part 2.

Solution 2, from my previous post, created one observation per header record, with detail data in a wide format, like this:

Detail Approach: One observation per header record
Obs    Family     Employee    Spouse    Child1    Child2    Child3
 1     Jones       Bob        Carol     Sally     Alice
 2     Sanchez     Mary
 3     Smith       Nancy      Harold

Today's Solution 3, unlike Solution 2, places no arbitrary limit on the number of detail items.  It stores the detail data in a tall format rather than a wide one, with one observation per detail record instead of one observation per header record.  Read More »
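
To make the tall layout concrete, here is a minimal sketch in Python (not the SAS code from the full post). It assumes a hypothetical record layout in which each line holds a one-letter code and a name, with F marking a family (header) record and E, S, and C marking employee, spouse, and child records. Those codes, the RECORDS list, and the flatten_tall function are illustrative assumptions; the actual input file and its codes are not shown in this excerpt.

```python
# Hypothetical input: "<code> <name>" per line. "F" starts a new family
# (header record); any other code is a family member (detail record).
# The layout and codes are assumptions made for illustration only.
RECORDS = [
    "F Jones",
    "E Bob",
    "S Carol",
    "C Sally",
    "C Alice",
    "F Sanchez",
    "E Mary",
    "F Smith",
    "E Nancy",
    "S Harold",
]


def flatten_tall(records):
    """Return one (family, code, member) row per detail record."""
    rows = []
    family = None  # carried forward from the most recent header record
    for line in records:
        code, name = line.split(maxsplit=1)
        if code == "F":
            family = name  # remember the header, emit nothing yet
        else:
            rows.append((family, code, name))  # one row per detail record
    return rows


tall_rows = flatten_tall(RECORDS)
for row in tall_rows:
    print(row)
```

Because each detail record becomes its own row, a family with ten children simply produces ten rows; nothing in the output structure has to be sized in advance.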

The one piece of advice everyone in analytics needs to hear

I was recently asked why I would recommend my new class, Explaining Analytics to Decision Makers:  Insights to Action.  The answer goes back to some great advice, a lunch of eggplant parmesan and, in another more twisted way, to what was ironically and affectionately known as the “bomb plant.”

Early in my career I was working for a large company on a project.  On this project was Bill, a well-respected, seasoned professional.  It was known Bill was a year or two from retirement.  Far from waiting out retirement, this gentleman was floating from project to project with little pressing responsibility and offering advice where he could.  I knew of his reputation as a respected engineer and was pleased that through a good friend I had gotten to know him.  Bill had heard that I was considering leaving my current position and starting an independent consulting practice to provide analytical support to a variety of clients.  (This was well before today’s market and was more of a stretch than such a plan would be today.)  He suggested we meet for lunch to discuss my plans.

Adapting to changes

Over lunch, as we discussed my plans, he reflected on his long career.  After growing up in east Georgia, he found himself displaced both psychologically and professionally by the development of the Savannah River Plant.  The plant, affectionately known by the locals as the “bomb plant,” changed the region when it was built.  Following a theme of all great southern writers, it was the backdrop of a changing South and the undercurrent of life captured by Pat Conroy in The Prince of Tides.

But Bill’s story was direct.  Life as he knew it, based on a tight circle of lifelong contacts, had changed.  He had found himself in the big city and had to live off his engineering skills.  He talked about how his life changed and how he had to change.  I listened intently, as he was easy to listen to.  He imparted his expansive knowledge with an easy manner.  I was not aware he was leading up to a life-changing moment for me.

Selling yourself is key

Read More »

Reading hierarchical data - Part 2

This post is the second in a series that illustrates three different solutions to "flattening" hierarchical data.

Solution 1, from my previous post, created one observation per header record, summarizing the detail data with a COUNT variable, like this:

Summary Approach: One observation per header record
Obs    Family     Count
 1     Jones        4
 2     Sanchez      1
 3     Smith        2

Solution 2, illustrated in today's blog, creates one observation per header record, like Solution 1, but replaces the COUNT variable with detail data, in a wide format, like this:

Detail Approach: One observation per header record
Obs    Family     Employee    Spouse    Child1    Child2    Child3
 1     Jones       Bob        Carol     Sally     Alice
 2     Sanchez     Mary
 3     Smith       Nancy      Harold
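
As a rough illustration of how a wide table like the one above could be built, here is a minimal Python sketch (not the SAS code from the full post). It assumes a hypothetical input in which each line holds a one-letter code and a name: F for a family (header) record, E for employee, S for spouse, and C for child. The codes, the RECORDS list, the flatten_wide function, and the max_children parameter are all assumptions made for illustration.

```python
# Hypothetical input: "<code> <name>" per line; the layout and codes
# are assumptions, since the original file format is not shown here.
RECORDS = [
    "F Jones", "E Bob", "S Carol", "C Sally", "C Alice",
    "F Sanchez", "E Mary",
    "F Smith", "E Nancy", "S Harold",
]


def flatten_wide(records, max_children=3):
    """Return one dict per family with fixed Employee/Spouse/ChildN columns."""
    rows = []
    row = None
    n_children = 0
    for line in records:
        code, name = line.split(maxsplit=1)
        if code == "F":
            # New header record: start a fresh row with empty detail columns.
            row = {"Family": name, "Employee": "", "Spouse": ""}
            for i in range(1, max_children + 1):
                row[f"Child{i}"] = ""
            n_children = 0
            rows.append(row)
        elif code == "E":
            row["Employee"] = name
        elif code == "S":
            row["Spouse"] = name
        elif code == "C":
            n_children += 1
            if n_children > max_children:
                # The wide layout has a fixed number of child columns.
                raise ValueError(f"more than {max_children} children in {row['Family']}")
            row[f"Child{n_children}"] = name
    return rows


wide_rows = flatten_wide(RECORDS)
for r in wide_rows:
    print(r)
```

The fixed set of Child columns is the trade-off of this layout: the number of columns has to be chosen up front, so a family with more children than max_children cannot be represented without widening the table.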

Read More »

Reading hierarchical data - Part 1

A family and its members represent a simple hierarchy.  For example, the Jones family has four members:


A text file might represent this hierarchy with family records followed by family members' records, like this:



The PROC FORMAT step below defines the codes in Column 1:

proc format; 
   value $type

Read More »

Flexibility of SAS Enterprise Miner

Do you use an array of tools to perform predictive analytics on your data? Is your current tool not flexible enough to accommodate some of your requirements? SAS Enterprise Miner may be your solution.

With the growing number of data mining applications, having a tool that can perform a variety of analyses is not enough. Some situations require an open, extensible design that provides flexibility and personalization so that users can tailor their experience to their needs.

The flexible architecture of SAS Enterprise Miner opens an entire world of SAS to data miners and data scientists with a variety of skill levels, ranging from business users to technical experts.

What can users achieve with SAS Enterprise Miner’s flexible architecture?

When you need to customize functionality to meet business requirements, the SAS code node and extension node come to your rescue. SAS Enterprise Miner’s SAS code node enables users to incorporate new or existing SAS code into the process flow. The SAS code node extends the functionality of SAS Enterprise Miner by making SAS procedures available in data mining analysis. You can also create custom extension nodes using SAS code and XML logic and share them with others across the enterprise. Diagrams can be shared easily with other analysts throughout the enterprise. Read More »

How is electricity generated in your state?

I recently saw an article on washingtonpost.com showing what methods are used to generate electricity in each state. The data was interesting enough that I decided to try my hand at graphing and mapping it with our SAS software. Read along to see what I kept the same, and what I changed...

But before we get started, here's a fun picture of my friend "Magic Wanda" at my Halloween party. I'm sure it was just an oversight that the article did not include witchcraft & sorcery as methods used to generate electricity! ;)



And now, on to the graphs!...

Here's a screen capture of the main graph in the washingtonpost.com (wp) article. It's a pretty cool interactive graph, and when you click on the colored bar segments or the legend, it brings the selected electricity source to the top and sorts the bars by the selected source.
Read More »

In what areas of medical research do venture capitalists invest?

The Wall Street Journal recently published a study of the top 17 medical areas (or body parts) that venture capitalist investments are likely to benefit. They used graphs to summarize the results, but "the graph guy" in me just couldn't resist trying to improve them. Did my improvements help? - You be the judge!

Before we get started on the data analysis, I want to take a minute to point out how fortunate we are to live at a time when medical technology is so advanced. For example, the first successful long-term lung transplants took place in the 1980s ... and today we have a member on our dragonboat racing team who has had both lungs replaced. Can you tell which team member it is?


And now, on with the graph makeover! ... Read More »

Hadoop releases - here's the timeline graph!

There's a lot of buzz about Hadoop these days. I started checking into it, and there seemed to be a gazillion releases. So, being The Graph Guy, I decided to create a graph to make it a little easier to digest!

During my search for Hadoop information, I found the Apache page showing all the releases. As I scrolled down through page after page of releases, I found it difficult to get a grasp on things - there seemed to be multiple versions releasing simultaneously.

I didn't want to have to work very hard to understand Hadoop releases - I just wanted an "Easy Button." And when your favorite tool is SAS, your easy button often looks a lot like a custom graph. :)

I examined the HTML code behind the Hadoop release page, and found that all the releases had a consistent 'header' line that I could search for and parse programmatically. Here's an example:


Read More »

The world's most valuable sports teams

There's big money in professional sports these days - we're talking billions of dollars! Do you know which teams are the most valuable? The graphs in this blog will show you...

I recently saw a bar chart on dadaviz.com showing the world's most valuable sports teams. It was the right kind of graph for this type of comparison, and it showed interesting data ... but their use of color really didn't work for me. Here's a screen-capture of their graph. Try to pick a color in the legend (such as Football or Formula1) and quickly identify all those colored bars in the graph - I bet you can't!



So I found the data source (forbes.com), entered the data into a SAS dataset, and created my own version of the graph. I kept the layout the same as the original ... but instead of showing all the colors together, I created a separate graph for each sport. Read More »
