Parsing useful data out of unusual formats


Andy Kuligowski, SAS Global Forum 2012 Conference Chair

If you remember Andy Kuligowski for nothing else, you probably remember that he was the SAS Global Forum conference chair for 2012. I remember that he has a wonderful sense of humor, which he used more than once during his Hands-On Workshop at the recent MidWest SAS Users Group conference. If you attended the workshop, you also know that he is a fantastic teacher and fluent in SAS.

It would be a waste of a chair for me to attend the entire workshop, but I decided to stick around to see if I could learn something.

He started by giving a little background that let attendees know why he thinks parsing is important. “When I first started at a company called Nielsen – some 25 years ago – my first project was working on a little project where we were analyzing people’s subscriptions to paid TV networks. The cable company would send us a record of the bill of a household and then they would send us the record three months later – we compared it to see what they provided,” he explained.

“That worked for the great big companies really well. For the really, really little ones we couldn’t exclude them from the sample – that would be biasing – they sent a text file of the physical bills that they mailed.”

Kuligowski and his team had to find a way to read the files to get the information that they wanted from the bills. “We came up with a little routine called a parser, and we came up with the data ... eventually.” (He probably thinks he invented the Internet, too. Hehehe!)

Parser primer

Parse – the analysis of a string of characters and subsequent breakdown into a group of components

Kuligowski showed a very simple example of parsing – one even I could grasp. Attendees were asked to open a SAMPLE file containing a SAS log. Then they were asked to open the find menu and search manually for the word Cary (always found at least once at the beginning of a SAS log). This exercise was manual ‘parsing.’
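For readers who want to see the same idea in code, here is a minimal sketch in Python (not SAS, and not taken from Kuligowski's paper). It does what the attendees did by hand: scan a log for the word Cary, then break the matching line into components. The log text, `find_word`, and `split_note` are all made up for illustration; the workshop's actual SAMPLE file is not reproduced here.

```python
# Illustrative stand-in for a SAS log; NOT the SAMPLE file used in the workshop.
SAMPLE_LOG = """\
NOTE: Copyright (c) by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS initialization used:
      real time           0.50 seconds
1    data work.demo;
2       set sashelp.class;
3    run;
NOTE: The data set WORK.DEMO has 19 observations and 5 variables.
"""


def find_word(text, word):
    """Return (line_number, line) pairs for every line that contains word."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if word in line:
            hits.append((number, line.strip()))
    return hits


def split_note(line):
    """Break a 'LABEL: message' line into its two components."""
    label, _, message = line.partition(":")
    return label.strip(), message.strip()


# Step 1: the automated version of opening the find menu and typing "Cary".
matches = find_word(SAMPLE_LOG, "Cary")

# Step 2: break each matching line into a label and a message.
for number, line in matches:
    label, message = split_note(line)
    print(f"line {number}: label={label!r}, message={message!r}")
```

The search finds the string; the split is the "breakdown into a group of components" from the definition above. Real files in unusual formats need more rules than this, which is where the techniques in the paper come in.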

Here’s the part of his session that I stuck around for – a brief explanation of what a parser is. From here, Kuligowski taught attendees various techniques (all included in his paper) for getting information from data that comes in unusual formats.  

Download his paper, Parsing Useful Data Out of Unusual Formats Using SAS, and all MWSUG 2012 proceedings, now!!


About Author

Waynette Tubbs

Editor, Marketing Editorial

Waynette Tubbs is the Editor of the Risk Management Knowledge Exchange at SAS, Managing Editor of sascom Magazine and Editor of the SAS Tech Report. Tubbs has developed a comprehensive portfolio of strategic business and marketing communications during her career spanning 15 years of magazine, marketing and agency work.
