If you remember Andy Kuligowski for nothing else, you probably remember that he was the SAS Global Forum conference chair for 2012. I remember that he has a wonderful sense of humor, which he used more than once during his recent Hands-On Workshop at the MidWest SAS Users Group conference. If you attended the workshop, you also know that he is a fantastic teacher and fluent in SAS.
It would have been a waste of a chair for me to attend the entire workshop, but I decided to stick around to see if I could learn something.
He started by giving a little background that let attendees know why he thinks parsing is important. “When I first started at a company called Nielsen – some 25 years ago – my first project was working on a little project where we were analyzing people’s subscriptions to paid TV networks. The cable company would send us a record of the bill of a household and then they would send us the record three months later – we compared it to see what they provided,” he explained.
“That worked for the great big companies really well. For the really, really little ones we couldn’t exclude them from the sample – that would be biasing – they sent a text file of the physical bills that they mailed.”
Kuligowski and his team had to find a way to read the files to get the information that they wanted from the bills. “We came up with a little routine called a parser, and we came up with the data ... eventually.” (He probably thinks he invented the Internet, too. Hehehe!)
Parse – the analysis of a string of characters and subsequent breakdown into a group of components
Kuligowski showed a very simple example of parsing – one even I could grasp. Attendees were asked to open a SAMPLE file containing a SAS log. Then they were asked to open the find menu and search manually for the word Cary (always found at least once at the beginning of a SAS log). This exercise was manual ‘parsing.’
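For readers who like to see the idea in code, here is a minimal sketch of that same search done programmatically. It is written in Python purely for illustration and is not taken from Kuligowski's paper, which uses SAS. The log text, the `find_word` helper, and the `split_note` helper are all made up for this example.

```python
# Hypothetical stand-in for the workshop's SAMPLE log; this text is invented.
SAMPLE_LOG = """\
NOTE: Copyright (c) by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS initialization used:
      real time           0.50 seconds
NOTE: The data set WORK.BILLS has 12 observations and 4 variables.
"""


def find_word(text, word):
    """Return (line_number, column) pairs, both 1-based, for each hit."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        col = line.find(word)
        while col != -1:
            hits.append((line_no, col + 1))
            # Keep scanning the same line in case the word appears again.
            col = line.find(word, col + len(word))
    return hits


def split_note(line):
    """Break a 'LABEL: message' line into its two components."""
    label, _, message = line.partition(": ")
    return label, message


hits = find_word(SAMPLE_LOG, "Cary")
first_line = SAMPLE_LOG.splitlines()[0]
label, message = split_note(first_line)
print(hits)
print(label)
```

The first helper automates the manual find-menu search; the second shows the "breakdown into components" half of the definition by separating a log line's label from its message.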
Here’s the part of his session that I stuck around for – a brief explanation of what a parser is. From here, Kuligowski taught attendees various techniques (all included in his paper) for getting information from data that comes in unusual formats.
Download his paper, Parsing Useful Data Out of Unusual Formats Using SAS, and all MWSUG 2012 proceedings, now!!