I've known Jim Cox for a long time. He's the SAS R&D manager for SAS Text Miner, and a gifted singer. We almost never talk about work stuff, because Jim is waaaaay too smart for me.
That's why I was so pleased to discover Jim's series of blog entries about how text mining software can discover topics, and organize and assign weights to them, so that the topics can be analyzed in a useful way.
If you have even a remote interest in text analytics, or you just want the historical perspective on correlation, principal component analysis, and factor analysis, you really need to read these entries. Here are links to the first three in the series. I'm looking forward to the fourth.
- The Whats, Whys, and Wherefores of Topic Discovery and Management: Part One
- WWW of Topic Management, part 2: What is a topic, and why?
- Part 3: Understanding Topic Discovery from an "historical" perspective
P.S. I note, with pride, that I also once blogged about Sir Francis Galton and his theories on measuring intelligence. Jim covers "Sir Frank" (as I call him) in part 3.