Text Mining - What You're Really Interested In


Terry Woodfield is teaching his new course, "Text Analytics with SAS Text Miner", at the upcoming M2010 Data Mining Conference. Terry has been a SAS instructor for more than 10 years and has attended several Data Mining Conferences. He took some time out of his busy schedule to answer a few questions about this course and M2010.

1. How does Text Analytics with SAS Text Miner differ from other Text Mining courses that SAS offers?

TW: Other text analytics courses for SAS Content Categorization and SAS Sentiment Analysis are under development and should be available soon. There is some overlap in text analytic areas. My course addresses general text analytic topics, but the course only addresses solutions that use SAS Text Miner.

2. Text analytics seems to be a hot topic this year. Can you tell us how this course addresses some of the latest trends in the field of text analytics?

TW: The definitions of text mining and text analytics have changed over the years because of better algorithms and faster computers. However, the details of algorithmic and technical trends in text analytics are not all that exciting to a typical user. A user is not so interested in factor rotations of concept vectors derived using Latent Semantic Analysis. The user is more interested in how the software learns concepts and topics from document collections and uses them to characterize a document, either for exploration and discovery or for predictive modeling. Forensic linguistics uses text mining to identify criminals like Ted Kaczynski, the Unabomber. Warranty analysis to satisfy the TREAD Act uses text mining to find concepts and topics that are highly correlated with automotive warranty problems that can lead to serious accidents. Technical support call centers use text mining to develop methods to automatically route problems to an appropriate expert. The latest trends encompass both the technology and the application of text analytics. More and more companies are realizing that text analytics can significantly improve business decisions. The course examines the current major application areas and provides data and example analyses to illustrate how text mining can be used to solve real problems.
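To make the call center routing example concrete, here is a minimal Python sketch of the general idea. It is not how SAS Text Miner works and is not taken from the course. It swaps in a much simpler technique, bag-of-words cosine similarity, and the queue names, keyword profiles, and the `route` function are all hypothetical, invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical expert queues, each described by a few keywords (illustrative data only).
EXPERT_PROFILES = {
    "installation": "install setup license installer download deployment",
    "performance": "slow memory timeout crash performance hang",
    "reporting": "report chart export graph output pdf",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def route(ticket, profiles=EXPERT_PROFILES):
    """Return the queue whose keyword profile best matches the ticket, or None if nothing matches."""
    ticket_vec = Counter(tokenize(ticket))
    scores = {
        name: cosine(ticket_vec, Counter(tokenize(words)))
        for name, words in profiles.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(route("The installer fails during setup and asks for a license"))
print(route("Export to PDF produces an empty chart"))
```

A real system would typically learn its topics from a labeled document collection rather than from hand-written keyword lists, and would handle synonyms, misspellings, and word forms that this sketch ignores.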

3. Who should attend this course?

TW: Anyone involved in analytics with access to textual data. Examples include: complaints or requests from call center contacts; descriptions of warranty problems; adjuster notes tied to insurance claims; physician reports tied to insurance claims or health studies; news reports collected from the Internet; customer requests posted on company customer support Web sites; adverse event reports in operations at nuclear power plants, chemical plants, or refineries; adverse event reports in the health sciences; adverse event reports in transportation; forensic evidence in the form of ransom notes, manifestos, or other voice or written communication; homogeneous document collections such as the MEDLINE medical abstracts.

4. Why should M2010 attendees consider taking this course?

TW: If you have access to textual data and you license SAS Text Miner or are considering licensing SAS Text Miner, you should take the course. Even if you just want to see what text analytics is all about, you should consider taking the course. As an added bonus, if you are going to be in Las Vegas all week, taking a two-day course will give you an extra free day to lose more money in the casinos.

5. You’ve attended several Data Mining Conferences. What have been some of the highlights over the years?

TW: A few talks stand out in my mind. I enjoyed listening to Tom Mitchell of Carnegie Mellon University talk about the use of pattern recognition methods to detect tumors at M2003. I especially enjoyed hearing Edward Wegman of George Mason University describe methods for visualizing neural networks and other complex predictive models at M2002. Herb Edelstein of Two Crows Corp. is always interesting and informative no matter what topic he is addressing. His talk at M2001 describing misconceptions and pitfalls in data mining was particularly good.

6. Which speakers are you looking forward to seeing at this year’s conference?

TW: I’ve always enjoyed listening to John Elder relate his experiences providing practical data mining solutions. Will Neafsey of Ford Motor Company gave an excellent talk a few years ago about how Ford tries to anticipate customer preferences. I am eager to hear what Mr. Neafsey has to say this year. Tim Rey of Dow Chemical brings to the conference perhaps the widest breadth of experience using analytics to solve real business problems. Cailyn Clark is an expert at applying Text Analytics to solve business problems, so it is not surprising that I don’t want to miss her talk. I always learn something from Russell Albright and Leonardo Auslender of SAS.

7. You’ve been an instructor for SAS for more than 10 years. What is your favorite part of your job?

TW: Standing in front of a trapped audience telling bad jokes and pretending to know what I am talking about.


About Author

Michele Reister

Marketing Specialist

Michele Reister has worked in the Education Division at SAS since 2004. During that time she has played many roles including marketing training courses, developing product bundles, managing conferences and overseeing the division’s discount programs. Currently, she is responsible for the division’s social media strategy. Michele holds a BS in Management and Information Technology from Daniel Webster College and an MBA from University of North Carolina at Chapel Hill. Michele is a perpetual student herself and is constantly looking for better ways to serve SAS’ user population. When she’s not expanding her knowledge of marketing, Michele enjoys group fitness classes, cooking, volunteering, reading and chasing after her two children.
