In my last post, I talked about why SAS utilizes a rotated Singular Value Decomposition (SVD) approach for topic generation, rather than using Latent Dirichlet Allocation (LDA). I noted that LDA has undergone a variety of improvements in the last seven years since SAS opted to use the SVD method. So, the
When I talk with more analytically savvy users of SAS® Text Miner or SAS® Contextual Analysis, I inevitably get asked why SAS uses a completely different approach to topic generation than anybody else, and why they should trust the approach SAS adopts. These are good questions. I first
SAS released its first text analytics product, SAS® Text Miner, in 2002 to enable SAS users to extract insights from unstructured data in addition to structured data. In 2009, in quick succession, SAS released two new products: SAS® Enterprise Content Categorization and SAS® Sentiment Analysis. These
I mentioned last time that the technique we use to determine topics is a variant of something that has been around for fifty years. In this part I will talk about the intriguing history of this technique, and in the process, I hope to illuminate what we are doing and
Topic modeling of documents is hot in the research community. Conferences are filled with different ways of determining topic models and applying them. The prestigious data mining conference KDD has in recent years had entire sections on topic modeling. The leading algorithms all have three-letter acronyms and sound
Those of you who have seen the new version of SAS Text Miner know that we are transitioning from a “document clustering” approach towards a “topics in documents” approach. I have received a lot of questions about this, so I thought I would address some of them in a multi-part