The machine learning autogenerated concept and fact rules in VTA 8.4 facilitate the process of developing LITI rules to extract and find information in text documents. Text Analytics provides valuable insights into many important problems, such as human trafficking.
Tag: SAS Text Analytics
This blog shows how the automatically generated concepts and categories in Visual Text Analytics (VTA) can be refined using LITI and Boolean rules. I will use a data set that contains information on 1527 randomly selected movies: their titles, reviews, MPAA Ratings, Main Genre classifications and Viewer Ratings.
SAS Visual Text Analytics provides dictionary-based and non-domain-specific tokenization functionality for Chinese documents; however, sometimes you still want to get N-gram tokens. This can be especially helpful when the documents are domain-specific and most of the tokens are not included in the SAS-provided Chinese dictionary. What is an N-gram? An
To demonstrate the power of text mining and the insights it can uncover, I used SAS Text Mining technologies to extract the underlying key topics of the children's classic Alice in Wonderland. I want to show you what Alice in Wonderland can tell us about both human intelligence and artificial
Community detection has been used in multiple fields, such as social networks, biological networks, telecommunication networks, and fraud/terrorist detection. Traditionally, it is performed on an entity link graph in which the vertices represent the entities and the edges indicate links between pairs of entities, and its target is to
In 2011, Loughran and McDonald applied a general sentiment word list to accounting and finance topics, and this led to a high rate of misclassification. They found that about three-fourths of the negative words in the Harvard IV TagNeg dictionary of negative words are typically not negative in a financial
Recently a colleague told me Google had published new, interesting data sets at BigQuery. I found a lot of Reddit data as well, so I quickly tried running BigQuery with these text data to see what I could produce. After getting some pretty interesting results, I wanted to see if
In my last post, I showed you how to generate a word cloud of PDF collections. Word clouds show you which terms are mentioned in your documents and the frequency with which they occur. However, word clouds cannot lay out words from a semantic or linguistic perspective.
Last week, I attended the IALP 2016 conference (20th International Conference on Asian Language Processing) in Taiwan. After the conference, each presenter received a u-disk with all accepted papers in PDF format. So when I got back to Beijing, I began going through the papers to extend my learning. Usually, when
SAS Event Stream Processing (ESP) can not only process structured streaming events (a collection of fields) in real time, but also has very advanced features for the collection and analysis of unstructured events. Twitter is one of the most well-known social network applications and probably the first that comes to
This is my second article about voice of customer analysis; you can find the first here. Last time we discussed how a simple sentiment polarity score was a rather narrow view. This time we will examine a more insightful approach, using voice of customer analysis to monitor customers’ opinions
Don’t get me wrong. I have no doubt about the capabilities of our SAS products and SAS solutions! But I wanted to get firsthand experience of our new solution for text analytics, SAS Contextual Analysis 14.1. And the result is very convincing! But let’s start from the beginning. Functions
This is the first of two articles looking at how to listen to what your customers are saying and act upon it – that is, how to understand the voice of the customer. Over the last few years, one of the big uses for SAS® Text Analytics has been to
Is cognitive computing an application of text mining? If you have asked this question, you are not alone. In fact, lately I have heard it quite often. So what is cognitive computing, really? A cognitive computing system, as stated by Dr. John E. Kelly III, is one that has the
Hi, there! First of all, let me introduce myself, as this is my first blog. I am Simran Bagga, and three weeks ago I became the Product Manager for Text Analytics at SAS. This role might be new to me, but text analytics is not. For the past 12 years,
In today’s world of instant gratification, consumers want – and expect – immediate answers to their questions. Quite often, that help comes in the form of a live chat session with a customer service agent. The logs from these chats provide a unique analysis opportunity. Like a call center transcript,
Recently, I have been thinking about how search can play more of a part in discovery and exploration with SAS Text Miner. Unsupervised text discovery usually begins with a look at the frequent or highly weighted terms in the collection, perhaps includes some edits to the synonym and stop lists,
Analyzing text is like a treasure hunt. It is hard to tell what you will end up with before you start digging and the things you find out can be quite unique, invaluable and in many cases full of surprises. It requires a good blend of instruments like business knowledge,
The benefits of big data often depend on taming unstructured data. However, in international contexts, customer comments, employee notes, external websites, and the social media labyrinth are not exclusively written in English, or any single language for that matter. The Tower of Babel lives and it is in your unstructured
When I ask people what they know about Denmark they often mention Hans Christian Andersen. He was born in Denmark in 1805 and is one of the most adored children’s authors of all time. Many of his fairy tales are known worldwide as they have been translated into more than
This article is co-authored by Biljana Belamaric Wilsey and Teresa Jade, both of whom are linguists in SAS' Text Analytics R&D. When I learned to program in Python, I was reminded that you have to tell the computer everything explicitly; it does not understand the human world of nuance
Today’s natural language processing (NLP) systems can do some amazing things, including enabling the transformation of unstructured data into structured numerical and/or categorical data. Why is this important? Because once the key information has been identified or a key pattern modeled, the newly created, structured data can be used in
A super hot topic in most organizations is how to make the most of the troves of social data available. This Post-It Note author isn't specific about the SAS solution that is being used, so I'm going to speculate that he or she is taking advantage of SAS Text Miner, SAS Text