Word clouds have been available in SAS Visual Analytics for a while now, but recently, sentiment analysis was added to their functionality.
For those of you not familiar with word clouds, a word cloud, also known as a tag cloud, is a visual representation of text data. You probably see one or more word clouds every day when you peruse the web, as they are increasingly being added to web pages. If you look at the right side of the web page that contains this blog, you will see a word cloud (Tags) that shows the most frequent topics. You can see by their size that SAS Global Forum and SAS Administrators are some of the most frequently blogged-about topics.
In the age of social media, blogs, tweets, online reviews, ratings and recommendations, the ability to take unstructured data and analyze it for sentiment is key to a competitive advantage. Being able to analyze this data, understand customers’ opinions about various products and services, filter out the noise and find relevant content that can be acted upon are some of the advantages of using sentiment analysis.
To see how the new sentiment analysis works, let’s start by creating a new word cloud. I’ll use data from some of our SAS employee blogs at blogs.sas.com to illustrate.
Creating the word cloud
I downloaded a .csv file that contains information on the topics that SAS employees have been sharing with their peers through April 27, 2015. I imported the .csv file into SAS Visual Analytics and created a word cloud in the SAS Visual Analytics Explorer. Because I imported the data, by default, it was loaded to the Public LASR Server, so I made sure to enable the English stop list on the Public LASR Server.
Stop lists. Stop lists are used in text analysis to remove common words from the analysis and filter out noise. In the Properties tab for the word cloud, I entered 6 for the Maximum topics. Below is the initial word cloud, and you can see that for the first topic (analytics, social, +business, +medium), the words analytics, business, social and medium are the most important terms in the topic. No surprise that they also end up being the terms used to describe the topic!
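To make the stop list idea concrete, here is a minimal Python sketch of counting terms while skipping common words. It is only an illustration of the general technique, not how SAS Visual Analytics implements it. The stop list, the function name term_counts and the sample sentences are all made up for this example.

```python
import re
from collections import Counter

# Hypothetical, very short stop list for illustration only.
# A real English stop list contains many more words.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"}


def term_counts(documents, stop_words=STOP_WORDS):
    """Count terms across all documents, skipping words on the stop list."""
    counts = Counter()
    for text in documents:
        # Lowercase the text and split it into simple word tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts


# Made-up sample documents.
docs = [
    "The business of analytics is the business of insight",
    "Social media and analytics for the business",
]
counts = term_counts(docs)
print(counts.most_common(2))
```

Without the stop list, words like "the" and "of" would dominate the counts and crowd out the terms that actually describe the content.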
Stemming. The + sign in front of a topic term indicates that stemming is being used. Stemming consolidates all forms of a word into one term. For example, the terms “make”, “makes”, “made” and “making” would be stemmed to +make. Stemming along with a stop list makes for a more concise word cloud. So this initial view of the word cloud shows us the different topics that were identified and, within each topic, the most relevant terms.
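As a rough illustration of what stemming does to the term counts, the Python sketch below maps several surface forms to one base form before counting. The lookup table BASE_FORMS and the function stem_term are hypothetical and hand-written for this example. Marking only consolidated terms with a + is an assumption based on the topic shown above, where some terms carry the sign and others do not. SAS Visual Analytics uses its own linguistic processing, which this does not reproduce.

```python
from collections import Counter

# Hypothetical lookup from surface form to base form, for illustration only.
BASE_FORMS = {"makes": "make", "made": "make", "making": "make"}

# Base forms that stand in for more than one surface form.
CONSOLIDATED = set(BASE_FORMS.values())


def stem_term(word):
    """Return the base form, prefixed with '+' when it consolidates several forms."""
    word = word.lower()
    base = BASE_FORMS.get(word, word)
    # Assumption: only terms that merge several forms get the '+' marker.
    return "+" + base if base in CONSOLIDATED else base


# Made-up tokens from a document.
tokens = ["make", "makes", "made", "making", "analytics"]
stemmed_counts = Counter(stem_term(t) for t in tokens)
print(stemmed_counts)
```

The four forms of "make" collapse into a single +make entry with a count of 4, so the word cloud shows one larger term instead of four smaller ones.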
Enabling sentiment analysis
Now let’s enable sentiment analysis. This task is just a matter of selecting a checkbox on the Properties tab of the word cloud. Here’s the updated word cloud, and you can see that the topic list is now colored by sentiment (green = positive, yellow = neutral and red = negative). In the upper right of the word cloud, you see a reference to how many documents are considered to be positive (406), neutral (1,687) and negative (193) for the selected topic.
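To put those three counts in perspective, a quick bit of arithmetic helps. The Python sketch below uses only the numbers quoted above and turns them into a total and percentage shares. The variable names are just for this example.

```python
# Document counts for the selected topic, as shown in the word cloud.
sentiment_counts = {"positive": 406, "neutral": 1687, "negative": 193}

# Total number of documents in the topic.
total = sum(sentiment_counts.values())

# Share of each sentiment as a percentage, rounded to one decimal place.
shares = {
    label: round(100 * count / total, 1)
    for label, count in sentiment_counts.items()
}

print(total)
print(shares)
```

That works out to 2,286 documents in the topic, of which roughly 17.8% are positive, 73.8% neutral and 8.4% negative.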
Working with the analysis
If you click on a term in the word cloud, the details table opens at the bottom of the word cloud. It shows each document in the topic that contains the selected term, the sentiment for the document, the relevance and any other fields that you assigned to the Document details role. In the screenshot below, the term business is selected and the details table has been expanded vertically. There are 225 documents that contain the term business.
Now that the details table is open with the term business selected, we can filter by sentiment just by checking or unchecking the boxes in the upper right of the details table. The screenshot below shows that only the positive box is checked and that there are 33 documents in the topic that have positive sentiment.
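Conceptually, the details table is applying two filters at once: the selected term and the checked sentiment boxes. Here is a minimal Python sketch of that logic. The Document class, the filter_details function and the sample posts are all hypothetical, and the sketch matches only the exact word, whereas the product would also match stemmed forms.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    text: str
    sentiment: str  # "positive", "neutral" or "negative"


def filter_details(documents, term, sentiments):
    """Keep documents that contain the term and have one of the chosen sentiments."""
    term = term.lower()
    selected = []
    for doc in documents:
        # Simple exact-word matching; no stemming is applied in this sketch.
        tokens = set(re.findall(r"[a-z']+", doc.text.lower()))
        if term in tokens and doc.sentiment in sentiments:
            selected.append(doc)
    return selected


# Made-up sample documents.
docs = [
    Document("Post A", "Analytics helps every business grow", "positive"),
    Document("Post B", "The business case was weak", "negative"),
    Document("Post C", "Social media tips", "positive"),
    Document("Post D", "Business as usual", "neutral"),
]

# Equivalent to leaving only the positive box checked.
positive_only = filter_details(docs, "business", {"positive"})
print([d.title for d in positive_only])
```

Checking more boxes corresponds to passing a larger set of sentiments, for example {"positive", "negative"}.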
In the real world, by focusing on either the positive or negative sentiment documents, it should become evident what customers like or don’t like about a product. Armed with this information, businesses can either find opportunities for new products or fix issues that customers might have with their products and services. I think you’ll agree that the addition of sentiment analysis within SAS Visual Analytics is another great example of how SAS is democratizing data.