SAS' Kirk Swilley and Tom Sabo showcase how you can perform text analysis on minimally structured narrative data to spot patterns of possible human trafficking.
Using Natural Language Processing capabilities such as text parsing and information extraction in SAS Visual Text Analytics (VTA) helps us uncover emerging trends and unlock the value of unstructured text data.
To find exact duplicates, comparing every pair of strings is the simplest approach, but it is neither efficient nor sufficient. Hashing each document with MD5 or SHA-1 gives a correct result faster, yet near-duplicates still go undetected. Text similarity is useful for finding files that look alike. There are various approaches, and each defines duplicate documents in its own way. That definition, in turn, has implications for the type of processing and the results produced. Below are some of the options. Using SAS Visual Text Analytics, you can customize and accomplish this task during your corpus analysis journey with either the Python SWAT package or PROC SQL in SAS.
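To make the distinction between exact and near-duplicates concrete, here is a minimal sketch in plain Python. It is not the SAS Visual Text Analytics, SWAT, or PROC SQL workflow the post refers to. It only illustrates the general idea: a SHA-1 digest catches exact copies, and a similarity score catches documents that merely look alike. The whitespace and case normalization, the 3-word shingles, the Jaccard measure, the 0.5 threshold, and all function names are assumptions chosen for illustration.

```python
import hashlib
import re
from collections import defaultdict
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace (an illustrative choice, not a rule)."""
    return re.sub(r"\s+", " ", text.strip().lower())


def exact_duplicate_groups(docs):
    """Group document indices whose normalized text shares a SHA-1 digest."""
    buckets = defaultdict(list)
    for i, doc in enumerate(docs):
        digest = hashlib.sha1(normalize(doc).encode("utf-8")).hexdigest()
        buckets[digest].append(i)
    return [ids for ids in buckets.values() if len(ids) > 1]


def shingles(text, k=3):
    """Return the set of k-word shingles of the normalized text."""
    words = normalize(text).split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def near_duplicate_pairs(docs, threshold=0.5, k=3):
    """Return index pairs whose shingle sets meet the similarity threshold."""
    sets = [shingles(d, k) for d in docs]
    return [
        (i, j)
        for i, j in combinations(range(len(docs)), 2)
        if jaccard(sets[i], sets[j]) >= threshold
    ]


docs = [
    "The quick brown fox jumps over the lazy dog",
    "the quick  brown fox jumps over the lazy dog",
    "The quick brown fox jumps over the lazy cat",
    "An entirely different sentence about text analytics",
]

print(exact_duplicate_groups(docs))  # [[0, 1]]
print(near_duplicate_pairs(docs))    # [(0, 1), (0, 2), (1, 2)]
```

Hashing flags only the first two documents, which are identical after normalization. The third differs by a single word, so its digest is completely different, but its shingle overlap with the first two (6 of 8 distinct shingles, or 0.75) is enough for the similarity check to catch it. Note that the pairwise comparison is quadratic in the number of documents, so a large corpus would need a more scalable scheme.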
Corpus analysis is a technique widely used by data scientists because it provides an understanding of a document collection and insights into the text it contains.
With the release of SAS Viya 2020.1.4, text categories and concept models can now be deployed into production with just a few clicks and used to score data in batch and via API! You can also now use these models in decision flows.
The Text Investigation Framework is a flexible solution for addressing text challenges across several domains. It was designed to create a process for turning unstructured text data into a decisioning system.
The Text Investigation Framework utilizes several technologies built on SAS Viya, including SAS Visual Text Analytics, SAS Visual Data Mining and Machine Learning, and SAS Visual Investigator. SAS Visual Investigator acts as the orchestrator to surface the results. With its broad set of capabilities, SAS Visual Investigator supports scenario authoring, alert generation and disposition, and a comprehensive workflow for gathering vital outcomes and feedback.
Analyzing tweets is challenging because of their succinctness (max 280 characters). However, that task is facilitated by the powerful features of SAS Visual Text Analytics (VTA), which includes embedded machine learning algorithms.