
SAS' Ali Dixon and Mary Osborne reveal why a BERT-based classifier is now part of the natural language processing capabilities of SAS Viya.
Editor's note: This article follows Curious about ChatGPT: Exploring the origins of generative AI and natural language processing. As ChatGPT has entered the scene, much fear and uncertainty has been expressed by those working in education at all levels. Educators worry about cheating and rightly so. ChatGPT can do everything
How did we get to a place where a conversational chatbot can quickly create a personalized letter? Join us as we explore some of the key innovations over the past 50 years that help inform us about how to respond and what the future might hold.
SAS' Kirk Swilley and Tom Sabo showcase how you can perform text analysis on minimally structured narrative data to spot patterns of possible human trafficking.
Imagine trying to dig a useful bit of information out of 50,000 lines of a chat log. Now imagine that needle in the haystack was the difference between a criminal being arrested or staying at large. Thousands of lines of confusing and unreadable chat text are more and more frequently
Using such features and Natural Language Processing capabilities like text parsing and information extraction in SAS Visual Text Analytics (VTA) helps us uncover emerging trends and unlock the value of unstructured text data.
To find exact duplicates, comparing every pair of strings is the simplest approach, but it is neither efficient nor sufficient. Hashing each document with MD5 or SHA-1 gives the same correct result faster, yet near-duplicates still go undetected. Text similarity is useful for finding files that look alike. There are various approaches, and each has its own definition of which documents count as duplicates. That definition, in turn, affects the type of processing required and the results produced. Below are some of the options. Using SAS Visual Text Analytics, you can customize and accomplish this task during your corpus analysis journey with either the Python SWAT package or PROC SQL in SAS.
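As a rough illustration of the two ideas above, the following plain-Python sketch groups exact duplicates by SHA-1 digest and then flags near-duplicates with Jaccard similarity over word shingles. It does not use SAS Visual Text Analytics, SWAT, or PROC SQL; the function names, the shingle size `k`, and the `threshold` value are all assumptions chosen for the example, and Jaccard similarity is only one of several possible ways to define a near-duplicate.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def exact_duplicate_groups(docs):
    """Group document indices whose text is byte-for-byte identical."""
    groups = defaultdict(list)
    for idx, text in enumerate(docs):
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        groups[digest].append(idx)
    # Keep only digests shared by more than one document.
    return [ids for ids in groups.values() if len(ids) > 1]


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased document."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def near_duplicate_pairs(docs, threshold=0.5, k=3):
    """Return index pairs whose shingle sets overlap at or above threshold.

    Compares all pairs, so it is only suitable for small collections.
    """
    sets = [shingles(d, k) for d in docs]
    return [
        (i, j)
        for i, j in combinations(range(len(docs)), 2)
        if jaccard(sets[i], sets[j]) >= threshold
    ]


docs = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy cat",
    "completely unrelated sentence about text analytics",
]
exact = exact_duplicate_groups(docs)
near = near_duplicate_pairs(docs)
```

Hashing finds only the identical pair, while the similarity pass also catches the document that differs by a single word. The threshold controls how strict the near-duplicate definition is, which is why the choice of definition shapes the results.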
Corpus analysis is a technique widely used by data scientists because it builds an understanding of a document collection and yields insights into the text.
It is increasingly possible to use text analytics to explore different types of data. When a news story this summer caught my eye, I decided to see if I could use SAS Visual Text Analytics (VTA) and SAS Visual Analytics (VA) on customer complaints to provide information that might be
With the release of SAS Viya 2020.1.4, text categories and concept models can now be deployed into production with just a few clicks and used to score data in batch and via API! You can also now use these models in decision flows.
Text analytics: Theoretically, a telecoms company, say, could use topic modelling to look at product reviews divided into themes.
The Text Investigation Framework is a flexible solution for addressing text challenges across several domains. It was designed to create a process for turning unstructured text data into a decisioning system.
Government institutions, whether in defense, transportation, public services, security, or health care, face both a challenge and an opportunity: making sense of enormous and ever-growing volumes of unstructured text. More than 80% of
The Text Investigation Framework utilizes several technologies built on SAS Viya, including SAS Visual Text Analytics, SAS Visual Data Mining and Machine Learning, and SAS Visual Investigator. SAS Visual Investigator acts as the orchestrator to surface the results. With its broad set of capabilities, SAS Visual Investigator can perform scenario authoring, alert generation and disposition, and comprehensive workflow to gather vital outcomes and feedback.
I think that this pandemic has put digital transformation at the top of every executive agenda.
Critics of sports analytics (and there are some entertaining ones) love to point out that analytics isn’t capable of capturing the things that don’t show up on a box score. A player who dives on the floor to save a loose ball, a quarterback strategically misleading a defender to free
At the end of March, the German government sponsored a hackathon called #WirVsVirus. The aim was to bring Germany’s collective coding expertise to bear on some of the many problems surrounding COVID-19. In total, more than 27,000 coders joined the challenge, working from home, and programming for 48 hours from
Natural Language Processing can offer invaluable benefits to councils and increase resident satisfaction.
A major UK insurance company used text analytics to categorise complaints.
Analyzing tweets is challenging because of their succinctness (max 280 characters). However, that task is facilitated by the powerful features of SAS Visual Text Analytics (VTA), which includes embedded machine learning algorithms.
Measures that financial services firms can take to keep customer complaints to a minimum.
Generating a word cloud (also known as a tag cloud) is a good way to mine internet text. Word (or tag) clouds visually represent the occurrence of keywords found in internet data such as Twitter feeds.
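A word cloud is driven by keyword frequencies, so the counting step can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the `keyword_frequencies` function, the tiny `STOPWORDS` set, and the sample texts are hypothetical, and it does not render the cloud itself or fetch any Twitter data.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stopword list for illustration only.
STOPWORDS = {"the", "and", "for", "about", "with", "this", "that"}


def keyword_frequencies(texts, top_n=10):
    """Count keyword occurrences across texts; a word cloud would size
    each word in proportion to these counts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Drop stopwords and very short tokens such as "is" or "on".
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(top_n)


sample_texts = [
    "SAS text analytics is great",
    "Text analytics on tweets",
    "tweets about SAS",
]
frequencies = dict(keyword_frequencies(sample_texts))
```

The resulting word-to-count mapping is the kind of input a word cloud renderer would typically take.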
If you consume NBA content through social media, then you know just how active that online community is. Basketball arguments and ‘hot takes’ on the Internet are about as commonplace as Michael Jordan playing golf instead of running a functional NBA front office. I wondered if NBA fans happened to
What if, beyond the new organization of the means of production, the fourth industrial revolution also brought about a significant change in how the knowledge intrinsic to each domain is managed? And what if new digital technologies allowed operational staff to easily access this knowledge, most often the fruit of methods
Natural language understanding (NLU) is a subfield of natural language processing (NLP) that enables machine reading comprehension. While both understand human language, NLU goes beyond the structural understanding of language to interpret intent, resolve context and word ambiguity, and even generate human language on its own. NLU is designed for
Recently, the North Carolina Human Trafficking Commission hosted a regional symposium to help strengthen North Carolina’s multidisciplinary response to human trafficking. One of the speakers shared an anecdote from a busy young woman with kids. She had returned home from work and was preparing for dinner; her young son wanted
Structuring a highly unstructured data source: Human language is astoundingly complex and diverse. We express ourselves in infinite ways. It can be very difficult to model and extract meaning from both written and spoken language. Usually the most meaningful analysis uses a number of techniques. While supervised and unsupervised learning,
The Special Olympics is part of the inclusion movement for people with intellectual disabilities. The organisation provides year-round sports training and competitions for adults and children with intellectual disabilities. In March 2019 the Special Olympics World Games will be held in Abu Dhabi, United Arab Emirates. SAS is an official
There is tremendous value buried in text sources such as call center and chat dialogues, survey comments, product reviews, technical notes, and legal contracts. How can we extract the signal we want amidst all the noise?
Amidst the growing popularity of modern machine learning and deep learning techniques, one of the biggest challenges is the ability to obtain large amounts of training data suitable for your use case. This post discusses how the analytical approach for Named Entity Recognition (NER) can help.