Every day, military intelligence analysts sit behind computers reading a never-ending stream of reports, updating presentation templates and writing assessments. But intelligence is more than documenting events and sharing breaking news. It involves understanding and predicting complexities in human behavior across various organizational constructs and using facets of information to make assessments on past, current and future circumstances. The availability of both human and technological resources has a large effect on the quality of these assessments.
For the past two decades, technology has advanced at an incredible rate. Yet the technology available to intelligence analysts has not comparably progressed. This is due to a combination of tedious software acquisition processes, lack of requisite data science skill sets and a reluctance to make systemic changes to intelligence training and processes.
Military intelligence schoolhouse content at the entry level is heavily focused on procedural manuals and briefing skills. Very little curriculum is dedicated to the process of analyzing data and making assessments. More advanced courses introduce structured techniques like analysis of competing hypotheses, but service members enrolled in these courses are rarely able to put into practice techniques so far removed from operational demands.
Analysts still spend much of their day data mining and reporting rather than thinking critically about information and making in-depth assessments. It’s the nature of the job and reflects the expectations from consumers of the information they provide. It is difficult to hone analytic skills when a primary expectation is that you produce a daily significant activity slide.
Every analyst has their go-to data mining tool, but very few military analysts are leveraging the power of natural language processing (NLP) and machine learning. These two capabilities can sharpen an analyst’s focus from data mining to data discovery, and support deeper analysis.
NLP and artificial intelligence
NLP is a branch of artificial intelligence that helps computers understand, manipulate and interpret human language. Analyst firms that study technology trends and evaluate best practices have found that the best way to analyze large volumes of unstructured data is through a hybrid approach of NLP, machine learning and human-generated rules. Rules are what the intelligence community is most familiar with. With a rules-only approach, the information returned to the analyst is only as good as the rules that are written, and the quality of written rules often depends on the analyst’s experience.
For ease of explanation, here's a simple example of a rules-based approach a military analyst may use to monitor improvised explosive device (IED) activity. In this example, the analyst explicitly tells the machine what to find, and the machine returns only the documents that exactly match the query. As written, this query would return documents where various types of IEDs are found within 10 words of terms representing military forces like vehicle, patrol or unit.
But this query would not return any reports that use abbreviations for various types of IEDs such as VBIED, RCIED, PBIED, PPIED or CWIED because those possibilities aren’t accounted for in the rule. A seasoned analyst can easily point out additional flaws with using this query as written, but the focus of this article is how NLP and machine learning can supplement analyst efforts to efficiently sift through data and provide more time for analysis.
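To make the limitation concrete, here is a minimal Python sketch of a proximity rule of this kind. It is not the query from the original example; the term lists, the sample reports and the function names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical term lists, assumed for illustration only.
IED_TERMS = {"ied", "ieds"}
FORCE_TERMS = {"vehicle", "patrol", "unit"}
WINDOW = 10  # maximum distance, in words, between the two kinds of terms


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_rule(text, window=WINDOW):
    """Return True if an IED term appears within `window` words of a force term."""
    tokens = tokenize(text)
    ied_positions = [i for i, t in enumerate(tokens) if t in IED_TERMS]
    force_positions = [i for i, t in enumerate(tokens) if t in FORCE_TERMS]
    return any(abs(i - j) <= window for i in ied_positions for j in force_positions)


# Invented sample reports.
reports = [
    "An IED detonated near a patrol on the main supply route.",
    "A VBIED struck a vehicle checkpoint at dawn.",
]
results = [matches_rule(report) for report in reports]
print(results)
```

The first report matches, while the second is missed because "VBIED" is not in the term list. The rule finds only what the analyst thought to write down.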
Natural language processing tasks
To better understand the benefits that NLP and machine learning can provide military analysts, let's look at some of the specific NLP tasks that can go beyond the manual, rules-based approach.
Automatic rule generation
NLP and machine learning can significantly reduce the time of writing rules like the IED example above. Probabilistic semantics is a semi-supervised machine learning method that automatically generates rules. An analyst can simply click on key words that show up in their discovery process and rules will be immediately generated to create a query or entity for persistent monitoring. The rules can then be edited if needed by human experts, but the bulk of the work is done by algorithms. This allows analysts to quickly explore additional search avenues and edit entities rapidly rather than spending a significant amount of time hand-tuning individual queries.
Machine learning algorithms require input vectors to be values rather than plain text. The language model for word embeddings creates numeric coordinates that place terms in space, clustering words close together that are used in similar ways. Synonyms, or like terms, should have similar coordinates. For example, terms commonly related to homemade explosives (HMEs) would be clustered similarly in documents, because they are used in comparable ways throughout reporting. Analysts don’t have to solely depend on the rules they have written to capture all relevant reporting on the topic. This is a very concrete way artificial intelligence can augment the process of intelligence analysis.
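The idea that terms used in similar ways end up with similar coordinates can be sketched with plain co-occurrence counts. This is a simplified stand-in, not a trained word embedding model: each word's vector simply counts the other words that share a sentence with it. The toy corpus and function names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Invented toy corpus; real reporting would be far larger.
corpus = [
    "cache of ammonium nitrate found in compound",
    "cache of potassium chlorate found in compound",
    "patrol met village elders at the market",
]


def build_vectors(sentences):
    """Map each word to counts of the words that share a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:
                    vectors[word][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = build_vectors(corpus)
similar = cosine(vectors["nitrate"], vectors["chlorate"])
unrelated = cosine(vectors["nitrate"], vectors["elders"])
print(round(similar, 3), round(unrelated, 3))
```

In this toy corpus the two precursor terms score much closer to each other than either does to an unrelated word, even though no rule links them. Production systems typically learn dense vectors from large volumes of text rather than using raw counts.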
Sequence mapping is a semi-supervised machine learning method that can be used to automatically identify and extract entities such as people, places, organizations and currencies from documents or messages. It's also helpful in parsing, because it will detect different versions of a word without the analyst having to write specific rules about word variants. This means that entities identified by the machine as “PERSON” and any variations would be automatically labelled and extracted, because NLP provides a foundation for the machine to learn how a name is constructed in the human dialect it's analyzing. This is another example of how natural language processing can simplify the process of analyzing large amounts of data pouring in daily.
In 2010 and 2011, there were dozens of indicators and warnings being monitored to detect civil unrest and political instability, yet somehow the intelligence community failed to predict the Arab Spring and its successive revolutions throughout the region. The data being monitored was based on our bias of what indicates civil unrest and political instability.
Perhaps NLP and machine learning could have uncovered different trends that would have altered the rules-based indicators and warnings the intelligence community was tracking. For example, unsupervised machine learning for topic discovery uses mathematical models to detect term relationships across documents. The algorithms aren’t given specific guidance on what terms or relationships to extract, but instead pull out words that often appear in proximity to each other.
This is one way artificial intelligence can address the conundrum of, “you don’t know what you don’t know,” by allowing analysts to uncover patterns that may not be captured in existing rules.
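A rough sketch of the underlying intuition is to count which terms appear near each other across a set of reports without telling the machine what to look for. This is simple co-occurrence counting, not a full topic model, and the reports, stopword list and window size are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "in", "of", "and", "over", "at", "on", "to"}

# Invented sample reports.
reports = [
    "Protests over bread prices spread in the capital",
    "Students joined protests over bread prices",
    "Bread prices rose again and protests followed",
]


def cooccurring_pairs(docs, window=3):
    """Count term pairs that occur within `window` content words of each other."""
    counts = Counter()
    for doc in docs:
        tokens = [t for t in re.findall(r"[a-z]+", doc.lower()) if t not in STOPWORDS]
        for i, term in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != term:
                    counts[tuple(sorted((term, other)))] += 1
    return counts


pair_counts = cooccurring_pairs(reports)
for pair, count in pair_counts.most_common(3):
    print(pair, count)
```

Pairs that recur across reports, such as "bread" with "prices" and "prices" with "protests", rise to the top on their own. An analyst could then decide whether the pattern deserves a new indicator.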
The military area of operations has changed from a battlefield to an operating environment that includes complex civil-military relationships and nation building efforts. Sentiment models can automatically extract positive, negative and neutral sentiment to provide input on local populace perspectives towards ongoing military presence and operations in a given area.
Sentiment, easily extracted with NLP, is different from emotion, which requires a more rules-based approach. Automated sentiment analysis can add considerable value for understanding populace opinions within a civil-military environment. Depending on the use case, emotion may be more appropriate to monitor at a micro level, as it can reveal complexities around intent, motivators and deterrents.
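As a simplified illustration of positive, negative and neutral labelling, the sketch below scores statements against two small word lists. Sentiment models used in practice are usually trained on labelled text; the lexicons, statements and function name here are invented for the example.

```python
import re

# Hypothetical sentiment lexicons, assumed for illustration only.
POSITIVE = {"welcome", "welcomed", "safe", "support", "grateful", "improved"}
NEGATIVE = {"angry", "fear", "unsafe", "oppose", "resent", "protest"}


def sentiment(text):
    """Label text by comparing counts of positive and negative lexicon words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented sample statements.
statements = [
    "Residents welcomed the patrol and feel safe at the market",
    "Shopkeepers are angry and fear more checkpoints",
    "The convoy passed through the district at noon",
]
labels = [sentiment(s) for s in statements]
print(labels)
```

Aggregating labels like these by area and time period is one way such output could feed a picture of local attitudes.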
Text summarization is another aspect of NLP that holds great potential for the intelligence space. Reporting formats are often standard and summary in nature. AI can automatically summarize content in a corpus of documents by extracting key themes.
The summary isn’t meant to be a substitute for analysis, but rather a recap of what is presented in a specific report. Generative summaries still largely rely on templates that fit into a workflow for standardized reporting. This would be most useful in describing what is contained within finished intelligence products to improve internal search capabilities and improve tagging systems for tracking finished products.
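One simple way to recap a report is extractive summarization: score each sentence by how frequent its words are in the whole report and keep the top-scoring ones. The sketch below is a frequency-based illustration, not the template-driven generative approach mentioned above, and the stopword list, sample report and function name are assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "in", "on", "at", "a", "of", "and", "was", "most", "throughout"}


def content_words(text):
    """Lowercase tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


# Invented sample report.
report = (
    "Attacks on checkpoints increased in the northern district. "
    "Most attacks targeted police checkpoints in the district at night. "
    "Weather remained clear throughout the week."
)
summary = summarize(report)
print(summary)
```

Because the output is lifted directly from the report, it works as a recap or search tag rather than as an assessment.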
Adding NLP to military processes
Once analysts can efficiently get the right information to analyze, the next steps for making sense of the data and creating assessments should involve a combination of standard operating procedures, structured analytic techniques, human experience and multiple disciplines in artificial intelligence.
It's difficult to find a custom, off-the-shelf solution to address the intelligence community’s needs in AI due to the sensitive nature of intelligence, high operational tempo and rapidly changing priorities. There should be an investment in tools that support NLP and machine learning along with embedded data scientists who can work in tandem with analysts to extract critical information needed for military operations.
Few fields are tasked with sifting through and analyzing unstructured text more than intelligence. It's time to embrace the machine learning capabilities behind NLP and allow analysts to work smarter, not harder.
Embedding data scientists into intelligence teams can strengthen the quality of data extraction and help highlight trends in the data for further exploration by analysts. Continued partnerships between data scientists and intelligence analysts can revolutionize how we interpret complex circumstances and assess future probabilities.