Predictive modelling solutions typically rely on the availability of good-quality data and on models estimated in advance from historical data. For example, a credit scoring framework that predicts a customer's propensity to default requires clean, processed transactional and demographic data.
Datasets used for this type of analysis are first captured through conventional data gathering channels such as forms and third-party databases, and then processed and delivered in batch. This process assumes that scoring and alert generation are not time-critical. In other words, it is assumed that there is sufficient time (hours rather than minutes or even seconds) before the window for acting on an alert expires. It is also assumed that the data required about the individual or entity to be assessed will become available through conventional data gathering exercises, such as a bank form submitted by the individual, an agent or a third party.
What about time dependent situations?
Whilst these assumptions hold true in credit scoring, there are situations where they are unrealistic. Risk assessments in customs, border control, front-line policing and even anti-money laundering operations are all time dependent situations where the action window of an alert expires in hours or minutes. In addition, the data required to risk assess an entity are not always collected through conventional methods. For example, a terrorist will not submit a form with their details. Similarly, travellers entering a customs room with goods are very unlikely to submit a long form detailing the contents of their luggage. Furthermore, border controls typically rely on information from passports, yet individuals’ expressions, movements, tone of voice and even body language all carry valuable information which can considerably enhance subsequent risk assessments.
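To make the idea of an action window concrete, here is a minimal Python sketch. The function name, the timestamps and the window lengths are all hypothetical values chosen for illustration; they are not taken from any real system.

```python
from datetime import datetime, timedelta


def is_actionable(event_time: datetime, alert_time: datetime,
                  action_window: timedelta) -> bool:
    """Return True if the alert is raised while action is still possible."""
    return alert_time - event_time <= action_window


# Hypothetical numbers, for illustration only.
event = datetime(2024, 1, 1, 9, 0)
batch_alert = event + timedelta(hours=6)        # scoring delivered in batch
realtime_alert = event + timedelta(seconds=2)   # scoring inside a real time flow

credit_window = timedelta(days=1)      # assumed: hours of delay are acceptable
border_window = timedelta(minutes=5)   # assumed: the traveller moves on quickly

print(is_actionable(event, batch_alert, credit_window))     # True
print(is_actionable(event, batch_alert, border_window))     # False
print(is_actionable(event, realtime_alert, border_window))  # True
```

The same batch delay that is harmless for credit scoring makes the alert useless at the border, which is why the scoring has to move into the real time flow.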
Leveraging image processing and related AI models can considerably enhance the data available for modelling and risk assessment in real time. The following paragraph explains how.
SAS Viya is an in-memory architecture which allows models to be trained on low-level image data using various pre-processing and hierarchical representation methods. Andrew Pease gave a good overview of how to achieve this, with great use cases, in his recent blog. Once models have been trained, classifications can be made on test datasets, and the resulting outcomes can be fed into the real time decisioning flow.
For a syntax example, see Figure 1 below; visual examples are presented in Figures 2 and 3.
Figure 1: Model specifications using SAS Viya actionset in Jupyter Notebook
Figure 2: Pedestrian detection
Figure 3: Feature extractions from images
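For readers who want a feel for what hierarchical feature extraction means, the toy Python sketch below applies one convolution filter, a ReLU and a max-pooling step to a tiny synthetic image. It is not SAS Viya syntax and it is not the model specified in Figure 1; the image, the kernel and the function names are assumptions made purely for illustration.

```python
def convolve2d(image, kernel):
    """'Valid' 2-D cross-correlation (no padding) on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Keep positive responses, zero out the rest."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# Toy 6x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Vertical-edge kernel: responds to dark-to-bright transitions, left to right.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

edges = relu(convolve2d(image, kernel))  # low-level feature map (4x4)
features = max_pool(edges)               # coarser summary (2x2)

print(edges[0])   # [0, 3, 3, 0]
print(features)   # [[3, 3], [3, 3]]
```

The filter responds only where the image changes from dark to bright, and pooling summarises that response over a coarser grid. Deep learning models stack many such layers and learn the filters from data rather than fixing them by hand.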
An alert is generated to flag whether someone should be considered a threat. Apart from its obvious merits, this approach minimizes false positives, which could be costly both to the parties involved and to the entity.
Once the real time flow has been completed, web services can be triggered immediately, generating alerts that indicate whether an identified activity is a genuine issue.
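As a rough sketch of this last step, the Python below turns a model score into an alert payload that a web service could send. The function name, the field names and the threshold of 0.9 are hypothetical; a real deployment would use whatever the decisioning flow and the receiving system define.

```python
import json


def build_alert(entity_id, threat_probability, threshold=0.9):
    """Return a JSON alert string if the score reaches the threshold, else None.

    All names and the default threshold are illustrative assumptions.
    """
    if not 0.0 <= threat_probability <= 1.0:
        raise ValueError("threat_probability must be between 0 and 1")
    if threat_probability < threshold:
        return None  # below threshold: no alert is raised
    return json.dumps({
        "entity_id": entity_id,
        "score": threat_probability,
        "alert": True,
    })


# Hypothetical scores from an image classification model.
print(build_alert("camera-01/frame-0042", 0.95))
print(build_alert("camera-01/frame-0043", 0.40))
```

Raising the threshold reduces false positives at the cost of missing more genuine issues, so its value is a business decision as much as a modelling one.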
What are the business benefits?
Analytics and data science teams are often not optimally positioned within an organisation to achieve tangible success. Building fancy algorithms without setting business goals and integrating analytic outcomes into the organisation's operational systems is a wasted opportunity. Dr. Steven O’Donoghue, in his recent blog, details how to set up a high-performing analytics team within an organisation for success.
The benefit of augmenting existing datasets with image data can be tremendous. Time dependent situations where traditional data are not readily available can reap dividends from using images to track different activities. These images could be used to improve the existing system, decrease errors and provide a source of information not otherwise available. In banking, image processing can be used alongside existing datasets to screen loan and mortgage applications, detect credit card fraud and monitor activity around ATMs. Customs agencies can be supported with proven methods to prevent individuals with past criminal records from smuggling hazardous items into the country. Images streaming from CCTV at airports, borders and dockyards can be used to keep our country safe.
How do you use image processing datasets within your organisation to tackle existing business problems?
Can hackathons help?
Machine learning in general, and image processing in particular, are seeing intense development. Hackathons have emerged as a useful way to engage enthusiastic practitioners to push the frontier faster. But does everyone appreciate the value of hackathons, and understand the differences between them and datathons?
We hosted a digital discussion on Twitter to explore the following themes:
- What do you see driving hackathons?
- Who do they benefit most?
- How are differences between hackathons and datathons evolving?
- What have been the most interesting data science related competition topics?
- How have hackathons inspired you?
Read highlights of this discussion in Storify.