Just got back from two days at
Predictive Analytics World in Washington DC. The event's chairperson,
Eric Siegel, has an excellent conference on his hands! The attendees are analytically savvy and are using a variety of approaches to deliver value to organizations. It was a very invigorating couple of days!
I have several highlights to share but I'll start with
Keep Winning the Eternal Fraud Battles from Elder Research's
Antonia de Medinaceli.
It was a great presentation - and I think there were some very tasty bites that those engaged in perfecting their organization's fraud strategy should consider. What were de Medinaceli's suggestions?
There were 8 key recommendations and the audience of cross-industry analytics experts - or 'numerati' as keynote speaker
Stephen Baker would call them - paid rapt attention.
1) Compensate for needles in the haystack. de Medinaceli recommended that organizations pay attention to potential pitfalls with
up-sampling and
down-sampling and remember to set up training and testing sets FIRST, before any resampling is done.
2) Pre-process your data carefully. With fraud, it's even more important to handle data integrity issues. de Medinaceli recommends that organizations remember that
fuzzy matching is crucial, borrow techniques from
text mining and, when possible, adapt the data entry GUI to improve data quality (for example, use guided drop-down menus instead of free-form fields).
3) Be aware of institutional challenges. An effective fraud strategy has to consider cultural impacts. de Medinaceli highlighted three things that help adoption go more smoothly: being sensitive to PR considerations, using thoughtful terminology (for example, the process will "flag" potential fraud versus "decide" fraud) and remembering that fraud investigators can fear that a solution will replace them. An interesting point was that Elder Research has found that most clients implementing a fraud framework end up growing that part of the organization.
4) Set misclassification costs at time of training. A missed fraud case and a false alarm rarely cost the same, so set your thresholds with those costs in mind!
5) Consider your workload constraints. Work to improve your hit:scan ratio. (For example, a ratio of 1:200 means you have to look at 200 cases to find 1 fraudulent case.) As you work to improve the ratio - and also consider the volume of cases you can process - you will find an additional benefit. Investigators often have a great deal more job satisfaction because case work and volume are rewarding.
6) Insist on accurately labeled historical data. One of the examples shared: when cases flagged for fraud are investigated and turn out to be false positives, you want to feed that outcome back into the system so it is included in the analysis and improves the learning.
7) Look for collusion. de Medinaceli said that "Breakout Fraud", or collusion where every member flies below the radar with measured small amounts of fraud or one big case, is popular right now. Link analysis algorithms are very useful in detecting this type of fraud.
8) Prepare for an ever-changing landscape. Fraudsters constantly refine and expand schemes. de Medinaceli said the fraud model has to be closely guarded and updated far more frequently than traditional models, and that constantly growing subject matter expertise is crucial.
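Recommendations 1 and 5 are easy to picture with a little code. What follows is my own minimal Python sketch, not anything shown in the talk. It assumes each case is a dict with a hypothetical `is_fraud` flag and that some model has already produced a score per case. The function names, the 30% test fraction and the 5:1 down-sampling ratio are illustrative assumptions only.

```python
import random


def split_then_downsample(records, test_fraction=0.3, neg_per_pos=5, seed=0):
    """Hold out a test set FIRST, then down-sample non-fraud cases
    in the training portion only. The test set keeps its natural fraud rate."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]

    fraud = [r for r in train if r["is_fraud"]]
    legit = [r for r in train if not r["is_fraud"]]

    # Keep at most neg_per_pos legitimate cases per fraud case (assumed ratio).
    keep = min(len(legit), neg_per_pos * len(fraud))
    train_balanced = fraud + rng.sample(legit, keep)
    rng.shuffle(train_balanced)
    return train_balanced, test


def scans_per_hit(scores, labels, threshold):
    """Cases an investigator reviews per confirmed fraud at a given score
    threshold, i.e. the N in a 1:N hit:scan ratio."""
    flagged = [label for score, label in zip(scores, labels) if score >= threshold]
    hits = sum(flagged)
    return len(flagged) / hits if hits else float("inf")


# Toy data: 1,000 cases, 10 of them fraudulent.
records = [{"id": i, "is_fraud": i < 10} for i in range(1000)]
train, test = split_then_downsample(records)

# Three cases score above 0.5 and one of them is fraud, so the ratio is 1:3.
ratio = scans_per_hit([0.9, 0.8, 0.7, 0.1], [1, 0, 0, 1], threshold=0.5)
```

Splitting before resampling means the hit:scan ratio you measure on the held-out cases reflects the real-world fraud rate your investigators will actually face, rather than the artificially balanced mix used for training.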
I have so many more tidbits to share from my time at the conference, and I highly recommend that you put the next
Predictive Analytics World San Francisco (Feb 16-17, 2010) on your list!