Everyone loves a good contest.
At Predictive Analytics World 2012 (or #pawcon if you're on Twitter), Kaggle’s head of analytic solutions, Karthik Sethuraman, spoke about “Crowdsourcing Predictive Analytics.” To date, around 30,000 data scientists have signed up on Kaggle to compete for cash prizes. Such competition pushes the envelope on predictive analytics.
An AllAnalytics blog described some recent competitions, speaking of the value of Kaggle’s diverse talent pool and the competitive dynamics online challenges build. The companies highlighted at PAW this year for tapping Kaggle's data mining competition platform clearly benefited from creative solutions.
Kaggle winners most often use highly customized solutions
When I asked Mr. Sethuraman how much he thought Kaggle spurs innovation among mainstream analytics vendors, he replied, “We don't directly work with analytic vendors to improve their offerings. That said, most of the competitions push the boundaries of existing algorithms and many contestants either invent their own machine learning algorithm or refine existing ones.
“For example, in the Wikipedia contests, one of the contestants made refinements to the standard Random Forest algorithm to give him the much-needed edge over other contestants. Through our blog and our participation in various analytic conferences, we disseminate the latest and greatest approaches that people use to develop predictive models. Analytic vendors, through this feedback, can refine their existing algorithms or include the upcoming and promising ones.”
In 2011, when I heard Kaggle founder, Anthony Goldbloom, speak to analytics practitioners, he said the majority of award winners reported that they used SAS Analytics in their winning entry. It's great knowing that so many of the winning analytic contestants are SAS customers.
I was curious about how much predictive analytical work comes through Kaggle, rather than being handled by in-house teams or consultants. Mr. Sethuraman said, “This is hard to answer as we are still running pilot projects for many clients. However, it is fair to say many of the clients realize -- when it comes to predictive model development -- Kaggle's platform is better, faster and cheaper. So we anticipate more of the analytic work will be done through Kaggle while the analytic teams at the client will take on a more strategic role by focusing on the key business problems to solve through superior analytics.”
It isn't all about winning or losing. It's about having fun, and do you have fun when you lose?