An epic clash across the continent of Europe
Every year since 1956 the nations of Europe (and now beyond) have come together to decide who has the best popular song. The contest has launched the careers of many global pop stars, most notably the unforgettable Swedish foursome, ABBA. No one of my age (I’m 47) can plausibly deny having owned an ABBA album or having watched at least some of the Eurovision Song Contest.
The tradition is for the winner to host the next event, like the Football World Cup. This year it’s in Copenhagen (Denmark) as the Danes, deservedly in my opinion, won a clear victory in 2013. Tension is mounting, but who will be this year’s winner? (We’ll find out on Saturday).
Can data science help predict the result of Eurovision?
Recently, there has been a spate of attempts to predict the outcome of Eurovision using data science, and Eurovision has been the subject of many scholarly research projects. Now, I’m not a data scientist, so I’m not going to stick my neck out too far here. But let’s look at how data scientists, and others, have approached this problem.
Understanding the rules
Before attempting to predict the outcome of such an event, we need to be – at least passingly – familiar with the rules. To cut a long story short, voting is split between a panel of judges from each entrant country and interactive voting from viewers. This wasn’t always the case, and judge-only results were much more predictable. This year the balance will be 50/50 between viewers and judges, making the outcome less predictable and more exciting.
Each panel awards points in descending order to its favourite acts, up to a maximum of 12 (douze in French, hence the famous saying “douze points”), but it is not allowed to vote for its own entry. After passing through some semi-finals, 32 countries are whittled down to 20. The host country automatically gets a place, as do the ‘big five’: Italy, Spain, France, the UK and Germany. This is because they pay most of the bill, so it’s kind of fair enough really.
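To make the points system concrete, here is a minimal Python sketch of how a scoreboard could be tallied. It assumes the familiar 12, 10, 8, 7, ... 1 scale and simply drops a country's own entry from its ranking. The rankings are made up for illustration, and the sketch does not attempt to model how the judges' and viewers' votes are merged.

```python
from collections import defaultdict

# Points a country hands out to its ten favourite songs, best first.
POINTS_SCALE = [12, 10, 8, 7, 6, 5, 4, 3, 2, 1]

def award_points(voter, ranking):
    """Map one country's ranked favourites to points, skipping its own entry."""
    eligible = [country for country in ranking if country != voter]
    return dict(zip(eligible, POINTS_SCALE))

def scoreboard(rankings):
    """Sum the points awarded by every voting country."""
    totals = defaultdict(int)
    for voter, ranking in rankings.items():
        for country, points in award_points(voter, ranking).items():
            totals[country] += points
    return dict(totals)

# Made-up rankings, for illustration only (not real votes).
rankings = {
    "Sweden": ["Denmark", "Sweden", "Norway"],  # own entry is ignored
    "Denmark": ["Sweden", "Norway"],
    "Norway": ["Sweden", "Denmark"],
}
totals = scoreboard(rankings)
```

With these invented rankings, Sweden's vote for itself is discarded before any points are handed out, so its 12 points go to Denmark.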
Two competing predictive models
Debate rages over whether it’s the quality of the song that counts or whether it’s all just about politics. Long-term analysis of voting patterns reveals various ‘blocs’ of countries that are minded to vote for each other and return the favour. A recent study by Dr. Baio and Dr. Blangiardo of University College London revealed some hidden patterns in the voting data (click here to view their paper). They showed that voting falls into four large groups: one combining the former Yugoslavia, Austria and Switzerland; one covering central and southern Europe; and finally a large group comprising Scandinavia, the United Kingdom, Ireland and the former Soviet bloc, which will cleave randomly into two blocs.
This leads us to our first theory: essentially, the best predictor of how a country will score is how it did in the past. Eurovision has changed a lot since its inception in 1956, so the data below only goes back to 2003. On past form, our winner would most likely be Azerbaijan.
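As a rough illustration of the "past form" idea, the sketch below ranks countries by their average finishing position. The country names and positions are placeholders I have invented, not real results, and averaging positions is just one simple way to summarise form.

```python
from statistics import mean

# Hypothetical finishing positions (1 = winner); not real Eurovision results.
past_positions = {
    "Country A": [3, 1, 5],
    "Country B": [10, 2, 8],
    "Country C": [4, 4, 2],
}

def rank_by_past_form(history):
    """Order countries by average finishing position, best (lowest) first."""
    averages = {country: mean(positions) for country, positions in history.items()}
    return sorted(averages, key=averages.get)

form_ranking = rank_by_past_form(past_positions)
```

A real analysis would probably weight recent years more heavily and allow for years in which a country failed to qualify.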
The counter-argument is less sure of this: past history and international loyalties have an effect, but surely the song is important too? Interactive viewer participation is supposed to help this happen, and it has been reasonably successful since its introduction. Actually, the link here is simple: countries that do well in the semi-finals also do well in the final. The trouble here is that Eurovision keeps the semi-final voting results secret until after the final.
Another potential factor is the type of entry. Since 2003, the contest has been won by a female singer seven times, a male singer twice, and a group and a duo once each.
Data mining big data
A way around the secrecy of the semi-final results could be to data mine social media interactions, such as tweets. Some data scientists have had considerable success using this approach combined with sentiment analysis. Alternatively, you could poll people and weight the responses appropriately to reduce bias. One poll, esctoday.com, uses a Facebook “likes” application to poll people. Of course such a sample is biased, but it may give some indication. As of 7th May, the poll was giving victory to Romania.
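One common way to reduce bias in a self-selected poll is to reweight respondents so that each group counts in proportion to its share of the real audience. The sketch below shows the idea in Python. The groups, songs and target shares are all invented for illustration, and nothing here reflects how esctoday.com actually runs its poll.

```python
from collections import defaultdict

def reweighted_shares(responses, target_share):
    """Return each choice's share of the vote after reweighting by group.

    responses: list of (group, choice) pairs from the poll.
    target_share: group -> assumed share of that group in the real audience.
    """
    group_counts = defaultdict(int)
    for group, _ in responses:
        group_counts[group] += 1

    total_responses = len(responses)
    weighted = defaultdict(float)
    for group, choice in responses:
        sample_share = group_counts[group] / total_responses
        # Over-represented groups get a weight below 1, under-represented above 1.
        weighted[choice] += target_share[group] / sample_share

    total_weight = sum(weighted.values())
    return {choice: weight / total_weight for choice, weight in weighted.items()}

# Invented data: group X is over-represented in the sample (3 of 4 responses).
responses = [("X", "Song A"), ("X", "Song A"), ("X", "Song B"), ("Y", "Song B")]
shares = reweighted_shares(responses, {"X": 0.5, "Y": 0.5})
```

In the raw counts the two songs tie at two votes each, but once group Y is given its assumed half of the audience, Song B moves ahead.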
Making your mind up
In 1981, winning UK group ‘Bucks Fizz’ sang ‘Making your mind up’, memorable for their rapid costume change (you had to see it). So, now is the time for me to announce my prediction. I’m taking a big jump here and theorising that poll results will influence half of the score and past form the other half. I’ve also theorised that a female singer will get a 6% boost, a male singer a 7% reduction and a group or duo a 10% reduction. The current political situation in Russia and Ukraine may also have an impact on voting for those nations.
That will give the top slot to Sweden, with Italy, Greece, Romania and Hungary as possible challengers. Am I right? What do you think?
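For anyone who wants to try this blend themselves, here is a minimal Python sketch of the half-poll, half-past-form score with the act-type adjustments. I am assuming both inputs have been normalised to a 0 to 1 scale and that the adjustment is applied as a multiplier; the entries and numbers are placeholders, not the data behind my prediction.

```python
# Act-type adjustments from the text: +6% female, -7% male, -10% group or duo.
ACT_MULTIPLIER = {"female": 1.06, "male": 0.93, "group": 0.90, "duo": 0.90}

def blended_score(poll, past_form, act_type):
    """Half poll, half past form, then scaled by act type.

    Both inputs are assumed to be normalised to the range 0..1.
    """
    return (0.5 * poll + 0.5 * past_form) * ACT_MULTIPLIER[act_type]

# Placeholder entries: (poll score, past-form score, act type). Not real data.
entries = {
    "Entry A": (0.80, 0.60, "female"),
    "Entry B": (0.90, 0.70, "group"),
    "Entry C": (0.50, 0.90, "male"),
}
scores = {name: blended_score(*values) for name, values in entries.items()}
prediction = max(scores, key=scores.get)
```

In this made-up example Entry B has the best raw blend, but the 10% group penalty drops it behind Entry A.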
Could you do a better job?
SAS UK and Ireland are looking for their top data scientist in a competition open to residents of the UK and Ireland. Click here for details.