Bookies have long done a brisk trade in predicting the fate of our politicians in the general election. According to Ladbrokes, gamblers are set to spend a staggering £100m betting on this year’s result.
The May 7 vote is expected to be the hardest election to predict in recent memory. For the first time ever, it’s conceivable that the combined vote share of the two main parties might be under 60 percent.
In 2012, Nate Silver, author of the exceptional data blog FiveThirtyEight, famously used the same analytic models he applied to sports betting to predict the US presidential election result. Crucially, each contest was given its own probability of success, and adding those probabilities up across the whole election returned a remarkably accurate result.
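To make the idea of adding up per-contest probabilities concrete, here is a minimal sketch in Python. It is not Silver's model, which is not described here. The win probabilities are invented for illustration, the function name is hypothetical, and the contests are assumed to be independent and equally weighted, which real elections are not.

```python
def seat_count_distribution(win_probs):
    """Return a list where entry k is the probability of winning exactly k
    contests, assuming each contest is independent of the others."""
    dist = [1.0]  # before any contest: probability 1 of having won 0
    for p in win_probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1 - p)   # this contest is lost
            new[k + 1] += mass * p     # this contest is won
        dist = new
    return dist


# Hypothetical win probabilities for one party in five contests.
win_probs = [0.9, 0.6, 0.5, 0.2, 0.1]

dist = seat_count_distribution(win_probs)
expected_wins = sum(win_probs)             # expected number of contests won
majority = len(win_probs) // 2 + 1         # 3 of 5 in this toy example
prob_majority = sum(dist[majority:])       # chance of winning at least that many

print(f"Expected wins: {expected_wins:.2f}")
print(f"Chance of a majority: {prob_majority:.1%}")
```

The sum of the individual probabilities gives the expected number of wins, while the full distribution shows how likely any particular total is. Dropping the independence assumption, for example to reflect a national swing that moves many contests together, would need a different approach such as simulation.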
However, the UK political scene has become a little more complex. It is no longer a simple red vs. blue contest: parties historically considered to be on the fringes have taken the fight to the incumbents. The Liberal Democrats (although in fairness an incumbent themselves), the Scottish National Party (SNP), the United Kingdom Independence Party (UKIP), the Greens and Plaid Cymru (Party of Wales) have divided voters and complicated our forecasting models.
The UK system dictates that the party with the most seats wins, but each of the 650 seats has to be fought for one by one. The need to come first in each location, or constituency, adds another layer of intricacy to our calculation.
Historically, we’ve been able to predict seat outcomes by factoring in the change in the opinion polls since the last election. For example, if one party won the seat with 40 percent of the votes, and its opinion poll rating has since dropped by 10 percent (that is, by a tenth of its previous level), you could reduce that 40 percent by a tenth, giving 36 percent, meaning the party may lose the seat. However, now that we’re looking at a six-party race, split across 650 seats, a more intricate model is required. In a bid to show what Parliament would look like based on the latest polls, The Guardian has produced an interesting projection methodology.
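The traditional calculation described above can be sketched in a few lines of Python. This is only an illustration of the simple proportional adjustment, not The Guardian's methodology. The party names, constituency shares and poll ratings are all made up, and the function name is hypothetical.

```python
def project_constituency(last_result, polls_then, polls_now):
    """Scale each party's share in one seat by the relative change in its
    national poll rating since the last election."""
    projected = {}
    for party, share in last_result.items():
        relative_change = polls_now[party] / polls_then[party]
        projected[party] = share * relative_change
    return projected


# Hypothetical figures: vote share (percent) in one seat at the last election.
last_result = {"A": 40.0, "B": 35.0, "Other": 25.0}

# Hypothetical national poll ratings: A is down a tenth, B is up a tenth.
polls_then = {"A": 30.0, "B": 30.0, "Other": 40.0}
polls_now = {"A": 27.0, "B": 33.0, "Other": 40.0}

projected = project_constituency(last_result, polls_then, polls_now)
winner = max(projected, key=projected.get)  # first past the post: largest share

for party, share in projected.items():
    print(f"{party}: {share:.1f}")
print(f"Projected winner: {winner}")
```

In this toy seat, party A falls from 40 to about 36 while party B rises above it, so the seat changes hands. Note that the adjusted shares no longer sum to 100, one of several reasons this simple method struggles once more than two parties are in contention.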
Whilst the model has become more complex, the good news is that there are a number of data sources that we can add into our calculations. Opinion polls are the most traditional source of up-to-date information on which way the public is leaning. However, polls can occasionally mislead us, as they did in 1992, when the final polls predicted a Labour lead of 1.4 percentage points but the Conservatives won by 7.6 points. Betting markets are often touted as a reliable source, with Professor Leighton Vaughan Williams, director of the Political Forecasting Unit at Nottingham Business School, claiming they are more accurate than the polls.
The emergence of social media data, and the ability to mine it, makes it possible to track party and voter sentiment. For the first time in the UK, apps are available which enable the general public to follow the trends and gain insight into the mood around the main parties. However, social media users tend to be a fairly biased sample, so the data can mislead. When it came to the 2011 referendum on the Alternative Vote (AV) system, social media suggested a big win for AV, whereas in fact the status quo won out. To make meaningful predictions about the result in constituencies, or indeed nationally, you need the capability to analyse a much wider pool of data.
Search engine data produced remarkably accurate predictions for the referendum on Scottish independence. However, given certain party leaders’ penchant for headline-grabbing statements, search volumes could be more of an indicator of celebrity than of potential success.
In the next blog, I’ll look at how SAS can access open source map data to begin to translate sentiment into seats. But in the meantime, if you’re keen to find out more about data science and the government, check out our research with Civil Service World on Big Data in the public sector.