Could Australia win, really? The science of predicting the World Cup champion
- Written by Adrian Barnett, Professor of Public Health; Vice president of the Statistical Society of Australia, Queensland University of Technology
This article is the latest in our World Cup series exploring the politics, economics, science and social issues behind the world’s most popular sports event.
The 2018 World Cup kicks off today (Australian time). Australia is one of 32 teams hoping to be victorious in the beautiful game.
But what are the chances of Australia winning the World Cup? And how difficult is it to predict the winner?
One prediction model created by statisticians from Austria gives Australia a tiny 0.0022 probability of lifting the trophy – around 1 chance in 450. Given such poor odds, is it worth staying up late to watch the Socceroos play?
The same model predicts that Brazil have the highest probability of winning, at 0.163 (around 1 in 6). So Brazil are 74 times more likely to win the World Cup than Australia, and presumably Brazilians will be at least 74 times more upset than Australians if they don’t.
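The arithmetic behind these figures is easy to check. Here is a minimal sketch in Python, using only the two probabilities quoted above (the helper name `one_in_n` is ours, purely for illustration):

```python
def one_in_n(probability):
    """Express a probability as '1 chance in N', with N rounded to a whole number."""
    return round(1 / probability)

# Probabilities quoted from the bookmaker-based model
p_australia = 0.0022
p_brazil = 0.163

print(one_in_n(p_australia))          # roughly 1 in 450
print(one_in_n(p_brazil))             # roughly 1 in 6
print(round(p_brazil / p_australia))  # how many times more likely Brazil are
```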
Crunch the bookmakers’ numbers
How did this model come up with the probabilities for each team? It combined the betting odds from 26 bookmakers to estimate the overall ability of each team.
The bookmakers employ football experts to create their odds, so the statistical model taps into this expertise. The bookmakers’ odds also shorten if lots of money is placed on a team, so the model also taps into the wisdom of the crowd.
For the 2014 World Cup one of the models using the bookmakers’ odds correctly picked three of the four semifinalists (Brazil, Germany and Argentina), only missing the Netherlands. (It also predicted that Brazil was the “clear favourite”, but Germany won the tournament after thrashing Brazil 7-1 in the semifinal.)
There are alternative prediction models that use “Elo” ratings for each team. These ratings use the results of recent matches, which is sensible, but they can’t account for off-field events the bookmakers know about, such as injuries to key players.
Australia do even worse for the predictions based on rating, with a probability of around 1 in 2,500. So the bookmakers are more optimistic about Australia’s chances than would be warranted purely on the basis of form.
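To give a feel for how a rating becomes a prediction, here is a rough sketch of the standard Elo "expected score" formula. The ratings below are invented for illustration, and the rating-based model discussed here may convert ratings into match outcomes differently. Note too that an expected score counts a draw as half a win, so it is not quite the same thing as a win probability.

```python
def elo_expected_score(rating_a, rating_b, scale=400):
    """Standard Elo expected score for team A against team B.

    A value of 0.5 means the teams are evenly matched. The scale of 400
    is the conventional Elo choice; it is an assumption here, not
    something taken from the models in the article.
    """
    return 1 / (1 + 10 ** ((rating_b - rating_a) / scale))

# Hypothetical ratings, for illustration only
strong_team = 2000
weak_team = 1800

print(round(elo_expected_score(strong_team, weak_team), 2))  # about 0.76
print(round(elo_expected_score(weak_team, strong_team), 2))  # about 0.24
```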
Don’t forget that while the bookmakers’ odds are useful, they are not the probabilities the bookmakers actually believe each country has of winning. Rather, they are set to earn money for the bookmakers, based on the difference between their odds and punters’ perceived probabilities.
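One way to see the bookmakers' margin is to turn their odds into implied probabilities and add them up: the total comes to more than 1. The sketch below uses made-up decimal odds for a three-team market and removes the margin by simple rescaling. That rescaling is an assumption chosen for simplicity, and the statisticians' model may well adjust the odds in a more careful way.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to probabilities that sum to 1.

    Returns the rescaled probabilities and the bookmaker's 'overround'
    (the amount by which the raw implied probabilities exceed 1).
    """
    raw = {team: 1 / odds for team, odds in decimal_odds.items()}
    overround = sum(raw.values())
    fair = {team: p / overround for team, p in raw.items()}
    return fair, overround

# Hypothetical odds for an imaginary three-team competition
odds = {"Team A": 2.0, "Team B": 3.5, "Team C": 4.0}

fair, overround = implied_probabilities(odds)
print(round(overround, 3))  # greater than 1: the excess is the bookmaker's margin
for team, p in fair.items():
    print(team, round(p, 3))
```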
Simulating the tournament
To win the World Cup, teams need to finish in the top two in their four-team group, and then win four consecutive knockout games. The prediction models arrive at the overall winner by simulating the winner of each game based on the ability of the two teams.
The simulations use randomness. So if a team has a 0.7 (70%) probability of winning a game then it will win in around 7 in every 10 simulations. As every game is random, the simulations will sometimes create an unexpected tournament with two unfancied teams in the final.
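A single simulated game amounts to a weighted coin toss. The following sketch (our own illustration, not the code used for the article) plays the same game many times for a team with a 0.7 probability of winning and counts how often it wins:

```python
import random

def simulate_wins(p_win, n_simulations, seed=None):
    """Count how many of n_simulations a team with win probability p_win wins."""
    rng = random.Random(seed)  # a seed makes the run repeatable
    return sum(rng.random() < p_win for _ in range(n_simulations))

n = 10_000
wins = simulate_wins(0.7, n, seed=1)
print(wins / n)  # close to 0.7, but rarely exactly 0.7
```

The small gap between the simulated proportion and 0.7 is the randomness at work, and it shrinks as the number of simulations grows.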
We can estimate the probability of Australia winning the trophy by counting the proportion of tournament simulations in which they won. We ran 10,000 simulations using the rating-based model (thanks to Claus Ekstrøm for coding the World Cup simulations and data).
In those 10,000 simulations Australia won the World Cup just three times, and each time they upset the odds by beating the following teams:
- Nigeria (last 16), Spain (quarters), England (semi), Peru (final)
- Croatia (last 16), Portugal (quarters), Mexico (semi), Argentina (final)
- Croatia (last 16), Portugal (quarters), England (semi), Poland (final).
These results show that for Australia to win they need to pull off some remarkable wins and their path to the final needs to be cleared of the biggest obstacles. So Australian fans should be cheering for underdogs everywhere, especially in their half of the draw (groups A to D).
Simulating the entire tournament accounts for the fact that some teams have an easier group draw, and might also avoid big teams in the knockout rounds.
For example, if we swap Australia (group C) and Saudi Arabia (group A) then this puts Australia in what looks like the weakest group, and their probability of winning increases from around 1 in 2,500 to around 1 in 1,400. That is a relatively big increase in their chances, but still a huge long shot.
Football is wonderfully unpredictable
All the prediction models generate probabilities by counting the number of times a team won and dividing by the total number of simulations.
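The counting idea can be sketched with a toy knockout tournament. Everything below is an assumption made for illustration: the eight teams and their ratings are invented, the bracket is fixed, there is no group stage, and each game is decided by an Elo-style win probability with no draws. It is not the simulation used for the figures in this article.

```python
import random
from collections import Counter

# Hypothetical teams and ratings, invented purely for illustration.
RATINGS = {
    "Team A": 2100, "Team B": 2000, "Team C": 1950, "Team D": 1900,
    "Team E": 1850, "Team F": 1800, "Team G": 1750, "Team H": 1700,
}

def win_probability(rating_a, rating_b):
    """Assumed Elo-style chance that A beats B in a knockout game (no draws)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def play_knockout(teams, rng):
    """Play single-elimination rounds until one team remains."""
    while len(teams) > 1:
        next_round = []
        # Pair neighbours in the bracket: 1st v 2nd, 3rd v 4th, and so on.
        for a, b in zip(teams[0::2], teams[1::2]):
            p = win_probability(RATINGS[a], RATINGS[b])
            next_round.append(a if rng.random() < p else b)
        teams = next_round
    return teams[0]

def estimate_title_probabilities(n_simulations, seed=None):
    """Estimate each team's chance as (tournaments won) / (tournaments simulated)."""
    rng = random.Random(seed)
    bracket = list(RATINGS)
    titles = Counter(play_knockout(bracket, rng) for _ in range(n_simulations))
    return {team: titles[team] / n_simulations for team in RATINGS}

probabilities = estimate_title_probabilities(10_000, seed=2018)
for team, p in sorted(probabilities.items(), key=lambda item: -item[1]):
    print(f"{team}: {p:.3f}")
```

Changing the seed changes the estimates slightly, which is a reminder that the estimated probabilities for rank outsiders rest on only a handful of simulated wins.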
These probabilities consider thousands of possibilities, whereas the real World Cup will be run only once. Even if we use the bookmakers’ model to bet on the most likely outcome of Brazil winning, we would still be wrong five out of six times.
We are more likely to be wrong than right whatever team we pick, simply because there can only be one winner and there are many good teams. This unpredictability is what makes the World Cup so thrilling, although we still enjoy trying to predict the outcome.
None of the prediction models can account for random acts such as food poisoning, red and yellow cards or the Hand of God.
In the 2002 World Cup the semifinals featured two big teams, Brazil and Germany, but also had two outsiders, Turkey and South Korea. Turkey didn’t even qualify for this year’s World Cup and South Korea have an estimated 1 in 500 chance of winning.
Most prediction models we have seen largely state the obvious, as the most likely four semifinalists are:
So in many ways the models concur with what most people with a good football knowledge would predict.
The models and experts also agree that Australia have almost no chance of winning. But their chances are still better than those of Italy, the Netherlands and the United States, none of whom made it to the tournament at all.
For this reason alone, it’s worth staying up late.