This week kicks off the World Cup, and while the U.S. didn’t qualify for the most-watched event in the world, that’s not stopping U.S. groups from making predictions based on each and every data set known to sport. Back in 2014, Bloomberg and FiveThirtyEight gave it their best go at analyzing the stats to estimate each team’s probability of making it out of its group and on to the next round.
Unfortunately for proponents of algorithms and quants, the teams with the highest probabilities per FiveThirtyEight didn’t reign supreme. FiveThirtyEight gave Brazil a 45% chance of winning last time around and only gave Germany a 10% chance of winning, and in the end, Brazil didn’t even make the final. Oops.
This time around, Bloomberg is going off of UBS predictions, which use a team of 18 analysts to run a computer simulation of the tournament 10,000 times. Here’s the data on how they run that simulation. UBS says there’s a 60% chance Germany, Brazil, or Spain wins. That’s a pretty high number! As for FiveThirtyEight, they run 20,000 simulations based on the SPI (Soccer Power Index). Here’s how the SPI is determined:
FiveThirtyEight gives Germany, Spain, and Brazil the highest percentages of winning the whole thing. Here’s what the breakdown looks like for three of the groups:
Algorithms <> Absolute Certainty
The quants in us love this sort of simulation, the pretty charts and advanced statistics on how far teams can get in the tournament. But there’s one big problem with all of this. Probabilities explain a range of possible outcomes, which our human brains don’t do such a good job of understanding. Can we really fathom the difference between a 56% chance of getting out of the group stage and 64% chance? What’s more, is it even possible to get that granular on the probability of these teams?
After all, it’s all based on past performance. These statistics might be the best model we have for what might happen in the future, but that doesn’t actually mean it’s going to happen. It’s not like predicting the percentage probability of rain in the very near future, based on the current air pressure and models, or predicting elections based on the number of registered Democrats, Republicans, and Independents and exit polls where you can glean a bit of inside information. There’s no statistic showing how each player will perform in each match. That’s why sports are exciting: you don’t know who is going to win.
Case in point: the news out of Spain that they fired their coach coming into the tournament.
Sure, Spain is given a 17% chance, but can the models adjust to the fact they no longer have the coach that led them to the tournament? Also, experts have said that Russia has a better chance of advancing because they are playing in their home country. Is that because morale will be up? What sort of statistical proof is there? That same reasoning was used to indicate Brazil would win last time because they were hosts, and it didn’t happen. It’s like each and every financial talking head that talks about the next crash that’s going to take place. When they’re wrong, there really aren’t any consequences. But when they’re right, they get a book deal and a TV show.
All of this is to say, these numbers are just guesses. They’re very, very educated guesses, using all the latest in modeling and systematizing human analysis. But still just guesses. And you can’t even really know when they are correct. You can see above they are done over tens of thousands of model runs, but then tested on just a single run. It would be a quant’s nightmare. You can run your model on reams of data, but then get just one trade to see if it works. These models won’t be proven effective or not for dozens of World Cups, where hundreds of matches are filtered through.
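To make the "many model runs, one real run" point concrete, here is a minimal sketch of a Monte Carlo tournament simulation. Everything in it is an assumption for illustration: the ratings are invented (they are not SPI values or UBS inputs), the eight-team knockout bracket is a simplification of the real format, and the rule that a team wins a match with probability equal to its share of the two ratings is a stand-in for whatever match model the forecasters actually use.

```python
import random

# Hypothetical strength ratings, invented for illustration only.
# These are NOT SPI values or UBS inputs.
RATINGS = {"Germany": 2.0, "Brazil": 2.0, "Spain": 1.9, "France": 1.7,
           "Argentina": 1.6, "Belgium": 1.6, "England": 1.5, "Russia": 1.0}

def play_match(a, b, rng):
    """Return the winner; team a wins with probability rating_a / (rating_a + rating_b)."""
    p_a = RATINGS[a] / (RATINGS[a] + RATINGS[b])
    return a if rng.random() < p_a else b

def play_tournament(rng):
    """Simulate one single-elimination bracket over a random draw of the eight teams."""
    teams = list(RATINGS)
    rng.shuffle(teams)
    while len(teams) > 1:
        # Pair off neighbours; the winners form the next round.
        teams = [play_match(teams[i], teams[i + 1], rng)
                 for i in range(0, len(teams), 2)]
    return teams[0]

def estimate_win_probabilities(n_runs, seed=0):
    """Estimate each team's title probability as its share of simulated wins."""
    rng = random.Random(seed)
    wins = dict.fromkeys(RATINGS, 0)
    for _ in range(n_runs):
        wins[play_tournament(rng)] += 1
    return {team: count / n_runs for team, count in wins.items()}

if __name__ == "__main__":
    probs = estimate_win_probabilities(10_000)
    for team, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        print(f"{team:10s} {p:.1%}")
    # Reality supplies exactly one draw from this distribution.
    print("One 'real' tournament:", play_tournament(random.Random(2018)))
```

The 10,000 runs pin down what the model believes quite precisely, but the last line is all the real world ever gives back: a single winner, which cannot tell you whether the percentages above it were right.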
How We Perceive Probability
Still, it’s fun to pick a team. To pick a team because you saw the statistics and want to side with the winners. Or to side with the underdog that has the best statistical chance of winning. In this case, that would be one of Belgium, England, Argentina, or France (all with roughly a 7-9% chance of winning according to FiveThirtyEight). But humans have a really hard time understanding what those percentages actually mean. If our whole world starts to rely more and more on data-driven probabilities, we humans need to understand them a lot better. Our unscientific graph of how humans perceive probabilities versus the actual probability of something happening is as follows:
And of course, if it’s your own team or you have some preconceived cognitive bias working up there in between your ears, it might even look more like this:
Don’t get us wrong, cheering on a team can be fun, but know that the statistics don’t really have a way of letting you know what’s going to happen, because they’re based on human behavior itself. And it can be easy to think, in your head, that a 70% probability based on tens of thousands of simulations really means a 99% chance of it happening, but it’s best to ask yourself what that really means.
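As a rough illustration of what a 70% forecast does and does not promise, the sketch below counts how often a 70% favorite fails to come through. The function name and the trial counts are our own illustrative choices, and the events are assumed to be independent, which real matches may not be.

```python
import random

def miss_rate(p_forecast, n_trials, seed=0):
    """Fraction of simulated events that do NOT happen despite a p_forecast chance."""
    rng = random.Random(seed)
    misses = sum(rng.random() >= p_forecast for _ in range(n_trials))
    return misses / n_trials

def chance_of_at_least_one_miss(p_forecast, n_events):
    """Probability that at least one of n independent favorites loses."""
    return 1 - p_forecast ** n_events

if __name__ == "__main__":
    # A 70% favorite should fail roughly 3 times in 10 over many trials.
    print(f"Simulated miss rate: {miss_rate(0.70, 100_000):.1%}")
    # With three independent 70% favorites, at least one upset is more likely than not.
    print(f"At least one upset in three: {chance_of_at_least_one_miss(0.70, 3):.1%}")
```

Under these assumptions, a 70% favorite still loses about three times in ten, and with just three such favorites the odds of at least one upset are close to two in three.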