Tuesday, April 22, 2014

On Regression to the Mean

I didn't play in the Barbu game at Lounge Day, but when I came back from eating at Mel's after my Titan game I was greeted by Andrew yelling at me about how regression to the mean doesn't exist. His argument: Kevin won the Barbu game, Andrew thinks Kevin was the worst player at the table, and therefore if regression to the mean were a real thing Kevin should have lost the game.

I struggle to handle this. Is Andrew being intentionally obtuse? (Something he was rather vocal about disliking earlier in the weekend.) Maybe Andrew has absolutely no understanding of probability and statistics? (It may explain that history degree...) Maybe there's some other explanation for why he's being so vocal about ignoring what I see as basic facts about randomness, but I'm not seeing it. I feel like he's probably misusing the term because 'stats geeks' who follow professional hockey used it when his preferred NHL team, the Toronto Maple Leafs, got off to a good start this latest season. They all predicted gloom and doom for the Leafs, and said gloom and doom came true with the Leafs missing the playoffs.

In the hopes that maybe he's just not understanding what is going on, I'm going to explain what the term actually means. At its core regression to the mean is a concept which basically says that when you have two independent events it doesn't matter what the outcome of the first one was; the second one is still expected to be 'average'. It's used when dealing with a large sample of outcomes, and is something experimenters need to keep in mind in order to deal with the innate randomness that may be going on under the surface.

With the Maple Leafs the 'stats geeks' were looking at the way the Leafs were actually playing at the start of the season and not at the actual outcomes of the games they played. They look at things like how Phil Kessel shot 18% in the first 14 games of the season but has shot more like 11% over his career. You can look at his hot start (9 goals in 14 games) and conclude that something happened to Phil Kessel this offseason that made him awesome. You could then assume he's going to put up about 53 goals over the 82-game season and the Leafs are going to keep winning games and win the Stanley Cup and PLAN THE PARADE. The 'stats geeks' on the other hand looked at it, saw that he was probably just getting lucky bounces since his shooting percentage was significantly higher than expected, and figured he was apt to cool off a little. I believe the same was true of their goaltenders as well, who were playing better than they had historically. As for puck possession, which the 'stats geeks' have found to be a better predictor of future success than goals or wins, Toronto was something like 29th out of 30 teams this season. The underlying stats showed they were really bad, and the most likely reason for their early success was simple dumb luck.

Now, it's important to point out that regression to the mean doesn't say we expect Kessel to miss a ton of shots so he ends up back at 11% shooting percentage again. He's not 'due' a cold streak to go with his hot streak. That's the gambler's fallacy. No, all regression to the mean is stating is that for the last 83% of the season we expect them to play around average. It just happens that average for the Leafs is worse than everyone but the Sabres and it wasn't likely that their hot start was going to salvage their season. All the hot start meant was that they were likely going to finish above their expected finishing spot. (Which should be a scary thought for Leafs fans, since they have the #8 pick in the draft with a team that's probably 'good enough' to have earned the #2 pick.)
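To put rough numbers on the Kessel example, here is a minimal sketch of the two projections. The inputs are my assumptions, not tracked stats: 9 goals at 18% implies about 50 shots, he keeps shooting at the same volume per game, the season is 82 games, and every remaining shot goes in at his career 11% rate.

```python
def project_season(goals_so_far, shots_so_far, games_played,
                   season_games, career_rate):
    """Project season totals, assuming remaining shots convert at career_rate."""
    # Assumption: shot volume per game stays what it was during the hot start.
    shots_per_game = shots_so_far / games_played
    remaining_shots = shots_per_game * (season_games - games_played)
    # The hot-start goals are kept; only the remaining shots are 'average'.
    projected_goals = goals_so_far + remaining_shots * career_rate
    projected_rate = projected_goals / (shots_so_far + remaining_shots)
    return projected_goals, projected_rate


# Assumed inputs: 50 shots is back-calculated from 9 goals at 18%.
goals, rate = project_season(9, 50, 14, 82, 0.11)
naive_goals = 9 / 14 * 82  # pretend the hot start is his new true level

print(f"naive extrapolation: {naive_goals:.1f} goals")
print(f"regression projection: {goals:.1f} goals at {rate:.1%} shooting")
```

Under those assumptions the projection comes out in the mid-30s for goals, with a full-season shooting percentage a bit above 11%. He never has to go cold to 'pay back' the hot start; the hot start just gets diluted by a lot of average shooting.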

So how should regression to the mean be applied to a money Barbu game in the Lounge? Realistically, not at all. There's a lot of randomness in any given hand of cards and you only play 32 deals total in a game of Barbu. Regression to the mean tells us nothing at all about how a given hand will play out except that we can expect it to be 'average'. But average in a game where one lucky card fall or opponent misplay takes you from scoring -252 to scoring +72? That's really not telling you a lot. What it does tell us is that should a 'bad player' get lucky in the first hand they do get to keep all those 'unjustified' points. Regression to the mean means they're apt to be average for the remaining 31 hands, but they get to keep the good result from the first hand so overall they rate to finish higher than expected.
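The 'lucky first hand' point can be checked with a quick simulation. This is only a sketch under made-up assumptions: hands are independent, and a weak player's score on each hand is drawn from a normal distribution with a mean of -5 and a standard deviation of 60. Real Barbu scores don't look like that, and the function names here are mine, but the shape of the result doesn't depend on the exact numbers.

```python
import random
from statistics import mean


def simulate_game(rng, hands=32, mu=-5.0, sigma=60.0):
    """One game for a weak player: independent per-hand scores (assumed normal)."""
    return [rng.gauss(mu, sigma) for _ in range(hands)]


def lucky_start_summary(trials=20000, seed=1, lucky_cutoff=60.0):
    """Look only at games whose first hand scored at least lucky_cutoff."""
    rng = random.Random(seed)
    rest_averages, totals = [], []
    for _ in range(trials):
        scores = simulate_game(rng)
        if scores[0] >= lucky_cutoff:
            rest_averages.append(mean(scores[1:]))  # hands 2 through 32
            totals.append(sum(scores))              # whole game, lucky hand included
    return mean(rest_averages), mean(totals)


if __name__ == "__main__":
    rest_avg, total_avg = lucky_start_summary()
    print(f"average score per hand after a lucky first hand: {rest_avg:.1f}")
    print(f"average final total in those games: {total_avg:.1f}")
    print(f"expected total with no information: {32 * -5.0:.1f}")
```

After a lucky first hand the remaining 31 hands still average out to about the player's usual -5 per hand, neither better nor worse, while the final total sits well above the -160 you'd otherwise expect because the lucky points are never given back.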

Never mind that I'm not convinced that Kevin is actually the worst player at the table, let alone the worst player by such a large skill margin that he rates to be super negative. As far as I am aware the people at the table have played Barbu something like once every year or two; I'm sure everyone was making some silly mistakes, which just serves to amplify the randomness. A bad player is less likely to get punished when the other players are screwing up. I don't think I've played a game of Barbu since Byung's bachelor party back in 2008 and while my brain is still telling me I'm awesome at the game I'm sure the reality is that I'm rusty and will make some silly mistakes rounding back into form.

We could apply regression to the mean if we set up a large enough sample size, though. Like, say, if we were to build out a money Barbu circuit. Tag me in with the 4 people who played the game on the weekend and let's play 50 games total with each player sitting out 10 games. Now we're talking each person playing 1280 hands. Now we're talking enough current experience to shake off a lot of the rust for most of the games. Someone getting lucky in any given game would impact their overall final result, but we expect the other 39 games to fall in line with the expected result and someone who is actually the worst player at the table really expects to finish negative and near the bottom of the pack.
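To see why 1280 hands is a different animal than 32, here is a rough calculation of how often a genuinely bad player finishes positive. Same caveats as any back-of-the-envelope model: I'm assuming independent hands, a hypothetical skill deficit of -5 points per hand, a hypothetical standard deviation of 60 points per hand, and a normal approximation for the total.

```python
from math import sqrt
from statistics import NormalDist


def chance_of_finishing_positive(hands, mean_per_hand, sd_per_hand):
    """P(total > 0) when the total of independent hands is treated as normal."""
    # The mean grows with the number of hands, the spread only with its square root.
    total = NormalDist(mu=hands * mean_per_hand,
                       sigma=sd_per_hand * sqrt(hands))
    return 1.0 - total.cdf(0.0)


# Hypothetical weak player: -5 points per hand on average, 60 points of noise.
one_game = chance_of_finishing_positive(32, -5.0, 60.0)
circuit = chance_of_finishing_positive(1280, -5.0, 60.0)

print(f"chance of a positive score over one game (32 hands): {one_game:.1%}")
print(f"chance of a positive score over the circuit (1280 hands): {circuit:.2%}")
```

With these made-up numbers the bad player comes out ahead in roughly a third of single games but in well under one percent of full circuits. One game is mostly noise; forty games is mostly skill.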

I think if we actually did such a circuit I would finish at the top. I think Sky would also be positive. I think the other three would all be negative. I'm actually pretty sure that's how it would shake out. Because regression to the mean should kick in over that large a span of games and I am the best.
