In his first game as an NFL head coach, Josh McDaniels and his Denver Broncos faced a 7-6 deficit with 38 seconds remaining in the game. The Cincinnati Bengals had just scored a touchdown and had the Broncos pinned deep at their own 13-yard line. On 2nd down, Kyle Orton’s pass got tipped right into the hands of Denver’s Brandon Stokley, who rumbled 87 yards for the winning score. The play has been immortalized as The Immaculate Deflection.
The 2009 Denver Broncos would win their first 6 games. This streak included games against New England, Dallas and San Diego. This impressive start prompted ESPN’s Tom Jackson to declare Coach McDaniels “one of the great ones”.
The 2009 Denver Broncos went 2-8 over the rest of the season to finish 8-8. They missed the playoffs when they lost to the Kansas City Chiefs, the 29th team in our rankings that year, in the season’s final game.
The 2010 Broncos started 3-9. Coach McDaniels was fired.
If you remember one thing about sports analytics, it should be this: never make a judgement after a small number of events.
How an average coach fares in an average league
Sports is inherently random. As The Immaculate Deflection shows, a team can win on a lucky bounce when they’ve only mustered two field goals the entire game. It’s not all that different from flipping a coin. To show the fallacy of looking at a small sample size, let’s look at how an average coach performs in an average league. In a setup not too different from the NFL, this coach has a 50% chance of winning each game. Generated with a random number generator, these are the results of Coach Average’s first 50 games.
Just for the record, I was absolutely committed to generating this random sequence only once. There was no effort to find a sequence that had 6 wins in a row. Eight lines of Python code gave that result. Coach Average ripped off a sequence of 9 straight wins starting in game 19. Tom Jackson would be starting his petition for the Hall of Fame.
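The original eight lines of Python are not reproduced here, so the following is only a minimal sketch of how such a sequence could be generated, assuming every game is an independent fair coin flip. The function names and the seed are hypothetical, and a different seed will not reproduce the exact sequence described above.

```python
import random

def simulate_games(n_games, win_prob=0.5, seed=None):
    """Return a list of booleans, True for a win, one entry per game."""
    rng = random.Random(seed)
    return [rng.random() < win_prob for _ in range(n_games)]

def longest_win_streak(results):
    """Length of the longest run of consecutive wins."""
    best = current = 0
    for won in results:
        current = current + 1 if won else 0
        best = max(best, current)
    return best

if __name__ == "__main__":
    games = simulate_games(50, seed=2009)  # the seed is arbitrary
    print("".join("W" if won else "L" for won in games))
    print("wins:", sum(games), "longest streak:", longest_win_streak(games))
```

Running this with a handful of different seeds is a quick way to see how often a perfectly average coach strings together five or more wins in only 50 games.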
Moreover, Coach Average wins 31 of his first 50 games, 6 more than the expected 25 wins. In the modern era of the NFL, only Bill Belichick and Tony Dungy have better career winning percentages than the 62% of Coach Average. He looks extraordinary even over a sample of 50 games.
I actually generated a sequence of 200 coin flips, not knowing how many would fit in the image above. Coach Average won 106 of those 200 games for a 53% winning percentage. With a bigger number of coin flips, the winning percentage gets closer to the expected 50%. That’s the consequence of the Law of Large Numbers, the mathematical reason you should only draw conclusions after a large number of events.
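To put numbers on how fast the noise shrinks, here is a small sketch that assumes the same coin-flip model: independent games, each with a 50% chance of a win. Under that assumption the win total is binomial, so its spread can be computed directly. The function names are hypothetical.

```python
import math

def win_pct_std(n_games, win_prob=0.5):
    """Standard deviation of the winning percentage over n_games independent games."""
    return math.sqrt(win_prob * (1 - win_prob) / n_games)

def z_score(wins, n_games, win_prob=0.5):
    """Number of standard deviations between the observed and expected win totals."""
    expected = n_games * win_prob
    std_wins = math.sqrt(n_games * win_prob * (1 - win_prob))
    return (wins - expected) / std_wins

if __name__ == "__main__":
    for wins, n in ((31, 50), (106, 200)):
        print(f"{n} games: std of win pct = {win_pct_std(n):.3f}, "
              f"z = {z_score(wins, n):.2f}")
```

In both samples Coach Average sits 6 wins above expectation, but that is about 1.7 standard deviations over 50 games and only about 0.85 over 200. Quadrupling the number of games cuts the spread of the winning percentage in half, from roughly 7 percentage points to roughly 3.5.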
What the famous “Hot Hand” paper says about all of this
In 1985, Amos Tversky, a Stanford psychology professor, and his colleagues published a paper called The Hot Hand in Basketball: On the Misperception of Random Sequences. Replace “The Hot Hand in Basketball” with “The Hot Coach in Football”, and this paper has everything to do with our discussion. There are two key results from their study.
First, they looked at sequences of made and missed baskets for two NBA teams and asked whether they looked any different from the random flipping of a coin. Did a made basket imply the next basket was more likely to go in? No. Were there more streaks of made baskets than one would expect from random? No. They even broke down the data into partitions of 4 consecutive shots to look at whether streakiness happened in short bursts. Did partitions with 3 or 4 made shots happen more than in the coin flipping model? Still no. It didn’t matter if it was field goals for the 1980-81 Philadelphia Sixers or free throws for the 1980-82 Boston Celtics. It just looked like a random sequence, much like our coin flipping model for Coach Average.
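Two of the checks described above can be sketched in a few lines. This is a simplified illustration run on simulated coin flips, not the paper's actual statistical procedure or data, and the function names are hypothetical.

```python
import random

def hit_rate_after_hit(shots):
    """Fraction of made shots among those that immediately follow a made shot."""
    followers = [cur for prev, cur in zip(shots, shots[1:]) if prev]
    return sum(followers) / len(followers) if followers else float("nan")

def count_hot_blocks(shots, block=4, threshold=3):
    """Count non-overlapping blocks of `block` shots with at least `threshold` hits.

    Leftover shots that do not fill a complete block are ignored.
    """
    usable = len(shots) - len(shots) % block
    return sum(
        sum(shots[i:i + block]) >= threshold
        for i in range(0, usable, block)
    )

if __name__ == "__main__":
    rng = random.Random(1985)  # the seed is arbitrary
    shots = [rng.random() < 0.5 for _ in range(10_000)]
    n_blocks = len(shots) // 4
    print("hit rate after a hit:", round(hit_rate_after_hit(shots), 3))
    print("share of blocks with 3+ hits:",
          round(count_hot_blocks(shots) / n_blocks, 3))
```

For a fair coin, the hit rate after a hit should hover near 50%, and 5 of the 16 equally likely blocks of 4 shots contain 3 or 4 hits, or about 31%. A genuinely streaky shooter would push both numbers above those baselines.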
Second, they did a survey in which they gave people a sequence of X’s and O’s to represent made and missed baskets. This experiment isolates the random sequence from any sports related phenomena. The participant can no longer say they saw a streak of hits, or made baskets, because they are watching Andrew Toney or Kobe Bryant.
In a truly random sequence, a hit follows a hit with 50% likelihood. For these sequences, only 32% of participants called this random shooting while 62% called this streak shooting. People tend to see streaks in randomness, just like you probably saw streakiness in the wins and losses of Coach Average. When the likelihood of getting a hit after a hit decreased below 50%, the participants in the study saw sequences that alternated between hits and misses more than in a random sequence. Only with this tendency for an alternating sequence did more people think the sequence was random than streaky. People falsely expect to see alternating hits and misses in a random sequence.
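To get a feel for what these sequences look like, here is a minimal sketch that generates X's and O's with an adjustable chance of repeating the previous outcome. It is an illustration only, not the procedure used to build the survey sequences, and the function names, sequence length and seed are hypothetical.

```python
import random

def streaky_sequence(n, repeat_prob, seed=None):
    """Generate n >= 1 outcomes (True = hit) where each outcome repeats the
    previous one with probability repeat_prob. The first outcome is a fair coin."""
    rng = random.Random(seed)
    seq = [rng.random() < 0.5]
    for _ in range(n - 1):
        same = rng.random() < repeat_prob
        seq.append(seq[-1] if same else not seq[-1])
    return seq

def to_symbols(seq):
    """Render hits as X and misses as O."""
    return "".join("X" if hit else "O" for hit in seq)

if __name__ == "__main__":
    for repeat_prob in (0.7, 0.5, 0.3):
        print(repeat_prob, to_symbols(streaky_sequence(20, repeat_prob, seed=3)))
```

A repeat probability of 0.5 is an ordinary coin flip. Values below 0.5 produce the alternating sequences that many people mistake for randomness, and values above 0.5 produce genuinely streaky ones.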
Even without all the biases inherent in sports, people see order in randomness.
What this means for sports fans
Luck and random chance play a huge role in the short term. Sports fans should not make rash judgements over a small number of events. Even when the team starts a season 2-10 or 4-10, Boston Red Sox fans should remember that they have a huge payroll and analytics on their side. When Albert Pujols hits .217 in his first 92 at-bats, Los Angeles Angels fans should remember he’s one of the greatest hitters of his generation. Just wait a little bit.
However, luck tends to even out in the long run. Statistics over a huge number of events are very meaningful. In Tim Duncan’s first 15 years in the NBA, the San Antonio Spurs won 70.2% of their 1,182 regular season games. As of 2012, this is the best 15 year run by any NBA team. Moreover, in the 15 years prior to 2012, no professional team in the four major American sports had a better winning percentage. These types of statistics over a huge sample of games support the claim that Tim Duncan is the best player of his generation.
Thanks for reading.