You’re interested in making sports predictions. The more accurate, the better.
To do this, we typically take a statistic and look at its correlation from season to season. This is especially relevant in preseason football as we look to predict the upcoming season.
If there is a high correlation from season to season, we say the statistic is predictive and include it in our models. If the statistic has a weak correlation, then it’s not predictive. Analytics 101.
However, this exercise can be confusing. For example, consider 3 point shooting percentage in the NBA. There’s a weak correlation from season to season, as a player’s 3 point shooting percentage from last season explains 14.5% of the variance during the current season.
I think there’s a problem with this analysis; more on that later.
In addition, the analysis doesn’t make sense. There is clear skill in shooting a basketball. Who would pick Russell Westbrook over Steph Curry in a three point shooting contest? No one.
In discussing the lack of correlation in 3 point shooting percentage, data scientists like myself usually say something like “randomness plays a big role in 3 point shooting percentage.” This is true, but not the entire picture.
In this article, I’ll discuss predictability versus skill and how they are related but different concepts.
Skill in shooting 3 pointers
Why is there skill in three point shooting?
To get some intuition, let’s take the following approach: assume 3 point shooting is random, and compare the actual data on players with this assumption.
I looked at NBA data from 2014 through 2020 before the lockdown. During this period, the average 3 point shooting percentage was 35.6%. If 3 point shooting is random, each player makes shots at this rate.
Based on this random assumption, a player’s 3 point shooting percentage won’t land exactly on the mean value of 35.6%. Some players will end up above this value, some below. But as a player takes more shots, his shooting percentage approaches the mean value of 35.6%.
When you look at many players based on this random assumption, you get a distribution of 3 point shooting percentages spread around the mean. If each player takes 2000 shots, the width of this distribution is about 1%. This implies that two out of three players will have a percentage between 34.6% and 36.6%.
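The 1% width can be checked with the standard formula for the spread of a proportion. This is a minimal sketch, assuming every shot is an independent event that goes in at the league average rate; the function name is made up for illustration.

```python
import math

LEAGUE_AVG = 0.356  # average 3 point percentage quoted in the article


def binomial_pct_sd(p: float, attempts: int) -> float:
    """Standard deviation of a shooting percentage when each of
    `attempts` shots independently goes in with probability p."""
    return math.sqrt(p * (1 - p) / attempts)


sd = binomial_pct_sd(LEAGUE_AVG, 2000)
print(f"width of the distribution: {sd:.4f}")
print(f"one standard deviation band: {LEAGUE_AVG - sd:.3f} to {LEAGUE_AVG + sd:.3f}")
```

The width comes out just above 0.01, which matches the rounded 1% used above.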
Let’s compare this random assumption with actual players like Steph Curry. Steph has made 43.2% of his 3 point shots over the past six seasons. If each of Steph’s shots had a 35.6% chance to go in at random, Steph’s shooting percentage would be 9.6 standard deviations from the NBA average (based on 3,681 attempts). This is extremely unlikely.
This confirms what we all know: Steph Curry is a great shooter, probably the best to have ever played the game.
In contrast, Russell Westbrook has made 30.4% of his 2,150 attempts from 3. His percentage is five standard deviations below the NBA average. That’s the kind of ineptitude you expect if you made Dave Gettleman the CTO of your sports analytics startup.
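Here is a short sketch of the standard deviation calculation for both players. It assumes the same binomial model as the random assumption, and the helper name is hypothetical.

```python
import math

LEAGUE_AVG = 0.356  # league average 3 point percentage from the article


def z_score(pct: float, attempts: int, p: float = LEAGUE_AVG) -> float:
    """Number of standard deviations between a player's percentage and
    the league average, if every shot went in with probability p."""
    standard_error = math.sqrt(p * (1 - p) / attempts)
    return (pct - p) / standard_error


curry = z_score(0.432, 3681)      # numbers quoted in the article
westbrook = z_score(0.304, 2150)  # numbers quoted in the article
print(f"Curry: {curry:+.1f}, Westbrook: {westbrook:+.1f}")
```

With the figures from the article, this gives roughly 9.6 above average for Curry and 5.0 below average for Westbrook.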
Westbrook is not the worst in the NBA over this period by the standard deviation analysis. More on that later.
These outliers seem to confirm the skill in three point shooting. Let’s put some numbers behind this.
Skill vs luck
To distinguish skill from luck, I’ll use an idea from Michael Mauboussin’s book The Success Equation. He defined a model in which outcomes are a combination of skill and luck.
outcome = skill + luck
For 3 point shooting percentage, some of a player’s results come from skill while the remainder comes from luck. There’s always some randomness. A shot is not always going in, even if Steph Curry is wide open.
Consider the variance in outcomes. Based on this simple model, we get:
Var(outcome) = Var(skill) + Var(luck)
In taking the variance of the previous equation, there is usually a term that considers the correlation between skill and luck. However, by definition, I’m assuming that there’s no correlation between skill and luck. Every player has an equal chance to get lucky.
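For reference, the general identity for the variance of a sum includes that extra term. Written out in LaTeX, with the covariance capturing any correlation between skill and luck:

```latex
\mathrm{Var}(\text{outcome})
  = \mathrm{Var}(\text{skill}) + \mathrm{Var}(\text{luck})
  + 2\,\mathrm{Cov}(\text{skill}, \text{luck})
```

Setting the covariance to zero recovers the simpler equation used in this article.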
Let’s go back to our random assumption in which every player makes shots at the same rate. If we express each player’s shooting percentage as a number of standard deviations from the NBA average, these values follow a normal distribution with a variance of 1. There is no skill in this model.
Let’s compare this assumption with the actual data on NBA players over the past 6 seasons. The wider this distribution in 3 point shooting percentage, the more skill in 3 point shooting.
To measure skill, I consider the fraction of variance in 3 point shooting percentage, Var(outcome), explained by skill, Var(skill). This is similar to the previous idea of predictability. A player’s 3 point shooting percentage from last season explained 14.5% of the variance in a player’s 3 point shooting percentage this season.
For both predictability and skill, we ask how much of the variance in outcome can be explained by another quantity.
- For predictability, how much of the variance in this season’s 3 point shooting percentage does last season’s data explain?
- For skill, how much bigger is the variance in player 3 point shooting percentage than the variance based on the random assumption?
In the NBA, 78% of the variance in 3 point shooting percentage is explained by skill.
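One way to turn this idea into a number is sketched below. It is my reading of the approach rather than the exact calculation behind the 78% figure: it assumes each player’s result is expressed in standard deviations from the league average, so luck alone gives a variance of 1. The function name and the sample values are made up for illustration.

```python
import statistics


def skill_fraction(z_scores: list[float]) -> float:
    """Fraction of outcome variance attributed to skill, using
    Var(skill) = Var(outcome) - Var(luck) with Var(luck) = 1."""
    var_outcome = statistics.pvariance(z_scores)
    if var_outcome <= 1.0:
        return 0.0  # no spread beyond what luck alone would produce
    return (var_outcome - 1.0) / var_outcome


# Made-up standard deviation values for five hypothetical players.
example = [-3.0, -1.0, 0.0, 1.0, 3.0]
print(f"skill fraction: {skill_fraction(example):.2f}")
```

Under this reading, a skill fraction of 78% would correspond to a variance of about 4.5 in the standard deviation values, since 1 - 1/4.5 is roughly 0.78.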
To put this into perspective, let’s look at Mauboussin’s results on teams. If winning NFL games were all luck, each game would be a 50-50 coin flip. The distribution of win percentage for teams would have a certain width based on a 16 game season.
The actual distribution of win percentages is wider than the random assumption, and Mauboussin calculated the following:
- NFL: 62% of variance in win percentage is explained by skill
- NBA: 88% of variance in win percentage is explained by skill
There is more skill in shooting 3 pointers than winning NFL games, as skill explains 78% of the variance in outcomes. However, there is less skill in shooting 3’s than winning NBA games.
For another perspective, let’s look at free throw shooting. Based on the same 6 season NBA data set, skill explains 98% of the variance in free throw shooting percentage.
On the good side, Damian Lillard is 16.8 standard deviations higher than average. On the bad side, Andre Drummond is more than 33 standard deviations worse than NBA average. That is some massive Dave Gettleman ineptitude.
This analysis supports the idea of skill in shooting a basketball. Almost all of the outcome in free throw percentage is skill. It’s the player and the basket. The analysis reveals less skill in 3 point shooting, presumably because of increased randomness due to factors such as defense.
Unlike 3 pointers, free throws are also highly predictable. A player’s free throw shooting percentage from last season explains 70% of the variance in free throw shooting percentage in the current season. We expect that high degree of predictability when a statistic is 98% skill.
3 point shooting percentage is not a strong predictor. From before, a player’s data from last season explains 14.5% of the variance in 3 point shooting percentage this season. This is despite the analysis that 3 point shooting is 78% skill.
3 point shooting is a skill but not predictable. Let’s look at an example.
Unpredictability of 3 point shooting
In performing this analysis, Duncan Robinson of the Miami Heat showed up as one of the best shooters in the NBA. In the 2019-20 season before the lockdown, he made 45% of his 3 point shots.
Robinson played his college basketball at Michigan. He was a senior in the spring of 2018 when Michigan made a run to the NCAA tournament title game against Villanova.
Robinson was a great shooter in college. This was obvious either from looking at his shooting motion or his numbers from his first two seasons at Michigan. But as a senior, he only shot 38% from 3.
However, those 203 attempts his senior year didn’t provide a sufficient sample to predict future performance. In his second NBA season, Robinson has shown his skill as a three point shooter.
In contrast, Giannis Antetokounmpo does not have skill in shooting 3 pointers. Over the past 6 seasons, the Greek Freak was the one player worse than Russell Westbrook. Only an MVP caliber player can make 28% of his 3 pointers and still take almost a thousand attempts.
Better 3 point shooting predictions
There’s another problem with looking at year to year correlations in making statements about predictability, especially in pro sports. We have multiple seasons of data on pro players. This is not college.
To see how an increased sample helps predictability, I took the six season NBA sample and asked how five seasons of data could predict the remaining season. To do this, I took a player and picked one of the six seasons at random. To include this player in the analysis, he needed 100 attempts in the target season and 300 in the remaining seasons.
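The selection step can be sketched as follows. The data layout (a dictionary from season label to makes and attempts) and the function name are assumptions for illustration, and pooling the remaining seasons by total makes over total attempts is one plausible choice rather than a confirmed detail of the analysis.

```python
import random


def split_player(seasons, rng, min_target=100, min_rest=300):
    """Pick one season at random as the target and pool the others.

    seasons: dict mapping a season label to (makes, attempts).
    Returns (past_pct, target_pct), or None if the player does not
    meet the minimum attempts in the target or remaining seasons.
    """
    target = rng.choice(sorted(seasons))
    target_makes, target_attempts = seasons[target]
    rest_makes = sum(m for s, (m, a) in seasons.items() if s != target)
    rest_attempts = sum(a for s, (m, a) in seasons.items() if s != target)
    if target_attempts < min_target or rest_attempts < min_rest:
        return None
    return rest_makes / rest_attempts, target_makes / target_attempts


# Hypothetical player with identical seasons, so the result does not
# depend on which season is drawn as the target.
rng = random.Random(0)
steady = {year: (50, 100) for year in range(2015, 2021)}
print(split_player(steady, rng))
```

Running this over every player gives pairs of past and target percentages, and the squared correlation of those pairs is the variance explained.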
A five year sample was able to explain about 24% of the variance in 3 point shooting percentage in the target season. This doesn’t make me run to put this statistic in a predictive model. However, five seasons gives about a 60% improvement over one season.
For free throw shooting, a five season sample explains 72% of the variance in free throw shooting in the target season. Based on the 70% value from one season, four additional seasons result in about a 3% improvement.
There’s a lot of randomness in three pointers, and a larger sample gives a significant boost in predicting the future. There is less randomness in free throws, and one season is a decent sample to predict the future.
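A small simulation shows why a larger sample helps when a real skill is buried in noise. This is a toy model, not the NBA data: the spread of true ability (0.02) and the 300 attempts per season are invented values, and every name in the code is hypothetical.

```python
import random

LEAGUE_AVG = 0.356  # from the article
SKILL_SD = 0.02     # assumed spread of true ability (illustrative only)
ATTEMPTS = 300      # assumed attempts per season (illustrative only)


def season_pct(rng, p, attempts):
    """Simulate one season of shots that each go in with probability p."""
    makes = sum(rng.random() < p for _ in range(attempts))
    return makes / attempts


def r_squared(xs, ys):
    """Squared Pearson correlation: the share of variance explained."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def simulate(n_players=2000, seed=0):
    """Return variance explained by one past season and by five."""
    rng = random.Random(seed)
    one, five, target = [], [], []
    for _ in range(n_players):
        p = rng.gauss(LEAGUE_AVG, SKILL_SD)  # the player's true skill
        past = [season_pct(rng, p, ATTEMPTS) for _ in range(5)]
        one.append(past[0])
        five.append(sum(past) / 5)
        target.append(season_pct(rng, p, ATTEMPTS))
    return r_squared(one, target), r_squared(five, target)


r2_one, r2_five = simulate()
print(f"one season explains {r2_one:.1%}, five seasons explain {r2_five:.1%}")
```

Every simulated player has a fixed true skill, yet one season of past data explains only a small share of the target season. Averaging five seasons removes much of the luck from the predictor and raises the variance explained.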
Predictability vs skill in the NFL
Here’s the take home message: predictability and skill are related but distinct ideas.
In this analysis, predictability is the correlation of a statistic from an earlier to a later time period. I’m defining skill in terms of the distribution of player statistics over a six year period in the NBA. The variance of this distribution in excess of a random assumption is defined as skill.
Usually, predictability and skill are related. We saw this with free throw shooting. However, skill does not imply predictability. 3 point shooting is a skill, an intuitive result confirmed by the analysis in this article. However, a player’s three point shooting percentage in the past struggles to predict the future.
Next month, we’ll see how these ideas apply to the NFL. In particular, we’ll look at the 32 most important men in sports: NFL quarterbacks.