How to instantly evaluate a football game

During the 2014 season, Oregon looked like a clear favorite over Ohio State to win the College Football Playoff title game.

After an early loss to Arizona, Oregon dominated during the last part of the season. Only UCLA came within two touchdowns of beating Oregon. This stretch of games included a rematch against Arizona and the playoff semi-final against Florida State.

Ohio State barely made the college football playoff after an early loss to Virginia Tech, a team that went 3-5 in the ACC. They lost two quarterbacks during the 2014 season, and third stringer Cardale Jones only seemed to excel because he had talented receivers to catch his jump balls.

In addition, the markets opened with Oregon as a 7 point favorite, which implies a 70% win probability. Slam dunk, Ducks.

However, Ohio State dominated Oregon in a 42-20 game to claim the first College Football Playoff title. The Buckeyes earned this massive margin of victory despite committing 3 more turnovers than their opponent.

Was there any way to predict this Ohio State victory? There was, but only if you dug past team rankings and looked into how Ohio State matched up with Oregon.

For me, data visualization played a key role in uncovering the decisive match up. Let me show you.

Oregon’s match up problem

In 2014, Ohio State had an elite ground game. To quantify this, let’s look at an efficiency statistic: yards per carry. In college football, sacks count as rushes in the official statistics. Since sacks are pass plays, I exclude these plays in calculating yards per carry.
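Here is a minimal sketch of that adjustment. It assumes the official rushing line counts each sack as a carry and the yards lost on sacks as negative rushing yards. The function name and the sample numbers are hypothetical and do not come from any team's box score.

```python
def sack_adjusted_ypc(rush_yards, rush_attempts, sacks, sack_yards_lost):
    """Yards per carry with sacks removed from the rushing totals.

    Assumes the official totals include sacks as carries and the
    yards lost on sacks as negative rushing yards.
    """
    carries = rush_attempts - sacks        # drop sacks from the carry count
    yards = rush_yards + sack_yards_lost   # add back the yards lost on sacks
    if carries <= 0:
        raise ValueError("no true rushing attempts")
    return yards / carries


# Hypothetical game: 40 official carries for 180 yards,
# including 3 sacks that lost 20 yards.
print(round(sack_adjusted_ypc(180, 40, 3, 20), 2))
```

In this made-up example the official line reads 4.5 yards per carry, while the sack-adjusted figure comes out higher because the three sacks no longer drag it down.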

To adjust yards per carry for strength of schedule, I use a ranking algorithm I developed based on my research in statistical physics. While Ohio State had the 7th best raw yards per carry, these schedule adjustments move them up to first.

In contrast, Oregon had an average rush defense. They allowed 5.0 yards per carry, more than the 4.8 college football average. After schedule adjustments, Oregon ranked 62nd out of 128 teams in rush defense.

Data visualization to evaluate match ups

To look at how Oregon’s rush defense matched up against Ohio State’s rush offense, we can use data visualization based on data prior to the title game. This visual explains how it works.

[Figure: how to read the match up visual]

For defenses, the better units appear further to the right. This makes it easy to compare with the opposing offense when both units appear on the same line.

In the visual below, the blue dots represent Ohio State’s pass and rush offense while the smaller green dots show Oregon’s defense. Better defenses appear further to the right to facilitate comparisons, as you’re looking at how a unit compares to average.

Ohio State's offense vs Oregon's defense

The gap between Ohio State’s rush offense and Oregon’s rush defense shows the clear advantage for the Buckeyes.

During the championship game, Ohio State didn’t have remarkable team rushing numbers, as they gained 5.2 yards per carry. However, running back Ezekiel Elliott dominated the Oregon defense by rushing for 246 yards on 6.8 yards per carry and 4 touchdowns.

My analysis of this rushing match up appeared on Deadspin prior to the game, and this comment appeared below the article.

It is the start of the fourth, and it is creepy how on point your predictions are.

commenter on Deadspin

It doesn’t always work out this way. Football has too much randomness to be right all the time. But analytics provides a firm baseline for your judgments about football.

Predictions based on match ups in football

Members of The Power Rank have access to my ensemble predictions, which combine many predictions into a single, more accurate one. Before the Ohio State versus Oregon game, this ensemble predicted a 3.2 point win for Oregon, which corresponded to a 59.5% win probability.
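One common way to turn a point spread into a win probability is to treat the final margin as normally distributed around the predicted spread. The sketch below does that. The standard deviation of 13.3 points is an assumption, picked because it roughly reproduces the figures quoted in this article (7 points and 70%, 3.2 points and 59.5%). It is not necessarily the conversion used by the markets or by the ensemble.

```python
import math

# Assumed standard deviation of the final margin around the predicted
# spread, in points. A hypothetical value chosen to roughly match the
# probabilities quoted in the article, not the author's actual model.
ASSUMED_MARGIN_SD = 13.3


def win_probability(spread, sd=ASSUMED_MARGIN_SD):
    """Chance the favorite wins, with the margin normal around the spread."""
    return 0.5 * (1.0 + math.erf(spread / (sd * math.sqrt(2.0))))


for spread in (7.0, 3.2):
    print(f"{spread:.1f} point favorite: {win_probability(spread):.1%}")
```

A spread of zero gives exactly 50% under this model, and larger spreads push the probability toward 100%.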

However, you should never blindly trust numbers, especially in a game with mismatches. One of the predictors in the ensemble accounted for passing and rushing separately for each team. It considered Ohio State’s significant edge in running the ball and that Ohio State ran the ball on 59.3% of plays.

This match up model predicted a 50-50 game between Ohio State and Oregon.

A cheat sheet for every team saves time

Members of The Power Rank also have access to interactive team pages that show these match up visuals. To view a match up, click on the appropriate opponent in the schedule in the upper right corner. To check Ohio State’s team page after the title game against Oregon, click here.

I use these interactive visuals to prepare for every interview, whether it’s the Paul Finebaum show or my weekly appearance on WTKA in Ann Arbor. The visuals save a ton of time, as I can scan through the visual for both passing and rushing to find a potential mismatch.

How to predict interceptions in the NFL, backed by surprising science

Turnovers play a critical role in football.

A tipped pass for an interception or crushing hit for a fumble can decide a close game. No coach emerges from a press conference without touting the importance of winning the turnover battle.

However, not all turnovers are created equal. In the 2013 NFL regular season, teams with more interceptions than their opponent won the game 80% of the time. Teams that forced and recovered more fumbles than their opponents won the game 70% of the time.

Interceptions have a bigger impact because the defender is most likely on his feet after the takeaway. This can lead to a big swing in field position or even a score. Defenders that recover fumbles tend to fall on the ball.

What factors affect interceptions in the NFL? Here, we’ll look at the surprising analytics behind interceptions.

You can do better than guessing that each team will throw picks on 2.9% of pass attempts, the NFL average. And it doesn’t involve an arcane statistic that comes from charting games. The critical numbers are in the box score, although it might not be the numbers you expect.

We’ll also look at how this analysis changes the predicted point spread for a game.

How pass rush affects interceptions

Seattle cornerback Richard Sherman led the NFL in interceptions in 2013. Despite all of his public claims about being the best cornerback in the league, Sherman credits his front seven for much of his success.

Pass rush is an obvious candidate to affect interceptions. The more often a defense applies pressure on the quarterback, the more often he throws an errant pass. Or perhaps the defender strikes the quarterback’s arm, causing a wobbly pass to fall into the hands of the defense.

To study this, we need to measure the strength of the pass rush. To start, let’s look at sacks, a number that requires proper context. A defense might rack up more sacks by facing more pass attempts. To account for this, let’s use sack rate, or sacks divided by the sum of pass attempts and sacks, as a measure of pass rush.
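As a quick sketch, sack rate can be computed as below. The function name and the sample season are hypothetical.

```python
def sack_rate(sacks, pass_attempts):
    """Sacks per dropback, where dropbacks = pass attempts + sacks."""
    dropbacks = pass_attempts + sacks
    if dropbacks <= 0:
        raise ValueError("no dropbacks")
    return sacks / dropbacks


# Hypothetical defense: 40 sacks while facing 560 pass attempts.
print(f"{sack_rate(40, 560):.1%}")
```

Dividing by dropbacks rather than pass attempts alone keeps the rate between 0 and 1 and puts defenses that faced different numbers of pass plays on the same scale.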

To determine whether pass rush causes interceptions, consider NFL defenses in the regular season from 2003 through 2013. While I expected defenses with a better sack rate to have a higher interception rate, there’s no correlation between these two quantities for these 352 defenses.

For those with a technical inclination, sack rate explains less than 1% of the variance in interception rate. For everyone else, check out the left panel of the visual in the next section.
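For readers who want to check a number like this themselves, here is a minimal sketch of "variance explained" for two statistics. For a simple linear regression it equals the squared Pearson correlation. The sample rates are hypothetical and are not the 352 defenses from the study.

```python
def variance_explained(xs, ys):
    """R-squared of a simple linear regression of ys on xs.

    For one predictor this equals the squared Pearson correlation.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample has zero variance")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical sack rates and interception rates for five defenses.
sack_rates = [0.050, 0.062, 0.071, 0.058, 0.080]
interception_rates = [0.031, 0.027, 0.030, 0.026, 0.029]
print(f"{variance_explained(sack_rates, interception_rates):.1%}")
```

A value near 0% means knowing one statistic tells you almost nothing about the other, while 100% means the points fall exactly on a line.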

Richard Sherman might be a great cornerback because of Seattle’s pass rush. However, his pass rush doesn’t explain his high interception total in 2013.

How pass protection affects interceptions

If pass rush has no effect on a defense’s interceptions, what about pass protection on offense? An offensive line that keeps pass rushers away from the quarterback might result in fewer interceptions.

Over the same 11 regular seasons, the sack rate allowed by an offense explains 6% of the variance in the interception rate. While this correlation is stronger than on defense, I still do not recommend using sacks to predict interceptions. The right panel of the visual shows why.

[Figure: sack rate versus interception rate, for defenses (left panel) and offenses (right panel)]

We can dig even deeper into pass protection. Over the last 5 seasons, the NFL has tracked QB hits, or the number of times the quarterback gets hit after releasing the ball. We can now calculate the rate at which an offensive line allows hits on the quarterback (the sum of QB hits and sacks divided by the sum of pass attempts and sacks).
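A minimal sketch of this hit rate follows, using the definition in the text. The function names and the sample offense are hypothetical.

```python
def rate_per_dropback(events, pass_attempts, sacks):
    """Events per dropback, where dropbacks = pass attempts + sacks."""
    dropbacks = pass_attempts + sacks
    if dropbacks <= 0:
        raise ValueError("no dropbacks")
    return events / dropbacks


def qb_hit_rate(qb_hits, sacks, pass_attempts):
    """(QB hits + sacks) / (pass attempts + sacks), as defined in the text."""
    return rate_per_dropback(qb_hits + sacks, pass_attempts, sacks)


# Hypothetical offense: 30 sacks and 90 QB hits on 570 pass attempts.
print(f"sack rate: {rate_per_dropback(30, 570, 30):.1%}")
print(f"hit rate:  {qb_hit_rate(90, 30, 570):.1%}")
```

The two rates share a denominator, so the gap between them shows how often the quarterback was hit without being sacked.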

This QB hit rate gives a better perspective on pass protection. An offensive line might look good because of a low sack rate. For example, Indianapolis gave up sacks on 5.2% of pass attempts in 2013, 5th best in the NFL.

However, this same offensive line allowed a hit rate of 23%, 26th in the NFL. Andrew Luck’s ability to get rid of the ball in the face of pressure played a big role in their low sack rate. The lack of protection probably also contributed to Luck’s below average completion percentage of 60% in 2013.

However, even a better statistic like QB hit rate doesn’t correlate with interceptions. Hit rate explains 4% of the variance in interception rate, a weaker correlation than shown in the right panel of the visual.

The data does not support the belief that pass rush affects interceptions. I would guess this comes from the ability of NFL quarterbacks to not let pressure affect their accuracy. Of the thousands that play in high school and hundreds that make it to college, only 32 can play in the pros. These quarterbacks do not fold under pressure.

However, these 32 quarterbacks do vary in their accuracy, and that might impact interceptions.

How throwing accuracy affects interceptions

Despite the wobble of his passes, Peyton Manning has shown incredible precision with his throws. Over his career, he has completed 65.5% of his passes. Of active players, only Drew Brees and Aaron Rodgers have a better career completion percentage.

However, Peyton has gotten even better after having multiple neck surgeries. In his last two seasons with Denver, his completion percentage has increased to 68.4%.

Do more accurate quarterbacks throw fewer interceptions? Any fan would rather have Manning and Rodgers leading their offense than Derek Anderson or Brady Quinn. But are fewer interceptions a consequence of a better quarterback?

To answer this question, consider the career statistics for NFL quarterbacks in 2013 with at least 500 career pass attempts. The visual of these 52 players shows the negative correlation between completion percentage and interception rate.

[Figure: completion percentage versus interception rate for 52 NFL quarterbacks]

Peyton Manning is the third point from the right, and he has thrown picks at a higher rate than this regression analysis predicts. Aaron Rodgers has the lowest interception rate of the 3 quarterbacks with a completion rate better than 65%.

The outlier with the lowest interception rate is Nick Foles, the second year quarterback with Philadelphia. As much potential as he has shown, he will not continue to throw interceptions on 1.2% of his pass attempts. The same applies to San Francisco’s Colin Kaepernick, the point with the second lowest interception rate (1.7%).

This correlation does not imply that better accuracy causes fewer interceptions. But this conclusion does seem logical. The quarterback has control over where he throws the ball. The more control he shows, the less likely the ball hits the hands of a defender. There are better ways to look at this causation, but they will have to wait for another day.

For these quarterbacks, completion percentage explains 32% of the variance in interception rate. In the noisy world of football statistics, that’s as strong a relationship as you will see between two statistics. In addition, the correlation also exists for the regular season statistics of offenses from 2003 through 2013. Here, completion percentage explains 20% of the variance in interception rate.

With this strong relationship between accuracy and interceptions, how can we modify a point spread prediction for a game?

How interceptions affect the point spread

To use these results to adjust a prediction, let’s look back at the Super Bowl between Seattle and Denver at the end of the 2013 season. Before the game, the team rankings at The Power Rank predicted Seattle by 1.3 points, which implied a 46% chance for Denver to win.

Denver had a lower likelihood to throw a pick based on Peyton Manning’s accuracy. On average, NFL quarterbacks throw interceptions on 2.9% of pass attempts. With Peyton’s 65.5% career completion percentage, the regression model predicted he would throw interceptions on 2.56% of pass attempts. For a league average 35 pass attempts, this meant 0.14 fewer interceptions for the game.

While such a small fraction of picks might seem inconsequential, the impact of such a turnover makes it matter. From the relationship between interceptions and points in NFL games, the average interception is worth about 5 points. This changed the predicted point spread by 0.7 points in Denver’s favor. Seattle’s predicted margin of victory dropped from 1.3 to 0.6, which increased Denver’s win probability from 46% to 48%.
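The spread adjustment is simple arithmetic, sketched below with the figures quoted above: 0.14 fewer interceptions, about 5 points per interception, and Seattle by 1.3 before the adjustment. The function and constant names are hypothetical.

```python
# The article's estimate of what an average interception is worth, in points.
POINTS_PER_INTERCEPTION = 5.0


def adjust_spread(spread, fewer_interceptions,
                  points_per_pick=POINTS_PER_INTERCEPTION):
    """Favorite's predicted margin after the underdog's interception edge.

    `fewer_interceptions` is how many fewer picks than league average
    the underdog's quarterback is expected to throw in the game.
    """
    return spread - fewer_interceptions * points_per_pick


# Seattle by 1.3 before the adjustment; Denver expected to throw
# 0.14 fewer interceptions than a league average quarterback.
print(round(adjust_spread(1.3, 0.14), 1))  # 0.6
```

The same function works in the other direction: a negative value for `fewer_interceptions` would widen the favorite's margin.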

The game didn’t go Denver’s way. Seattle’s defenders knew what mouthwash Manning used before the game since they spent the entire game in the backfield.

The outcome of interceptions in the Super Bowl

Manning threw 2 interceptions. The first was an errant pass that landed in the hands of Kam Chancellor, a play in which Manning wasn’t pressured that heavily. The second pick came when a defender hit his arm on a throw. The football wobbled into the hands of Malcolm Smith, who ran for a Seattle touchdown.

For the game, Manning threw 49 passes, so the two picks implied a 4.1% interception rate. Even with this small sample size, that is not an outrageous rate. If not for the bad luck on the pick in which his arm got hit, Manning would have had a 2% rate.

Common sense says that pass rush and throwing accuracy affect interceptions. However, the NFL data only shows a link with one of these factors. If you want to predict interceptions, stay away from pass rush statistics and look at completion percentage.

How passing and rushing affect winning in the NFL

Before the Super Bowl, Bill Belichick told his Giants defense to let Thurman Thomas rush for 100 yards.

As David Halberstam writes in Education of a Coach, it was a tough sell before the 1991 Super Bowl against Buffalo. The New York Giants played a physical defense that prided itself on not allowing 100 yard rushers.

No matter, the short, stout coach looked straight into the eyes of Lawrence Taylor and Pepper Johnson and said, “You guys have to believe me. If Thomas runs for a hundred yards, we win this game.”

Just in case his players didn’t listen, Belichick took it upon himself to ensure Thomas got his yards. He took out a defensive lineman and linebacker and replaced these large bodies with two defensive backs. In football lingo, the Giants played a 2-3-6 defense designed to struggle against the run.

Did Bill Belichick go insane? I certainly thought so when I first read this story years ago.

However, analytics is on Belichick’s side. Let me explain.

Visual shows the importance of passing over rushing

When it comes to winning in the NFL, passing is king. Rushing hardly matters.

To quantify this, our football obsessed culture must look past misleading statistics such as rush yards per game. Teams with the lead run the ball to take time off the clock. Any team can rush for 100 yards if they run it 50 times.

To measure true skill, it is better to look at efficiency metrics like yards per attempt. A team can’t fake their way to 5 yards per carry by running the ball more.

Here, efficiency for passing and rushing is defined as yards gained per attempt on offense minus yards allowed per attempt on defense. Higher values indicate more team strength. Sacks count as pass attempts, and these negative yards lower pass efficiency on offense.
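A minimal sketch of this efficiency metric for passing is below. It assumes passing yards are reported before sack losses, so the yards lost on sacks get subtracted and the sacks get added to the attempts. All names and sample totals are hypothetical.

```python
def yards_per_pass_attempt(pass_yards, sack_yards_lost, pass_attempts, sacks):
    """Net yards per pass play, counting sacks as pass attempts.

    Assumes `pass_yards` is gross passing yards before sack losses.
    """
    plays = pass_attempts + sacks
    if plays <= 0:
        raise ValueError("no pass plays")
    return (pass_yards - sack_yards_lost) / plays


def efficiency(gained_per_attempt, allowed_per_attempt):
    """Yards gained per attempt on offense minus yards allowed on defense."""
    return gained_per_attempt - allowed_per_attempt


# Hypothetical season totals for one team.
offense = yards_per_pass_attempt(4000, 200, 550, 30)   # what the offense gained
defense = yards_per_pass_attempt(3500, 250, 540, 40)   # what the defense allowed
print(round(efficiency(offense, defense), 2))
```

Rush efficiency follows the same pattern with yards per carry gained and allowed, so `efficiency` can be reused as is.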

The visual shows the pass and rush efficiency during the regular season for all NFL playoff teams from 2003 through 2012.

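[Figure: regular season pass efficiency (left panel) and rush efficiency for NFL playoff teams, 2003 through 2012]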

From the left panel, playoff teams excel in passing, both throwing the ball on offense and preventing the pass on defense. Only 15 of 120 playoff teams in this era allowed more yards per pass attempt than they gained.

The visual also highlights teams that played in the Super Bowl. Eight of the ten Super Bowl champions were among the NFL’s elite in pass efficiency. However, excellence in the air does not guarantee playoff success. The New York Giants in 2007 and Baltimore in 2012 won the Super Bowl despite subpar pass efficiency.

Rushing hardly matters in the NFL

While the importance of passing in the NFL will not surprise anyone, the insignificance of rushing might. The visual for rush efficiency shows playoff teams as a random scatter of positive and negative values for their regular season statistics. A strong run game on offense and defense does not help a team make the playoffs.

Moreover, teams with a high rush efficiency do not suddenly become clutch in the playoffs. Almost half of the teams that played in the Super Bowl allowed more yards per carry than they gained. In 2006, Indianapolis won the Super Bowl while having the worst rush efficiency in the NFL. Green Bay in 2010 and the New York Giants in 2011 weren’t much better.

A guessing game of a team’s wins

Running the ball does not affect winning as much as you think. To illustrate this point, consider this guessing game. Suppose you want to guess how many games a team will win during the regular season. Without any other data, it makes sense to guess 8, the average number of wins in a 16 game season.

From 2003 through 2012, this estimate would be wrong by 3.1 wins. In technical jargon, 3.1 is the standard deviation of actual wins from the guess of 8. In normal people language, it says 2 of 3 teams will be within 3.1 wins of the guess. About two thirds of NFL teams won between 5 and 11 games between 2003 and 2012.
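To make the error of a constant guess concrete, here is a small sketch that measures how far actual win totals land from a guess of 8. The win totals are hypothetical, not the real 2003 through 2012 data.

```python
import math


def rms_error(actual_wins, guess=8.0):
    """Root mean square distance of actual win totals from a fixed guess."""
    if not actual_wins:
        raise ValueError("need at least one season")
    total = sum((w - guess) ** 2 for w in actual_wins)
    return math.sqrt(total / len(actual_wins))


def share_within(actual_wins, guess, band):
    """Fraction of seasons whose win total lands within `band` of the guess."""
    hits = sum(abs(w - guess) <= band for w in actual_wins)
    return hits / len(actual_wins)


# Hypothetical win totals for ten team seasons.
wins = [12, 4, 8, 10, 6, 13, 3, 9, 7, 8]
error = rms_error(wins)
print(round(error, 2), share_within(wins, 8.0, error))
```

When the guess equals the average, this error is the standard deviation. For roughly bell shaped data about two thirds of the seasons fall within one such error of the guess.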

With the rush efficiency for each team, how much better does your guess get? The right panel of the visual below shows how rush efficiency relates to wins for every NFL team from 2003 through 2012. Simple linear regression gives the best fit line through the data.

[Figure: wins versus pass efficiency (left panel) and rush efficiency (right panel) for NFL teams, 2003 through 2012]

The regression line gives a new guess about the number of games a team will win. For example, suppose a team has a rush efficiency of 0.6 yards per carry. Instead of guessing 8 wins for this team, the line gives 8.7 wins for this team.
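For those who want to try this on their own numbers, here is a minimal sketch of simple linear regression by least squares. The data points are hypothetical, so the fitted slope and intercept are not the ones behind the 8.7 win example above.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for ys as a function of xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs has zero variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """New guess for wins at a given efficiency."""
    return intercept + slope * x


# Hypothetical rush efficiencies (yards per carry) and win totals.
rush_efficiency = [-0.8, -0.3, 0.0, 0.4, 0.9]
wins = [9, 5, 8, 11, 7]
slope, intercept = fit_line(rush_efficiency, wins)
print(round(predict(slope, intercept, 0.6), 1))
```

The fitted line always passes through the average efficiency and the average win total, which is why a team with average efficiency still gets a guess near 8 wins.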

How much better are these new guesses? Not much. The error only drops from 3.1 wins to 3.03 wins. In technical jargon, rush efficiency explains only 4.4% of the variance in wins. You might as well guess randomly.
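The drop in error and the variance explained are two views of the same thing. For a least squares line, the typical error shrinks by a factor of the square root of one minus the variance explained. The sketch below applies that relation to the rounded figures quoted above, so it reproduces the article's numbers only approximately. The function name is hypothetical.

```python
import math


def error_after_regression(baseline_error, variance_explained):
    """Typical error left over once a regression explains part of the variance."""
    if not 0.0 <= variance_explained <= 1.0:
        raise ValueError("variance_explained must be between 0 and 1")
    return baseline_error * math.sqrt(1.0 - variance_explained)


# Guessing 8 wins is off by 3.1; rush efficiency explains 4.4% of the variance.
print(round(error_after_regression(3.1, 0.044), 2))  # 3.03
```

Because of the square root, a statistic has to explain a large share of the variance before the error in wins drops by much.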

The results get better using pass efficiency, as shown in the left panel. The error in estimating wins drops from 3.1 to 1.96. Pass efficiency explains 62% of the variance in wins in the NFL. The strong relationship is clear from the visual.

In college football, rush efficiency correlates more strongly with wins than in the NFL. Teams like Alabama, Stanford and Wisconsin have won with a power running game and a physical front seven on defense. The insignificance of running the ball is unique to the NFL.

Analytics gives a broad view of how passing and rushing affect winning. But to dig deeper, let’s look at specific teams and their strengths in these areas.

Indianapolis Colts

Under the leadership of GM Bill Polian and QB Peyton Manning, the Colts had a remarkable run from 2003 through 2010. They won at least 12 games each year before slacking off with 10 wins in 2010.

They achieved success through the air, ranking in the top 8 in pass efficiency each year. Peyton Manning and his offense played the bigger role, but the pass defense helped out some years. The Colts ranked in the top 10 in pass defense (yards allowed per attempt) from 2007 through 2009.

However, Indianapolis was really bad in the run game. Only once in this era (2007) did they gain more yards per carry than they allowed. As mentioned before, they were dead last in the NFL in rush efficiency in 2006 when they beat Chicago in the Super Bowl.

New England Patriots

New England won 125 games, 2 Super Bowls and played in 2 others during the 10 seasons covered by the visual. They followed the same script as Indianapolis: strong in passing, weak in rushing.

From 2003 through 2012, New England ranked in the top 10 in pass efficiency in each year except 2008 and 2012. In 2008, QB Tom Brady got hurt in the first game of the season. New England ended the season 13th in yards gained per pass attempt and did not make the playoffs, the only time this happened during these 10 years.

However, New England has never cracked the top 10 in rush efficiency. Coach Bill Belichick might not have seen the data presented here, but he gets the futility of rushing in the NFL. This understanding extends as far back as his days as defensive coordinator for the Giants.

Indianapolis and New England have built their teams around passing at the expense of rushing. They, along with New Orleans of recent seasons, have had success in winning games and Super Bowls. Now let’s look at teams that excel at rushing.

Minnesota Vikings

More than any other team, the Vikings dominate the ground game. They feature RB Adrian Peterson on offense and have tackles Pat and Kevin Williams clogging up the middle on defense. In the 6 seasons from 2007 through 2012, Minnesota finished 1st in rush efficiency 4 times.

However, this strength has led to ups and downs in wins. Minnesota went 3-13 in 2011 despite leading the NFL in rush efficiency. The next season, they led the NFL again behind a monster season from Peterson, who made a remarkable return from knee surgery. The Vikings had a 10-6 record that season.

The Vikings’ best season over this stretch came in 2009. They finished 12th in rush efficiency that season. The difference? A QB named Brett Favre came out of retirement to play for Minnesota. The Vikings finished 7th in yards gained per pass attempt. They went 12-4 and came within a late turnover against New Orleans of playing in the Super Bowl.

San Francisco

The Niners started winning games when Jim Harbaugh became coach in 2011. However, they had their strengths before he arrived. Behind DE Justin Smith and LB Patrick Willis, San Francisco had an elite run defense. From 2007 through 2012, they never finished worse than 8th in yards allowed per carry.

This run defense didn’t help them win much the first 4 seasons, as the Niners won only 26 games. The pass defense never finished better than 15th during this time.

When Harbaugh arrived in 2011, San Francisco drafted LB Aldon Smith, a pass rush monster out of Missouri. They also signed CB Carlos Rogers, who had his first Pro Bowl season in 2011. The Niners have finished 9th and 3rd in pass defense in 2011 and 2012 respectively. This resulted in 24 wins during these two seasons.

How to evaluate NFL statistics

In Super Bowl XXV, Bill Belichick’s plan to let Thurman Thomas rush for 100 yards worked, maybe too well. Against a small defense designed to slow down the pass, Thomas ran for 135 yards on 15 carries, a staggering 9 yards per carry. In the second half, he broke off a 31 yard run for a touchdown.

The game ended when Bills kicker Scott Norwood sent a field goal attempt wide right. The Giants won the Super Bowl 20-19.

The Giants did not win the game solely because of Belichick’s defensive plan. The offense generated two long scoring drives in the second half that took time off the clock. And I would bet my life savings Belichick did not want his defense to allow that 31 yard touchdown run to Thomas.

But, as Halberstam discusses in Education of a Coach, Belichick did want the Bills to pick up small gains on the ground if it meant keeping Jim Kelly from throwing the ball. He understood that rushing means almost nothing to winning in the NFL.

If you’re going to remember anything from this article, it should be this: look at a team’s passing instead of rushing numbers to determine whether they will win games.

How To Instantly Get Smarter About Your Team’s Next Game

I was frustrated with sports websites. Sure, all the big media sites have college football statistics. You can find rankings for all 124 bowl subdivision teams in many categories. If you need even more categories, check out cfbstats.com, which has everything from turnover margin to third down conversion against championship subdivision opponents.

But what I really wanted was how a team’s statistics compared with their next opponent. If Stanford has rushed for 3.7 yards per carry this season, I want to see that number next to 2.6, the yards allowed per carry by their next opponent USC. (I just tried to find these numbers on ESPN, and they don’t even have the defensive number for USC. I found it on cfbstats.com.) Of course, if these numbers were adjusted for strength of schedule, that would be even better.

So we started designing team pages that would show these opposing statistics next to each other. My friend Angi Chau had come up with a beautiful interactive visualization for the March Madness bracket in less than a week. How hard could it be to do the same for match up statistics?

Hard. We banged our heads against the wall for months. Finally, we came up with a solution that focuses on simplicity. It’s still not completely satisfying, since there’s a learning curve for the user.

However, we’ve reduced that learning curve to a minute. The front page of our premium college football product explains how and why better defenses appear further to the right in the visualization. Then, you can instantly get a feel for your team’s next game by looking at the team pages. These pages show our rushing and passing numbers that have been adjusted for strength of schedule. The opposing units are next to each other. Click on other teams in the schedule to see different match up statistics.

I love playing with these team pages. I hope you do as well. The front page has a menu for all of the team pages so far, which includes all big conference teams.

Thanks for reading.

How to Visualize Match Ups in College Football

An infographic on how we show match ups in college football

We’re working on how to visualize match ups in football. This graphic attempts to explain the bottom panel.

Does the graphic make sense?

All the numbers come from our ranking algorithm applied to the 2011 season.

For full passing and rushing rankings, click here.

Please leave us a comment.