Will a 16 seed ever beat a 1 seed?

It will happen. Maybe not this year, maybe not next year, but a 1 seed will fall to a 16 seed.

Just look at the last two seasons. Syracuse needed help from the referees two seasons ago to escape UNC Asheville. Southern took Gonzaga down to the wire before Kevin Pangos hit some shots to put the game away.

Numbers also support the claim that a 1 seed will topple in their first game. Let me explain.

Interactive bracket with win probabilities

Based on CBS’s bracket projection, I used my college basketball rankings to calculate the win probabilities for each team. This interactive bracket displays these 416 numbers, which include the chance that each team advances through each round.

Hover over each game to see the chance that each team has to win that game. Hover over each team to see their probability to advance through each round, as shown for Villanova in the visual.

Clearly, this bracket is more useful after Selection Sunday, a short 10 days from now. However, let’s look at the likelihood that any of the 4 one seeds go down.

The numbers behind the 1 vs 16 matchup

My win probabilities imply a 34.7% chance for at least one 16 seed to beat a 1 seed in the CBS bracket projection.

This chance is too high. My win probabilities for games with big point spreads tend to favor the underdog too much. For example, Florida Gulf Coast has an 8.7% chance to upset Arizona. It’s probably more like 4% (there is no math behind this number).

However, consider the hypothetical world in which each 1 seed had a 99% chance to win over a 16 seed. The next 10 tourneys will feature 40 such games. To determine the chance that all 40 top seeds win, you take 0.99, multiply by 0.99, and do this another 38 times to account for all 40 games.

The answer? There’s a 66.9% chance all 40 teams win. This leaves a 33.1% chance that at least one 16 seed wins a game.

I have a longstanding bet with a friend over this. We bet a six pack of beer over whether a 1 seed would lose to a 16 seed over the next ten years. (Neither of us remembers when we made this bet, so the 10 years end after I win.)

Based on the 33.1% chance above, you may think I got the bad end of the deal. However, most 16 seeds have a much better than 1% chance to pull the upset. Even if you assume a 2% win probability, my chance of winning over the 10 year period goes up to 55%.
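The 33.1% and 55% figures are easy to check. Here is a minimal Python sketch, assuming the 40 games are independent and every 1 seed has the same chance to win. The function name is made up for this illustration.

```python
def chance_of_upset(favorite_win_prob, games=40):
    """Chance that at least one favorite loses across independent games."""
    all_favorites_win = favorite_win_prob ** games
    return 1.0 - all_favorites_win


for p in (0.99, 0.98):
    print(f"1 seed wins {p:.0%} of the time: "
          f"{chance_of_upset(p):.1%} chance of at least one upset in 40 games")
```

Dropping the favorite from 99% to 98% sounds small, but the exponent of 40 magnifies it enough to push the bet past even odds.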

Can this be the year?

Robert Morris against Villanova

The CBS bracket pits 1 seed Villanova against 16 seed Robert Morris, a game that should excite any fan who lives all year for the upsets of the first 2 days of the tourney.

Villanova is overrated as a 1 seed, as my team rankings place them 8th. Moreover, they shoot 3 point shots on 46% of field goal attempts, 6th most in the nation. Live by the 3, die by the 3.

In addition, Robert Morris shoots 39% from 3 point range. If they get hot while Villanova goes cold, I could be collecting my six pack.

Of course, it’s too early to get excited with this mock bracket. Villanova could drop from a 1 seed if they lose early in the Big East tournament. Robert Morris must win their conference tournament to even make the field.

The likelihood of a 16 seed upset is higher if the best teams in the lower ranked conferences win their tournament. For example, if a team like Savannah State, ranked 294th, wins the Mid-Eastern conference tourney, then they would be a 16 seed in the tourney. A team that my numbers consider 7.5 points worse than average will not help my chances of a 1 seed upset.

But one of these years, a 16 seed will beat a 1 seed. You heard it here first.

Finally!! Interactive win probabilities for March Madness in February

Do you find bracketology weird?

There’s an industry of people that predict how a committee will seed the NCAA men’s basketball tournament next month.

I get it. Fans want to know whether their team will make the tourney and where they’ll be seeded.

But isn’t the more important question how far your team will advance? Or whether they will win the tourney?

The odds of advancing in the tourney

For the last 2 years, The Power Rank has calculated the odds for each team to advance through the tourney. This starts with the college basketball team rankings, which imply a win probability for each game. The multiplication and addition of these numbers gives the odds for advancing through the rounds.
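To make the "multiplication and addition" concrete, here is a minimal Python sketch for a four-team bracket. The team names and head-to-head win probabilities are invented for illustration. They are not The Power Rank's numbers, and the real calculation may differ in its details.

```python
# Hypothetical head-to-head win probabilities: WIN_PROB[a][b] is the
# chance that team a beats team b. These values are invented.
WIN_PROB = {
    "A": {"B": 0.90, "C": 0.70, "D": 0.75},
    "B": {"A": 0.10, "C": 0.30, "D": 0.35},
    "C": {"A": 0.30, "B": 0.70, "D": 0.55},
    "D": {"A": 0.25, "B": 0.65, "C": 0.45},
}


def advance_probs(bracket):
    """Return the chance that each team wins the given bracket.

    A bracket is either a team name or a pair (left, right) of brackets.
    """
    if isinstance(bracket, str):
        return {bracket: 1.0}
    left = advance_probs(bracket[0])
    right = advance_probs(bracket[1])
    result = {}
    for side, other in ((left, right), (right, left)):
        for team, reach in side.items():
            # Addition: sum over every opponent the team might face,
            # weighting each by the chance that opponent gets this far.
            beat = sum(p_opp * WIN_PROB[team][opp]
                       for opp, p_opp in other.items())
            # Multiplication: the team must reach this game and then win it.
            result[team] = reach * beat
    return result


# Semifinals are A vs B and C vs D. The winners meet in the final.
title_odds = advance_probs((("A", "B"), ("C", "D")))
for team, p in sorted(title_odds.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {p:.1%} to win the title")
```

The same recursion extends to a full bracket by nesting the pairs deeper. The title odds always sum to 100%, which makes a handy sanity check.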

Usually, these results appear after the bracket is announced on Selection Sunday. However, I thought you might be interested in these results much earlier.

The interactive data visualization shows win probabilities based on ESPN’s bracket projection. Hover over a team to see their probability to advance through each round. Hover over a game to see the odds for each team to win the game. To play with the interactive visual, click here.

Let’s look at a few features of these results.

A more competitive tourney

Arizona tops my rankings, and they appear as a one seed in the bracket prediction. However, they have only an 11% chance to win the tourney. To put this in perspective, Kentucky had a 16.5% chance to win the tourney in 2012.

The lower win probability for Arizona results from their rating, or a projected margin of victory against an average college basketball team. Arizona has a rating of 17.8, tops in the nation. However, this rating is more than a point lower than Kentucky’s in 2012 as they headed into the tourney.

These ratings will change before the tourney starts, but it looks like a more competitive tourney than ever.

The value in Pittsburgh

Pittsburgh had Syracuse on the ropes at home on Wednesday night. They held a one point lead with 4 seconds remaining. But then Syracuse freshman Tyler Ennis hit a dazzling 35 foot shot to win the game.

Pittsburgh got robbed of a big win, sending them down to a 7 seed in this projected bracket. However, the Panthers are 16th in The Power Rank. They have a 2% chance to win it all.

Almost no one will pick Pittsburgh as their champion in your pool. If Pittsburgh does actually win, you stand an excellent chance to win the pool, even if it has up to 100 entries.

However, Pittsburgh was also a good value pick last season, with a 1.5% win probability as an 8 seed. They lost in the first round to Wichita State.

Check out the interactive March Madness bracket

To play with the win probabilities, click here.

Also, I give advice on how to fill out your bracket when the actual bracket is announced. To get this information, sign up for my free email newsletter. Just enter your email address and click on “Sign up now”.

How to predict interceptions in the NFL, backed by surprising science

Turnovers play a critical role in football.

A tipped pass for an interception or crushing hit for a fumble can decide a close game. No coach emerges from a press conference without touting the importance of winning the turnover battle.

However, not all turnovers are created equal. In the 2013 NFL regular season, teams with more interceptions than their opponent won the game 80% of the time. Teams that forced and recovered more fumbles than their opponents won the game 70% of the time.

Interceptions have a bigger impact because the defender is most likely on his feet after the takeaway. This can lead to a big swing in field position or even a score. Defenders that recover fumbles tend to fall on the ball.

What factors affect interceptions in the NFL? Here, we’ll look at the surprising analytics behind interceptions.

You can do better than guessing that each team will throw picks on 2.9% of pass attempts, the NFL average. And it doesn’t involve an arcane statistic that comes from charting games. The critical numbers are in the box score, although it might not be the numbers you expect.

We’ll also look at how this analysis changes the predicted point spread for a game.

How pass rush affects interceptions

Seattle cornerback Richard Sherman led the NFL in interceptions in 2013. Despite all of his public claims about being the best cornerback in the league, Sherman credits his front seven for much of his success.

Pass rush is an obvious candidate to affect interceptions. The more often a defense applies pressure on the quarterback, the more often he throws an errant pass. Or perhaps the defender strikes the quarterback’s arm, causing a wobbly pass to fall into the hands of the defense.

To study this, we need to measure the strength of the pass rush. To start, let’s look at sacks, a number that requires proper context. A defense might rack up more sacks by facing more pass attempts. To account for this, let’s use sack rate, or sacks divided by the sum of pass attempts and sacks, as a measure of pass rush.
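As a quick illustration of why the rate matters more than the raw count, here is a minimal Python sketch. The season totals are hypothetical, not real team data.

```python
def sack_rate(sacks, pass_attempts):
    """Sacks per dropback, where a dropback is a pass attempt or a sack."""
    return sacks / (pass_attempts + sacks)


# Hypothetical season totals for two defenses (not real data).
busy_defense = sack_rate(sacks=45, pass_attempts=600)
quiet_defense = sack_rate(sacks=40, pass_attempts=480)

print(f"45 sacks on 645 dropbacks: {busy_defense:.1%}")
print(f"40 sacks on 520 dropbacks: {quiet_defense:.1%}")
```

The defense with fewer sacks has the higher sack rate because it faced far fewer dropbacks.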

To determine whether pass rush causes interceptions, consider NFL defenses in the regular season from 2003 through 2013. While I expected defenses with a better sack rate to have a higher interception rate, there’s no correlation between these two quantities for these 352 defenses.

For those with a technical inclination, sack rate explains less than 1% of the variance in interception rate. For everyone else, check out the left panel of the visual in the next section.
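For those who want to see what "variance explained" means in practice, here is a minimal Python sketch. With a single predictor, the fraction of variance explained is the square of the Pearson correlation between the two quantities. The five data points are made up and stand in for the 352 real defenses.

```python
from math import sqrt


def variance_explained(xs, ys):
    """Square of the Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / sqrt(var_x * var_y)
    return r ** 2


# Made-up sack rates and interception rates (in percent) for five defenses.
sack_rates = [5.1, 6.4, 7.0, 7.8, 8.5]
int_rates = [3.1, 2.4, 3.3, 2.6, 2.9]

print(f"{variance_explained(sack_rates, int_rates):.1%} of variance explained")
```

A value near 0% means knowing one number tells you almost nothing about the other. A value of 100% means the points fall exactly on a line.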

Richard Sherman might be a great cornerback because of Seattle’s pass rush. However, his pass rush doesn’t explain his high interception total in 2013.

How pass protection affects interceptions

If pass rush has no effect on a defense’s interceptions, what about pass protection on offense? An offensive line that keeps pass rushers away from the quarterback might result in fewer interceptions.

Over the same 11 regular seasons, the sack rate allowed by an offense explains 6% of the variance in the interception rate. While this correlation is stronger than on defense, I still do not recommend using sacks to predict interceptions. The right panel of the visual shows why.

[Figure: sack rate versus interception rate, 2003 through 2013. Left panel: defenses. Right panel: offenses.]

We can dig even deeper into pass protection. Over the last 5 seasons, the NFL has tracked QB hits, or the number of times the quarterback gets hit after releasing the ball. We can now calculate the rate at which an offensive line allows hits on the quarterback (the sum of QB hits and sacks divided by the sum of pass attempts and sacks).

This QB hit rate gives a better perspective on pass protection. An offensive line might look good because of a low sack rate. For example, Indianapolis gave up sacks on 5.2% of pass attempts in 2013, 5th best in the NFL.

However, this same offensive line allowed a hit rate of 23%, which ranked 26th in the NFL. Andrew Luck’s ability to get rid of the ball in the face of pressure played a big role in their low sack rate. The lack of protection probably also contributed to Luck’s below average completion percentage of 60% in 2013.

However, even a better statistic like QB hit rate doesn’t correlate with interceptions. Hit rate explains 4% of the variance in interception rate, a weaker correlation than shown in the right panel of the visual.

The data does not support the belief that pass rush affects interceptions. I would guess this comes from the ability of NFL quarterbacks to not let pressure affect their accuracy. Of the thousands that play in high school and hundreds that make it to college, only 32 can play in the pros. These quarterbacks do not fold under pressure.

However, these 32 quarterbacks do vary in their accuracy, and that might impact interceptions.

How throwing accuracy affects interceptions

Despite the wobble on his passes, Peyton Manning has shown incredible precision with his throws. Over his career, he has completed 65.5% of his passes. Of active players, only Drew Brees and Aaron Rodgers have a better career completion percentage.

However, Peyton has gotten even better after having multiple neck surgeries. In his last two seasons with Denver, his completion percentage has increased to 68.4%.

Do more accurate quarterbacks throw fewer interceptions? Any fan would rather have Manning and Rodgers leading their offense than Derek Anderson or Brady Quinn. But are fewer interceptions a consequence of a better quarterback?

To answer this question, consider the career statistics for NFL quarterbacks in 2013 with at least 500 career pass attempts. The visual of these 52 players shows the negative correlation between completion percentage and interception rate.

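[Figure: career completion percentage versus interception rate for the 52 NFL quarterbacks with at least 500 career pass attempts.]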

Peyton Manning is the third point from the right, and he has thrown picks at a higher rate than this regression analysis predicts. The outlier with the lowest interception rate is Nick Foles, the second year quarterback with Philadelphia. As much potential as he has shown, he will not continue to throw interceptions on 1.2% of his pass attempts. The same applies to San Francisco’s Colin Kaepernick, the point with the second lowest interception rate (1.7%).

This correlation does not imply that better accuracy causes fewer interceptions. But this conclusion does seem logical. The quarterback has control over where he throws the ball. The more control he shows, the less likely the ball hits the hands of a defender. There are better ways to look at this causation, but they will have to wait for another day.

For these quarterbacks, completion percentage explains 32% of the variance in interception rate. In the noisy world of football statistics, that’s as strong a relationship as you will see between two statistics. In addition, the correlation also exists for the regular season statistics of offenses from 2003 through 2013. Here, completion percentage explains 20% of the variance in interception rate.

With this strong relationship between accuracy and interceptions, how can we modify a point spread prediction for a game?

How interceptions affect the point spread

To use these results to adjust a prediction, let’s look back at the Super Bowl between Seattle and Denver at the end of the 2013 season. Before the game, the team rankings at The Power Rank predicted Seattle by 1.3 points, which implied a 46% chance for Denver to win.

Denver had a lower likelihood to throw a pick based on Peyton Manning’s accuracy. On average, NFL quarterbacks throw interceptions on 2.9% of pass attempts. With Peyton’s 65.5% career completion percentage, the regression model predicted he would throw interceptions on 2.56% of pass attempts. For a league average 35 pass attempts, this meant 0.14 fewer interceptions for the game.

While such a small fraction of picks might seem inconsequential, the impact of such a turnover makes it matter. From the relationship between interceptions and points in NFL games, the average interception is worth about 5 points. This changed the predicted point spread by 0.7 points in Denver’s favor. Seattle’s predicted margin of victory dropped from 1.3 to 0.6, which increased Denver’s win probability from 46% to 48%.
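The adjustment can be sketched in a few lines of Python. The 0.14 interceptions, the 5 points per interception and the 1.3 point spread come from the text above. The article does not say how a spread becomes a win probability, so the normal distribution and its 13.5 point standard deviation are assumptions on my part. They are chosen because they happen to reproduce the 46% and 48% figures, not because they are The Power Rank's actual method.

```python
from statistics import NormalDist

POINTS_PER_INTERCEPTION = 5.0  # from the article
MARGIN_STD_DEV = 13.5          # ASSUMPTION: spread of NFL results around a prediction


def underdog_win_prob(favorite_margin):
    """Chance the underdog wins, given the favorite's predicted margin.

    ASSUMPTION: the final margin is normally distributed around the prediction.
    """
    return NormalDist(mu=favorite_margin, sigma=MARGIN_STD_DEV).cdf(0.0)


seattle_margin = 1.3  # original prediction, from the article
fewer_picks = 0.14    # interceptions Denver is expected to avoid, from the article
adjusted_margin = seattle_margin - fewer_picks * POINTS_PER_INTERCEPTION

print(f"Adjusted spread: Seattle by {adjusted_margin:.1f}")
print(f"Denver win probability before: {underdog_win_prob(seattle_margin):.0%}")
print(f"Denver win probability after:  {underdog_win_prob(adjusted_margin):.0%}")
```

A shift of less than a point moves the win probability by about 2 percentage points, which is why a fraction of an interception still matters.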

The game didn’t go Denver’s way. Seattle’s defenders knew what mouthwash Manning used before the game since they spent the entire game in the backfield.

The outcome of interceptions in the Super Bowl

Manning threw 2 interceptions. The first was an errant pass that landed in the hands of Kam Chancellor, a play in which Manning wasn’t pressured that heavily. The second pick came when a defender hit his arm on a throw. The football wobbled into the hands of Malcolm Smith, who ran for a Seattle touchdown.

For the game, Manning threw 49 passes, so the two picks implied a 4.1% interception rate. Even with this small sample size, that is not an outrageous rate. If not for the bad luck on the pick in which his arm got hit, Manning would have had a 2% rate.

Common sense says that pass rush and throwing accuracy affect interceptions. However, the NFL data only shows a link with one of these factors. If you want to predict interceptions, stay away from pass rush statistics and look at completion percentage.

Do you make this mistake in predicting NFL games?

After last weekend’s Divisional round of playoff games, former Super Bowl winning coach Brian Billick proclaimed rushing the ball still matters in the NFL.

Who could disagree after those 4 games? New England ran for a startling 234 yards against Indianapolis, as LeGarrette Blount ripped off a 73 yard touchdown run to seal the game.

Seattle rushed for 174 yards against New Orleans, as Marshawn Lynch again terrorized the Saints defense in the playoffs.

San Francisco and Denver, the other two winners, also rushed for more than 100 yards while their opponents didn’t.

Rushing hardly matters in the NFL

However, as a smart sports fan, you know better than to draw conclusions after 4 playoff games. The sample size is too small.

Moreover, rush yards per game is a misleading statistic. New England ran the ball 46 times in amassing 234 yards. It’s better to consider yards per carry in judging how well a team runs the ball or stops the run.

In a previous article, I looked at the pass and rush efficiencies for 10 years of NFL playoff teams. To capture team strength in these two areas, efficiency is defined as yards gained per attempt on offense minus yards allowed per attempt on defense. The visual shows regular season numbers for NFL playoff teams from 2003 through 2012.
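As a concrete version of that definition, here is a minimal Python sketch. The season totals are hypothetical, and the function names are made up for this example.

```python
def yards_per_attempt(yards, attempts):
    """Average yards gained (or allowed) per play of one type."""
    return yards / attempts


def efficiency(off_yards, off_attempts, def_yards, def_attempts):
    """Yards per attempt gained on offense minus yards per attempt allowed on defense."""
    gained = yards_per_attempt(off_yards, off_attempts)
    allowed = yards_per_attempt(def_yards, def_attempts)
    return gained - allowed


# Hypothetical season rushing totals for one team (not real data).
rush_eff = efficiency(off_yards=2000, off_attempts=450,
                      def_yards=1700, def_attempts=420)
print(f"Rush efficiency: {rush_eff:+.2f} yards per carry")
```

A positive value means the team gains more per play than it allows. The same function works for passing once you swap in passing yards and attempts.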

[Figure: regular season pass and rush efficiency for NFL playoff teams, 2003 through 2012.]

For rush efficiency, the visual shows NFL playoff teams as a random selection of positive and negative values. Unlike college football, rush efficiency has almost no correlation with winning in the NFL.

Moreover, teams with high rush efficiency do not play better in the playoffs once they get there. Almost half of the teams that played in the Super Bowl gave up more yards per carry than they gained.

Passing is a different story. Most NFL playoff teams had a positive pass efficiency, and 8 of 10 Super Bowl champions had some of the top values in the NFL.

Did these trends hold up in the 2013 season?

Pass and rush efficiency in 2013

Here are the same rush efficiencies for the 2013 season. Playoff teams are highlighted by links that take you to their team page at The Power Rank.

1. Philadelphia, (10-6), 1.37
2. New York Jets, (8-8), 1.02
3. Minnesota, (5-10-1), 0.93
4. Washington, (3-13), 0.78
5. Oakland, (4-12), 0.67
6. San Francisco, (12-4), 0.49
7. Seattle, (13-3), 0.45
8. St. Louis, (7-9), 0.37
9. Carolina, (12-4), 0.24
10. Denver, (13-3), 0.19
11. Kansas City, (11-5), 0.16
12. Cleveland, (4-12), 0.12
13. Tennessee, (7-9), 0.07
14. Green Bay, (8-7-1), 0.02
15. Miami, (8-8), -0.00
16. Arizona, (10-6), -0.00
17. New England, (12-4), -0.07
18. Houston, (2-14), -0.10
19. Buffalo, (6-10), -0.15
20. Tampa Bay, (4-12), -0.18
21. Detroit, (7-9), -0.21
22. Indianapolis, (11-5), -0.21
23. Dallas, (8-8), -0.23
24. New York Giants, (7-9), -0.34
25. Cincinnati, (11-5), -0.36
26. San Diego, (9-7), -0.54
27. Baltimore, (8-8), -0.69
28. Pittsburgh, (8-8), -0.76
29. Chicago, (8-8), -0.82
30. Jacksonville, (4-12), -0.82
31. New Orleans, (11-5), -0.85
32. Atlanta, (4-12), -0.89

5 of 12 of the playoff teams, led by New Orleans at 31st, appear in the bottom half of these rankings. The top 5 includes Minnesota, Washington and Oakland, teams that gave their fans major indigestion this season.

Here are the numbers for pass efficiency.

1. Seattle, (13-3), 2.13
2. Cincinnati, (11-5), 1.62
3. Denver, (13-3), 1.61
4. New Orleans, (11-5), 1.56
5. Arizona, (10-6), 0.96
6. Philadelphia, (10-6), 0.86
7. San Francisco, (12-4), 0.86
8. Pittsburgh, (8-8), 0.51
9. San Diego, (9-7), 0.44
10. Carolina, (12-4), 0.39
11. Detroit, (7-9), 0.29
12. New York Giants, (7-9), 0.28
13. Green Bay, (8-7-1), 0.15
14. Buffalo, (6-10), 0.15
15. New England, (12-4), 0.12
16. Chicago, (8-8), 0.10
17. Cleveland, (4-12), 0.05
18. Tennessee, (7-9), -0.08
19. Indianapolis, (11-5), -0.35
20. Houston, (2-14), -0.40
21. Kansas City, (11-5), -0.51
22. Miami, (8-8), -0.56
23. Dallas, (8-8), -0.62
24. New York Jets, (8-8), -0.73
25. Baltimore, (8-8), -0.84
26. Minnesota, (5-10-1), -0.86
27. St. Louis, (7-9), -0.98
28. Oakland, (4-12), -1.04
29. Atlanta, (4-12), -1.08
30. Washington, (3-13), -1.35
31. Jacksonville, (4-12), -1.45
32. Tampa Bay, (4-12), -1.51

Of the 12 playoff teams, only Kansas City and Indianapolis do not rank in the top half for pass efficiency. All but two of the top 10, Arizona and Pittsburgh, made the playoffs.
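The tallies for both lists can be checked with a short Python sketch. The ranks are copied from the two lists above. The set of 12 playoff teams is the 2013 season's actual playoff field, which I typed in by hand since the links that marked them in the original post do not appear in the lists.

```python
# (rush efficiency rank, pass efficiency rank) for the 12 playoff teams
# of the 2013 season, copied from the two lists above.
PLAYOFF_RANKS = {
    "Seattle": (7, 1),
    "Cincinnati": (25, 2),
    "Denver": (10, 3),
    "New Orleans": (31, 4),
    "Philadelphia": (1, 6),
    "San Francisco": (6, 7),
    "San Diego": (26, 9),
    "Carolina": (9, 10),
    "Green Bay": (14, 13),
    "New England": (17, 15),
    "Indianapolis": (22, 19),
    "Kansas City": (11, 21),
}
TOP_HALF = 16  # 32 teams in the league

rush_top = [team for team, (rush, _) in PLAYOFF_RANKS.items() if rush <= TOP_HALF]
pass_top = [team for team, (_, pas) in PLAYOFF_RANKS.items() if pas <= TOP_HALF]

print(f"Playoff teams in the top half for rush efficiency: {len(rush_top)} of 12")
print(f"Playoff teams in the top half for pass efficiency: {len(pass_top)} of 12")
```

Seven playoff teams make the top half in rushing against ten in passing, which matches the counts in the text.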

Passing dominates the NFL. Rushing hardly matters.

How will this impact this weekend’s championship games?

What would Belichick do?

New England travels to Denver as a 6 point underdog in the AFC championship game. Vegas doesn’t believe in this “Peyton Manning will choke in the playoffs” stuff.

Does New England have a chance? I think so. Let me explain.

In their regular season meeting in New England, Bill Belichick played a defense geared towards stopping Denver’s aerial attack. Even when down 24 points at one point in the game, Belichick only had 6 defenders in the box to defend the run. Instead, New England played 5 defensive backs, two further than 10 yards from the line of scrimmage at the beginning of the play.

Against this defense, Denver ran the ball 48 times for an absurd 5.83 yards per carry (NFL teams have averaged 4.1 yards over the last 10 years). Manning only threw for 3.47 yards per attempt on 38 pass plays.

It’s difficult to say that Belichick’s defensive plan won the game for New England. There were so many turnovers and fluky plays on both sides that finally allowed New England to win in overtime. But New England’s defense did shut down Denver’s passing attack.

Watching the game, I thought New England’s secondary played an amazing game, especially CB Aqib Talib on WR Demaryius Thomas. It’s unlikely they’ll play that well again.

I give Denver a 67% chance to win this game.

The return of Carlos Rogers

Before the 2011 season, San Francisco had a weak pass defense. They fixed this by signing CB Carlos Rogers as a free agent and drafting pass rushing LB Aldon Smith. In the last 3 years, San Francisco has been a top 10 pass defense by yards allowed per attempt.

However, Rogers missed the last two playoff games with a hamstring injury. San Francisco’s defense held up in the cold of Green Bay, holding QB Aaron Rodgers to 5.23 yards per attempt, less than the league average of 6.1. However, Carolina QB Cam Newton threw for 7.73 yards per attempt last weekend.

If Rogers can be effective coming off an injury, San Francisco has a much better chance to contain Seattle’s offense. However, the numbers suggest a win for Seattle, probably by 5 points.

How to predict NFL games

When running backs such as LeGarrette Blount and Marshawn Lynch run for so many yards, it’s easy to get fooled into thinking rushing matters in winning playoff games.

But the numbers simply do not support that claim. To think about this another way, New England rushed for 5.09 yards per carry, which includes Blount’s 73 yard touchdown run. NFL teams average 6.1 yards per pass attempt, including lost yards on sacks.

This doesn’t mean that rushing has no place in the NFL. Deception is a key factor in sports, and I believe run plays can set up play action fakes that result in long pass completions.

But do yourself a favor. Don’t look at rushing numbers like yards per carry when making NFL predictions. Instead, focus on passing numbers. For more, click here.

Check out the Kaggle competition for March Madness

Do you love sports and numbers? Want to do some number crunching of your own?

I’ve been helping Kaggle, a company that makes data science into a sport, put on a competition to predict the most random of all sports: March Madness.

It’s a wealth of awesome data. It includes regular season and tournament results for every season since 1995-1996. Ever want to apply your ranking method to see whether the 1998-1999 Duke team was the best of all time? Well, here’s a clean data set.

The ultimate goal is to predict the results of this year’s NCAA men’s basketball tournament.

To check out the competition, click here.

Want a good laugh?

Amy Nelson of SB Nation made a video about my March Madness analytics in 2011. Check out the first minute of the video.

It’s pointless to attempt to predict March Madness, right? Wow, sports analytics has an uphill battle in educating fans.