Do you make these 3 mistakes with college football statistics?

You’re smarter than the average college football fan.

You crave a true understanding of your team and the game. Team rankings do not suffice. You need not only a breakdown of offense and defense but also a further division into passing and rushing.

Numbers can help you in this journey, but only if you’re careful. College football statistics are tricky.

Moreover, the statistics on major media sites are deeply flawed. I never look at them.

Let me explain.

Why pace matters in football

College football provides a diversity of styles. Oregon uses an up tempo offense, which wears down the defense with a high frequency of plays. Copycats have sprouted up throughout the nation.

In contrast, offenses like Alabama and Stanford milk every second from the play clock before snapping the football. These offenses rely on a punishing rushing game.

Due to these differing styles, yards per game is a terrible metric to judge an offense. Up tempo teams like Oregon generate more yards in a game by running more plays.

This pace can also affect the defense. Since Oregon runs so many plays on offense, their defense tends to face more plays. This makes their yards allowed per game look bad.

You need a statistic that adjusts for the pace of play. In basketball, Dean Oliver popularized the idea of points per possession instead of points per game. In football, the easiest efficiency statistic is yards per play.
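To make the pace effect concrete, here is a minimal sketch in Python. The two offenses and their per game totals are invented for illustration and are not real season data; the function name is likewise hypothetical.

```python
def yards_per_play(total_yards, plays):
    """Efficiency: yards gained per play, independent of pace."""
    return total_yards / plays

# Hypothetical per game averages, invented for illustration only.
up_tempo = {"yards": 480.0, "plays": 85}
slow_pace = {"yards": 420.0, "plays": 65}

up_tempo_ypp = yards_per_play(up_tempo["yards"], up_tempo["plays"])
slow_pace_ypp = yards_per_play(slow_pace["yards"], slow_pace["plays"])

print(f"up tempo:  {up_tempo['yards']:.0f} yards per game, {up_tempo_ypp:.2f} yards per play")
print(f"slow pace: {slow_pace['yards']:.0f} yards per game, {slow_pace_ypp:.2f} yards per play")
```

In this made-up case the up tempo offense gains more yards per game, yet the slower offense is more efficient on each play.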

While yards per play works well to measure the strength of an offense or defense, college football statistics get trickier when breaking down the passing and rushing game.

How to correctly evaluate passing and rushing

Sacks count as rushing plays in college football.

It makes no sense. Plays that end in a sack started as a pass play. Those negative yards should count against passing yardage.

The inclusion of sacks as rushes probably originates from teams that run the option offense. The quarterback often rushes the ball by design. This makes it difficult to distinguish between a negative rushing play by the quarterback and a sack.

No matter the reasons for college football’s quirks, sacks should count as negative pass plays to evaluate rushing and passing. To my knowledge, no college football statistics site shows yards per play statistics with these adjustments for sacks.
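The sack adjustment can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, assuming the raw totals follow the college convention (sacks already counted as rushing attempts with negative rushing yards). The function name, argument names and season totals are all hypothetical.

```python
def adjust_for_sacks(rush_att, rush_yds, pass_att, pass_yds, sacks, sack_yds_lost):
    """Move sacks from the rushing totals to the passing totals.

    Returns (yards per carry, yards per pass attempt) after the move.
    sack_yds_lost is a positive number of yards lost on sacks.
    """
    adj_rush_att = rush_att - sacks            # sacks are not designed runs
    adj_rush_yds = rush_yds + sack_yds_lost    # undo the negative sack yards
    adj_pass_att = pass_att + sacks            # a sack started as a pass play
    adj_pass_yds = pass_yds - sack_yds_lost    # charge the loss to passing
    return adj_rush_yds / adj_rush_att, adj_pass_yds / adj_pass_att

# Hypothetical season totals, invented for illustration only.
rush_ypc, pass_ypa = adjust_for_sacks(
    rush_att=500, rush_yds=2000,
    pass_att=400, pass_yds=3000,
    sacks=30, sack_yds_lost=200,
)
print(f"yards per carry: {rush_ypc:.2f}, yards per pass attempt: {pass_ypa:.2f}")
```

With these invented totals, the unadjusted numbers would be 4.00 yards per carry and 7.50 yards per pass attempt. Moving the sacks raises the rushing figure and lowers the passing figure.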

To get the true rushing and passing efficiency, check out these yards per play statistics from The Power Rank. The numbers include both offense and defense.

The significance of strength of schedule

Armed with the best yards per play statistics for passing and rushing, you’re 95% of the way to understanding college football teams. However, to make the last leap, you must consider strength of schedule.

The SEC dominated college football during the latter part of the BCS era. A team like Mississippi State had the yearly misfortune of playing Alabama, LSU and Auburn, three teams that won 6 BCS national titles.

In contrast, the MAC barely survives as a Bowl subdivision conference. While Northern Illinois was a good team towards the end of the BCS era, they have the yearly fortune of facing Eastern, Central and Western Michigan.

Strength of schedule matters. There are many ways to adjust statistics like yards per play for strength of schedule. The Power Rank makes these adjustments through its ranking algorithm.

Members of The Power Rank have access to yards per carry and yards per pass attempt adjusted for strength of schedule. To learn more, enter your email and click “Sign up now.”








How Stanford in 2012 illustrates these common mistakes

To see the drastic effect these mistakes can have, let’s go back to the 2012 season. That year, Alabama pounded Notre Dame in the BCS championship game, while Stanford beat Wisconsin to win its first Rose Bowl in 41 years.

Stanford’s pass defense in 2012 provides an interesting case study for college football statistics. This unit featured a fierce pass rush from outside linebackers Chase Thomas and Trent Murphy. This pressure helped safety Ed Reynolds make 6 interceptions that season.

However, Stanford’s pass defense looked bad in the statistics on other college football sites. They allowed 239.2 yards per game, 72nd in the nation.

These typical statistics do not include negative plays from sacks. With the brilliance of Thomas and Murphy, Stanford sacked the quarterback on 9.1% of pass plays. Including these negative plays, Stanford allowed 214.7 yards per game, 59th in the nation.

One game really skews these pass defense statistics. Arizona QB Matt Scott threw for 474 yards against Stanford. However, he attempted 72 passes in that game. While allowing 474 yards seems bad, Arizona gained 6.58 yards per attempt, a little more than the 6.23 Bowl subdivision average.
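The arithmetic for the Arizona game is easy to check. The short Python snippet below uses only the numbers quoted above; the variable names are just for illustration.

```python
# Numbers quoted in the text for Arizona against Stanford in 2012.
pass_yards = 474
pass_attempts = 72
fbs_average_ypa = 6.23  # Bowl subdivision average quoted in the text

arizona_ypa = pass_yards / pass_attempts
difference = arizona_ypa - fbs_average_ypa

print(f"Arizona: {arizona_ypa:.2f} yards per attempt")
print(f"Above the average by {difference:.2f} yards per attempt")
```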

For the season, Stanford allowed 4.96 yards per pass attempt, good for 10th in the nation.

Adjustments for strength of schedule make Stanford look better, since they faced strong pass offenses in their Pac-12 schedule. They ranked 3rd in The Power Rank for pass defense, predicted to allow 4.47 yards per attempt against an average Bowl subdivision offense.

The typical misleading college football statistics rate Stanford as the 72nd best pass defense. By properly accounting for pace and schedule strength, Stanford rockets up to 3rd and qualifies as an elite defense.

Check out The Power Rank’s yards per play numbers

Don’t get misled by the college football statistics on major media sites. Yards per game does not account for pace, and sacks count as rushes in these numbers.

The Power Rank provides rankings for yards per carry and pass attempt, both on offense and defense. These statistics count sacks as pass attempts. Use these free resources for your raw efficiency numbers.

Members of The Power Rank have access to these numbers adjusted for schedule strength. To learn more, sign up for my free email newsletter.

In addition, I’ll send you a pdf of my report The Football Analytics Resource Guide – The Top 5 Killer Articles. Just enter your best email and click “Sign up now.”








Baseball cluster luck article on FiveThirtyEight

Over the past few years, I’ve been calculating cluster luck in baseball. This is based on the idea that teams can score more runs when they cluster their hits together (or allow fewer runs when pitchers scatter hits).

However, teams can’t consistently cluster hits together. Cluster luck calculations show us which teams will not keep up their torrid early season pace.

Last week, Jonah Keri used my cluster luck numbers on FiveThirtyEight to show how this has happened to San Francisco and closer Sergio Romo. Then he discussed how cluster luck continues to help Seattle but regression could hit soon.

Getting him the updated cluster luck numbers was simple, as I just use widely available season totals. However, creating the above graph was a lot of work, since it required box score numbers on a daily basis.

Still, the work was worth it, as I’m cooking up a way to incorporate cluster luck into my MLB rankings. It should give us a better grasp on Oakland, a team that can’t possibly be more than 1.4 runs better than MLB average.

More on this soon.

To read Jonah Keri’s article on cluster luck based on my numbers, click here.

Television Interview on Ronan Farrow Daily

Ronan Farrow interviewed me and Maurice Edu about the World Cup on his MSNBC show yesterday.

I was pumped to meet both the host and one of the best soccer players in the United States. However, there’s not much contact when they film you from a remote location.

To do the interview, I went to a studio in my home town of Ann Arbor. A nice guy named Tony set everything up.

From a remote location, I could only hear Ronan and Maurice in my earpiece. I couldn’t see them or what appeared on television.

Still, it was fun. We talked World Cup, the United States’ chances against Belgium in the Round of 16 and how numbers affect the psychology of players.

To view the interview, click here. If you’re viewing on Tuesday, July 1, it should be the main video under “Betting on the World Cup.” Otherwise, you might have to scroll through the videos on the right.

Ensemble win probabilities for the World Cup after the group stage

The wisdom of crowds.

In making predictions, it’s best to include the predictions of many different methods. Each method has its strengths and weaknesses, and taking an average gives better results.
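As a minimal sketch of the idea, the Python below takes a simple unweighted average of win probabilities from several methods. The teams, probabilities and function name are invented for illustration, and the plain mean is an assumption; an actual ensemble may combine its methods differently.

```python
def ensemble_average(predictions):
    """Average win probabilities team by team across several methods.

    predictions is a list of dicts, one per method, mapping team -> probability.
    Every method is assumed to cover the same set of teams.
    """
    teams = predictions[0].keys()
    n_methods = len(predictions)
    return {team: sum(p[team] for p in predictions) / n_methods for team in teams}

# Hypothetical tournament win probabilities from three methods.
methods = [
    {"Team A": 0.50, "Team B": 0.30, "Team C": 0.20},
    {"Team A": 0.40, "Team B": 0.35, "Team C": 0.25},
    {"Team A": 0.45, "Team B": 0.25, "Team C": 0.30},
]

ensemble = ensemble_average(methods)
for team, prob in ensemble.items():
    print(f"{team}: {prob:.3f}")
```

If each method's probabilities sum to one, the averaged probabilities also sum to one, so no renormalization is needed.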

The Power Rank began when I developed an algorithm for ranking teams. However, as I learn more about making predictions, my method will take its place along with others in an ensemble of predictors.

Ultimately, I think this will be most useful in predicting the NCAA tournament. But right now, I’m practicing on the World Cup.

My recent article on bettingexpert looks at the ensemble predictions for the World Cup after the group stage. To check it out, click here.

New international football / soccer rankings show recent form of nations

The FIFA rankings suck. Not only do they poorly predict the outcome of matches, but you have to wait a month for updates.

The Power Rank international football / soccer rankings do better. The ranking algorithm considers margin of victory in adjusting for schedule strength in international soccer. As an academic study has shown, using margin of victory is critical in making predictions.

In addition, the international rankings are now updated daily.

This constant updating is interesting during the World Cup. My rankings use a 4 year window of matches and weight matches by their importance.

  • World Cup Finals: 4.
  • World Cup Qualifiers, Confederations Cup, Continental Finals: 3.
  • Continental Qualifiers: 2.
  • Friendlies: 1.
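To see how these weights change the influence of each match, here is a small Python sketch. It is not The Power Rank algorithm, which also adjusts for schedule strength. It only computes a weighted average goal margin, and the dictionary keys, function name and sample matches are hypothetical.

```python
# Importance weights from the list above; the key names are made up for this sketch.
MATCH_WEIGHTS = {
    "world_cup_finals": 4,
    "world_cup_qualifier": 3,
    "confederations_cup": 3,
    "continental_finals": 3,
    "continental_qualifier": 2,
    "friendly": 1,
}

def weighted_mean_margin(matches):
    """Weighted average goal margin; each match is a (match_type, margin) pair."""
    total_weight = sum(MATCH_WEIGHTS[match_type] for match_type, _ in matches)
    weighted_sum = sum(MATCH_WEIGHTS[match_type] * margin for match_type, margin in matches)
    return weighted_sum / total_weight

# Hypothetical results: one big World Cup win and two flat friendlies.
matches = [("world_cup_finals", 4), ("friendly", -1), ("friendly", 0)]

weighted = weighted_mean_margin(matches)
unweighted = sum(margin for _, margin in matches) / len(matches)
print(f"weighted: {weighted:.2f}, unweighted: {unweighted:.2f}")
```

In this invented example the World Cup result dominates the weighted average, which is the same effect that lets strong World Cup showings lift a team in the rankings.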

Since we’re in the middle of a World Cup, the rankings add important matches each day while dropping results from the previous World Cup. This leads to some interesting changes for certain teams.

Spain and the Netherlands

The Netherlands dominated Spain in a 5-1 win last week. This dropped an aging Spain team down to 6th. The FIFA rankings still have Spain as the top team.

The Dutch have risen to 4th. It mystifies me why more people didn’t think this traditional power could win this World Cup.

Germany and Brazil

While most other respectable rankings have Brazil on top, the weighting of matches in The Power Rank vaults Germany ahead of Brazil.

Germany has played well in the last two World Cups. In 2010, they dominated Argentina in a 4-0 rout. Just last week, they beat Portugal, another top 10 team, by the same margin.

With no weighting, Brazil would be the top team in The Power Rank.

United States and Ghana

The Yanks are 18th currently, one spot above the Ghana squad they just beat.

The United States won the game because of two great finishes by Clint Dempsey and John Brooks. However, between these two goals, Ghana dominated possession and scoring opportunities. They were the better team.

Colombia and Chile

These two South American teams are in the top 10. Colombia is ranked higher at 5th, but Chile is not far behind at 7th.

From this World Cup, Colombia looks like the better team. They continue to score goals despite the absence of Radamel Falcao, their leading scorer in qualifying.

Moreover, my aggregated win probabilities before the World Cup gave Colombia an almost 4% chance to win it all. Chile only had a 1.9% chance.

Belgium and France

Belgium has generated much chatter as a dark horse World Cup champion. Young players like Eden Hazard have dazzled on the pitch at this World Cup.

However, their performance over the last 4 years ranks them 13th in The Power Rank. That puts them lower than France (9th), a team no one has talked about as World Cup champion. (Of course, France is missing star winger Franck Ribéry for this World Cup.)

Belgium’s play as a team does not make me believe they will contend for the World Cup title. My aggregated win probabilities before the tourney agree with this assessment. Belgium had the 11th highest win probability at 2.3%.

Rankings of World Cup teams

Here are rankings of the 32 World Cup teams that consider matches from June 20, 2010 through June 19, 2014. The record gives wins, losses and ties over the past 4 years. The rating gives an expected margin of victory against an average international team.

1. Germany, (37-7-11), 2.52
2. Brazil, (40-9-12), 2.28
3. Argentina, (32-8-15), 2.15
4. Netherlands, (33-9-11), 2.09
5. Colombia, (24-8-11), 2.09
6. Spain, (45-8-8), 2.05
7. Chile, (29-17-9), 1.69
8. Uruguay, (28-14-15), 1.69
9. France, (28-11-12), 1.59
10. Portugal, (26-9-13), 1.54
11. Ecuador, (17-14-15), 1.48
12. Mexico, (35-18-17), 1.48
13. Belgium, (22-8-12), 1.44
14. England, (25-8-14), 1.43
15. Ivory Coast, (31-7-9), 1.42
16. Italy, (22-12-21), 1.40
17. Ghana, (30-15-14), 1.29
18. United States, (37-17-12), 1.25
19. Russia, (24-6-13), 1.25
21. Switzerland, (20-7-12), 1.23
23. Croatia, (24-10-11), 1.16
24. Nigeria, (29-11-21), 1.11
27. Japan, (33-12-13), 1.07
28. Bosnia-Herzegovina, (21-14-7), 1.03
30. Costa Rica, (25-23-19), 0.95
32. Greece, (24-8-16), 0.91
34. Australia, (26-16-11), 0.87
35. Iran, (30-8-16), 0.85
38. South Korea, (24-17-12), 0.80
43. Honduras, (22-24-18), 0.75
50. Cameroon, (16-13-12), 0.60
53. Algeria, (19-10-6), 0.56

For all teams, click here.

Predictions

The Power Rank also provides predictions for each match and each stage of the competition, both of which are updated nightly.

These predictions use a different set of rankings that consider a 12 year window of games. Research has shown that these calculations are as accurate in predicting match outcomes as using a 4 year window.