Cluster luck numbers for the 2014 MLB regular season

To explain cluster luck, my Grantland colleague Jonah Keri wrote the following.

Joe Peta, a former Wall Street trader, presented cluster luck in his book, Trading Bases. Essentially, the concept boils down to this: When a team’s batters cluster hits together to score more runs and a team’s pitchers spread hits apart to allow fewer runs, that’s cluster luck. Say a team tallies nine singles in one game. If all of those singles occur in the same inning, the team would likely score seven runs; if each single occurs in a different inning, however, it’d likely mean a shutout.

Here are the cluster luck numbers for the 2014 regular season. Each entry gives a team’s total cluster luck, followed by its (offense, defense) components in parentheses. In all cases, a positive number implies good luck: scoring more runs than expected on offense or allowing fewer runs than expected on defense.

1. New York Mets, 57.29. (9.00, 48.29).
2. Seattle, 48.63. (27.67, 20.97).
3. Cincinnati, 41.94. (12.04, 29.90).
4. Baltimore, 40.00. (-13.53, 53.54).
5. Oakland, 39.71. (44.23, -4.51).
6. Kansas City, 31.46. (16.84, 14.61).
7. Texas, 14.93. (7.64, 7.30).
8. San Diego, 13.98. (-4.81, 18.78).
9. Los Angeles Angels, 11.01. (44.40, -33.39).
10. Minnesota, 9.83. (14.90, -5.08).
11. Toronto, 8.99. (-12.04, 21.04).
12. Washington, 5.12. (-9.92, 15.04).
13. Philadelphia, 4.87. (7.44, -2.58).
14. Atlanta, 4.47. (-29.24, 33.71).
15. Miami, 0.88. (-17.45, 18.33).
16. Boston, -0.39. (-13.36, 12.97).
17. St. Louis, -0.59. (-9.22, 8.63).
18. San Francisco, -1.06. (7.16, -8.22).
19. Milwaukee, -1.29. (-9.09, 7.80).
20. Cleveland, -6.12. (-17.96, 11.84).
21. Detroit, -7.78. (-14.46, 6.68).
22. New York Yankees, -10.36. (-5.51, -4.86).
23. Los Angeles Dodgers, -14.87. (-19.74, 4.87).
24. Arizona, -15.48. (-5.73, -9.75).
25. Colorado, -28.07. (-29.56, 1.49).
26. Chicago White Sox, -28.85. (-10.81, -18.04).
27. Pittsburgh, -40.53. (-42.80, 2.26).
28. Houston, -44.07. (-19.16, -24.90).
29. Tampa Bay, -48.93. (-30.20, -18.73).
30. Chicago Cubs, -67.81. (-20.64, -47.17).

Cluster luck is the deviation of actual runs from Base Runs, the runs created formula of Dave Smyth. The difference in runs scored and runs allowed by Base Runs provides a way to rank teams. The results below give run differential (runs scored, runs allowed). The record denotes a Pythagorean expectation with an exponent of 1.83.

1. Los Angeles Angels, 131.99. (728.60, 596.61). Record: 95-67.
2. Washington, 125.88. (695.92, 570.04). Record: 95-67.
3. Oakland, 117.29. (684.77, 567.49). Record: 94-68.
4. Los Angeles Dodgers, 115.87. (737.74, 621.87). Record: 93-69.
5. Pittsburgh, 91.53. (724.80, 633.26). Record: 90-71.
6. Baltimore, 72.00. (718.53, 646.54). Record: 88-74.
7. Detroit, 59.78. (771.46, 711.68). Record: 86-76.
8. San Francisco, 52.06. (657.84, 605.78). Record: 86-75.
9. Tampa Bay, 35.93. (642.20, 606.27). Record: 85-77.
10. Seattle, 31.37. (606.33, 574.97). Record: 84-78.
11. Toronto, 28.01. (735.04, 707.04). Record: 83-79.
12. Cleveland, 22.12. (686.96, 664.84). Record: 82-79.
13. St. Louis, 16.59. (628.22, 611.63). Record: 82-80.
14. Kansas City, -4.46. (634.16, 638.61). Record: 79-82.
15. Milwaukee, -5.71. (659.09, 664.80). Record: 80-82.
16. New York Yankees, -20.64. (638.51, 659.14). Record: 78-84.
17. Chicago Cubs, -25.19. (634.64, 659.83). Record: 77-84.
18. Atlanta, -28.47. (602.24, 630.71). Record: 77-85.
19. Miami, -29.88. (662.45, 692.33). Record: 77-85.
20. Colorado, -34.93. (784.56, 819.49). Record: 77-84.
21. New York Mets, -46.29. (620.00, 666.29). Record: 75-87.
22. Houston, -49.93. (648.16, 698.10). Record: 75-87.
23. San Diego, -55.98. (539.81, 595.78). Record: 73-89.
24. Cincinnati, -58.94. (582.96, 641.90). Record: 73-89.
25. Chicago White Sox, -69.15. (670.81, 739.96). Record: 73-89.
26. Minnesota, -71.83. (700.10, 771.92). Record: 73-89.
27. Philadelphia, -72.87. (611.56, 684.42). Record: 72-90.
28. Boston, -80.61. (647.36, 727.97). Record: 72-90.
29. Arizona, -111.52. (620.73, 732.25). Record: 68-94.
30. Texas, -150.93. (629.36, 780.30). Record: 65-97.
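The arithmetic behind the two lists above can be sketched roughly as follows. Cluster luck is actual runs minus Base Runs on offense and Base Runs minus actual runs on defense, so that a positive number is always good luck; the record comes from a Pythagorean expectation with an exponent of 1.83. The function names and the 162 game schedule are assumptions made for illustration. The Mets' actual run totals are not stated in the post; they are backed out from the two lists (620.00 + 9.00 scored, 666.29 - 48.29 allowed). The rounding used for the published records is not stated either, so the win total computed here may differ by a game.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=1.83):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def cluster_luck(actual_scored, actual_allowed, base_scored, base_allowed):
    """Return (total, offense, defense) cluster luck; positive is good luck."""
    offense = actual_scored - base_scored    # scored more than Base Runs expects
    defense = base_allowed - actual_allowed  # allowed fewer than Base Runs expects
    return offense + defense, offense, defense


# Los Angeles Angels, using the Base Runs totals from the list above.
angels_pct = pythagorean_win_pct(728.60, 596.61)
angels_wins = angels_pct * 162  # 162 games is an assumption about the schedule

# New York Mets: actual runs (629 scored, 618 allowed) are inferred from the lists.
mets_total, mets_offense, mets_defense = cluster_luck(629, 618, 620.00, 666.29)

print(f"Angels expected win pct: {angels_pct:.3f}, wins: {angels_wins:.1f}")
print(f"Mets cluster luck: {mets_total:.2f} ({mets_offense:.2f}, {mets_defense:.2f})")
```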

For my analysis of how cluster luck will affect certain teams in the playoffs, check out my article on

Check out this must read football analytics article

What is the most efficient play in football? What is the analogue of basketball’s corner 3 point shot?

This question bothered Robert Mays of Grantland, so he enlisted the help of the quants at ESPN. They found ample evidence that the play action pass is the most efficient play.

To show this, they looked at expected points. Given a down, distance to a first down and field position, expected points is the average net points of the next score.

From 1st and 10 from their own 20, the offense might score a touchdown for +7 points. The offense might also punt, which leads to an opponent field goal and -3 points. Expected points averages these outcomes to assign each situation a point value.

Expected points added (EPA) is the change in expected points on a given play. This statistic acknowledges that 2 yards on 3rd and 1 is worth more than 2 yards on 1st and 10.
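Here is a minimal sketch of these two definitions. Every number in it is made up for illustration: the sample of next scores and the expected points table are hypothetical, not the ESPN values. It only shows how expected points averages the next score and how EPA credits the same 2 yard gain differently by situation.

```python
def expected_points(next_scores):
    """Average net points of the next score over a sample of similar situations."""
    return sum(next_scores) / len(next_scores)


def epa(ep_before, ep_after):
    """Expected points added: change in expected points on one play."""
    return ep_after - ep_before


# Hypothetical next-score outcomes after 1st and 10 from the offense's own 20:
# +7 offense touchdown, +3 offense field goal, -3 opponent field goal, and so on.
sample_next_scores = [7, -3, 3, -7, 7, 0, 3, -3]
ep_first_and_10 = expected_points(sample_next_scores)

# Hypothetical expected points table keyed by (down, distance, own yard line).
EP_TABLE = {
    (1, 10, 20): 0.3,
    (2, 8, 22): 0.2,
    (3, 1, 20): -0.2,
    (1, 10, 22): 0.5,
}

# A 2 yard gain on 1st and 10 leaves 2nd and 8.
epa_first_down = epa(EP_TABLE[(1, 10, 20)], EP_TABLE[(2, 8, 22)])
# The same 2 yard gain on 3rd and 1 earns a new set of downs.
epa_third_down = epa(EP_TABLE[(3, 1, 20)], EP_TABLE[(1, 10, 22)])

print(ep_first_and_10, epa_first_down, epa_third_down)
```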

Mays and the ESPN quants found that the play action pass earned the highest EPA of all plays. And it wasn’t even close. Running plays lost expected points on average (-0.04 EPA), while passes averaged +0.04 EPA. The play action pass gained +0.17 on average, roughly 4 times the typical pass.

Deception matters in football. On a play action pass, the offense fakes a run, which freezes the linebackers. This frees up space down the field for a big pass play.

In college football, I’ve also found that offenses that run the ball well like Auburn in 2013 can throw effectively on 1st and 10. The defense presumably expects a run, which makes it easier to throw for a big gain.

Moreover, the data suggests that an NFL team doesn’t need a good run game to be effective with play action. For example, Minnesota had a strong rush attack with Adrian Peterson. However, the Vikings were only 21st in play action EPA over the last 4 years.

Play action passes are pass plays, and a team needs a good quarterback to make those throws. The top teams in play action efficiency have quarterbacks like Aaron Rodgers and Peyton Manning.

To check out the article by Mays on football’s corner 3, click here.

The top 5 killer articles on football analytics

Do you want to get up to speed on football analytics? I’ve compiled 5 of my favorite articles in this free report.

To download this pdf, just sign up for my free email newsletter. (I promise, no spam. Just good content from yours truly.) Enter your email and click on “Sign up now.”

How to use baseball analytics for a profitable sports investment

Do you bet on baseball? Are you looking for an extra edge based on data and analytics?

Onside Sports has a new solution. While they launched as a social sports app last year, they have now developed True Odds, a data driven prediction system for baseball. True Odds, an in-app purchase, has a 298-271-10 record this season through September 9th.

I had the opportunity to talk with Kai Yu, the brains behind True Odds. While he obviously could not tell me everything about his methods, he did share quite a bit, which I’ll pass along in this post.

If you’re eager to get a free trial of their picks, click here and use the code THEPOWERRANK.

Baseball from its fundamental interaction

True Odds starts with the matchup between pitcher and batter. Based on historical data, it seeks to estimate probability of an event such as Miguel Cabrera’s hitting a home run off James Shields.

As part of this analysis, Kai had to carefully sort out which variables predict the future and which variables tend towards randomness. He noted contact rate as an important skill for a hitter. It’s tough to strike out Victor Martinez no matter who pitches to him.

This bottom up approach has advantages over the top down approach that looks at overall team performance. Often, the top down approach looks at a team’s runs scored and allowed. However, these numbers can be greatly affected by the sequencing of hits, or cluster luck. Combining pitcher batter matchups with the simulation method below does not have these problems.

Random simulations

Based on the probabilities from every pitcher batter matchup, True Odds uses a random simulation to play the game many times. Each simulation is different, and a set of simulations gives the probability that certain events happen, such as a Detroit win over San Francisco or a total of more than 7 runs for Oakland and Seattle.

To accurately simulate a game, True Odds must know both the pitcher and the opposing lineup. This method naturally accounts for injuries.

Other quants have also used pitcher batter matchups and random simulations to profit on baseball. For example, check out this excellent Q&A with David Frohardt-Lane on Regressing, Deadspin’s sports data blog.
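To make the idea of random simulation concrete, here is a toy sketch. It is not the True Odds method, whose details were not shared. It assumes a single set of made-up plate appearance probabilities per lineup instead of one per pitcher batter matchup, and a crude baserunning rule in which every runner advances as many bases as the batter. All names and numbers are hypothetical.

```python
import random

EVENTS = ["out", "walk", "single", "double", "triple", "home_run"]
BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def simulate_half_inning(probs, rng):
    """Play plate appearances until three outs; return runs scored."""
    outs, runs = 0, 0
    bases = [False, False, False]  # first, second, third
    while outs < 3:
        event = rng.choices(EVENTS, weights=probs)[0]
        if event == "out":
            outs += 1
        elif event == "walk":
            # Only forced runners advance on a walk.
            if all(bases):
                runs += 1
            elif bases[0] and bases[1]:
                bases[2] = True
            elif bases[0]:
                bases[1] = True
            bases[0] = True
        else:
            gain = BASES_GAINED[event]
            new_bases = [False, False, False]
            for i, occupied in enumerate(bases):
                if occupied:
                    if i + gain >= 3:
                        runs += 1  # runner crosses home
                    else:
                        new_bases[i + gain] = True
            if gain == 4:
                runs += 1  # batter scores on a home run
            else:
                new_bases[gain - 1] = True
            bases = new_bases
    return runs


def simulate_game(home_probs, away_probs, rng):
    """Play nine innings, plus extra innings while tied; return (home, away) runs."""
    home = away = inning = 0
    while inning < 9 or home == away:
        away += simulate_half_inning(away_probs, rng)
        home += simulate_half_inning(home_probs, rng)
        inning += 1
    return home, away


def home_win_probability(home_probs, away_probs, n_games=2000, seed=0):
    """Fraction of simulated games won by the home team."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_games):
        home, away = simulate_game(home_probs, away_probs, rng)
        wins += home > away
    return wins / n_games


# Hypothetical probabilities in the order of EVENTS; each list sums to 1.
strong_lineup = [0.66, 0.09, 0.15, 0.05, 0.01, 0.04]
weak_lineup = [0.70, 0.07, 0.15, 0.045, 0.005, 0.03]

home_prob = home_win_probability(strong_lineup, weak_lineup)
print(f"Home win probability: {home_prob:.3f}")
```

The same loop could count how often the combined score exceeds 7 runs to price a total. A realistic version would swap the fixed probabilities for estimates that change with each batter, pitcher, park and umpire.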

A multitude of other factors

Kai also stressed the importance of other factors, such as park, weather and umpires. True Odds incorporates these factors in predicting the outcomes of games.

Let’s discuss umpires, who can impact home field advantage. As Jon Wertheim and Tobias Moskowitz discussed in their book Scorecasting, umpires tend to call more strikes on road batters than on home batters. This tendency increases in high leverage situations, such as two outs with the bases loaded in a close game in the bottom of the ninth.

However, umpires might not play as big a role in home advantage anymore. Through September 8th, home teams have scored a mere 49 more runs than road teams. This 0.02 runs per game is much lower than the historical average.

Major League Baseball might be keeping a more watchful eye on umpires with cameras. I bet True Odds has a grasp on this.

Does FIP apply to every pitcher?

The most interesting part of my conversation with Kai concerned whether fielding independent pitching applied to every pitcher.

To recap, fielding independent pitching comes from the research of Voros McCracken, who discovered that pitchers do not affect batting average on balls in play (BABIP). Pitchers have control over their strike outs, walks and home runs allowed. However, 30% of balls hit in play become hits, and deviations from this average for a pitcher strongly regress to the mean.

This research led to the development of FIP, a runs allowed statistic that only considers strike outs, walks and home runs. It should replace ERA in any discussion of pitching performance.
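For reference, the commonly cited form of FIP is (13 x HR + 3 x (BB + HBP) - 2 x K) / IP plus a league constant. The sketch below follows that form. Hit batters are included as in the standard formula, even though the summary above mentions only walks. The default constant of 3.10 is only a placeholder assumption, since the real constant is set each season so that league FIP matches league ERA. The stat line is hypothetical.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings_pitched, constant=3.10):
    """Fielding independent pitching from home runs, walks, hit batters and strikeouts.

    `constant` is a placeholder; the real value is recalculated for each season.
    """
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings_pitched + constant


# Hypothetical season: 20 HR, 50 BB, 5 HBP, 200 K over 200 innings.
example_fip = fip(20, 50, 5, 200, 200.0)
print(f"FIP: {example_fip:.2f}")
```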

However, Kai suggested FIP doesn’t apply to all pitchers. He cited Seattle’s Chris Young as a pitcher who consistently has a lower ERA than FIP. This reminds me of an excellent analysis of Mark Buehrle and how his defense makes him a better pitcher than FIP suggests.

Try out True Odds for free

Onside Sports has done a remarkable job using data to find value in the baseball market. Their predictions have registered 298 wins, 271 losses and 10 pushes for a return on investment of 11% through September 9.

As a reader of The Power Rank, you can try out True Odds for free. Follow the steps under this video and use the code THEPOWERRANK. With only 4 weeks left before the baseball postseason, check it out today.

How to rank NFL teams in the preseason

You want to know the strength of your NFL team. You’ll take any analytics that can sort through the preseason noise of the NFL.

In college football, team strength tends to persist from year to year. This makes it possible to use previous seasons to predict the current season.

However, looking at past years does not work as well in the NFL since team performance regresses to the mean. The salary cap levels the playing field for all 32 teams. Injuries and luck can derail teams with the highest expectations, such as Atlanta in 2013.

However, we can use a different trick from college sports to rank NFL teams in the preseason. Let me explain.

Wisdom of many sports writers

Preseason polls in college sports are remarkable predictors of success.

I first learned about this counter intuitive result from Nate Silver, who uses the preseason AP college basketball poll in his NCAA tournament predictions.

The same accuracy holds for college football polls. In the preseason AP poll, the higher ranked team has won 59.5% of bowl games that postseason, a result based on 300 bowl games since the 2005 season. The preseason Coaches poll has predicted an even more remarkable 61.2% of bowl game winners in the same time span.

The combined wisdom of sports writers or coaches leads to remarkable rankings. However, the accuracy of these polls decreases once the season starts. The writers or coaches tend to react too strongly to wins and losses. By the end of the season, the higher ranked team in the AP poll wins 56% of bowl games.

However, the AP poll is a remarkable tool before the season starts. Let’s create the same type of poll for the NFL.

Ensemble NFL preseason rankings

Every major sports media site publishes preseason power rankings. I looked at 20 from before the 2013 season.

A team’s rank isn’t enough to make game predictions. We also need a team’s rating, which gives an expected margin of victory over an average NFL team.

To do this, I took a team’s rank and assigned it a rating based on historical results from The Power Rank. For example, the top ranked team had a 9.7 rating from 2003 through 2012.

Now for each subjective power ranking, a team gets both a rank and a rating, just like the rankings here on The Power Rank. To get ensemble preseason rankings, a team’s rating is averaged over the 20 subjective power rankings. Here are the results for 2013 along with the team’s final record after the playoffs.

1. San Francisco, (14-5), 8.43.
2. Seattle, (16-3), 7.97.
3. Denver, (15-4), 7.35.
4. Atlanta, (4-12), 5.65.
5. Green Bay, (8-8-1), 5.29.
6. Baltimore, (8-8), 4.92.
7. New England, (13-5), 4.89.
8. Houston, (2-14), 4.38.
9. Cincinnati, (11-6), 3.04.
10. Washington, (3-13), 2.25.
11. New York Giants, (7-9), 1.58.
12. New Orleans, (12-6), 1.34.
13. Chicago, (8-8), 1.33.
14. Indianapolis, (12-6), 1.05.
15. Pittsburgh, (8-8), 0.18.
16. Dallas, (8-8), 0.15.
17. Minnesota, (5-10-1), 0.02.
18. St. Louis, (7-9), -0.79.
19. Carolina, (12-5), -1.35.
20. Miami, (8-8), -1.56.
21. Tampa Bay, (4-12), -1.95.
22. Detroit, (7-9), -1.99.
23. Kansas City, (11-6), -2.52.
24. Philadelphia, (10-7), -3.65.
25. Cleveland, (4-12), -4.11.
26. San Diego, (10-8), -4.20.
27. Tennessee, (7-9), -4.42.
28. Arizona, (10-6), -4.46.
29. Buffalo, (6-10), -5.11.
30. New York Jets, (8-8), -6.80.
31. Jacksonville, (4-12), -8.17.
32. Oakland, (4-12), -8.74.

Clearly, the preseason ensemble rankings thought too highly of Atlanta and Houston, two teams that combined for 6 wins in 2013. On the other end, the ensemble rankings missed low on San Diego and Arizona.

However, the rankings had the final four teams in the playoffs (Seattle, Denver, San Francisco and New England) in the top 10. In addition, they did predict 62.5% of game winners over the 2013 regular season and playoffs. The Vegas line gets 66% of games correct on average.
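The ensemble procedure described above (convert each rank to a rating, then average the ratings across all the power rankings) can be sketched as follows. Only the 9.7 rating for the top rank comes from the text. The rest of the rank-to-rating table, the four team league and the three outlets are hypothetical stand-ins for the 32 teams and 20 rankings.

```python
from statistics import mean

# Rating assigned to each rank. The 9.7 for rank 1 is from the text;
# the other values are invented for this toy four team league.
RANK_TO_RATING = {1: 9.7, 2: 3.0, 3: -3.0, 4: -9.7}

# Each inner list is one outlet's hypothetical power ranking, best team first.
power_rankings = [
    ["Team A", "Team B", "Team C", "Team D"],
    ["Team B", "Team A", "Team C", "Team D"],
    ["Team A", "Team C", "Team B", "Team D"],
]


def ensemble_ratings(rankings, rank_to_rating):
    """Average each team's rating over all rankings; return sorted (team, rating) pairs."""
    ratings = {}
    for ordering in rankings:
        for position, team in enumerate(ordering, start=1):
            ratings.setdefault(team, []).append(rank_to_rating[position])
    averaged = {team: mean(values) for team, values in ratings.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


ensemble = ensemble_ratings(power_rankings, RANK_TO_RATING)
for rank, (team, rating) in enumerate(ensemble, start=1):
    print(f"{rank}. {team}, {rating:.2f}.")
```

Averaging ratings instead of ranks keeps the spacing between teams, which is what turns an ordering into a predicted margin of victory.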

Preseason ensemble rankings for 2014

Here are results for the 2014 season.

1. Seattle, 9.61.
2. Denver, 7.96.
3. New Orleans, 6.58.
4. New England, 6.40.
5. San Francisco, 6.13.
6. Green Bay, 6.07.
7. Philadelphia, 3.72.
8. Indianapolis, 3.42.
9. Cincinnati, 3.10.
10. Chicago, 2.03.
11. San Diego, 1.85.
12. Arizona, 1.20.
13. Baltimore, 1.01.
14. Pittsburgh, 0.78.
15. Carolina, 0.39.
16. Kansas City, 0.30.
17. Detroit, -0.03.
18. Atlanta, -0.99.
19. St. Louis, -1.52.
20. Tampa Bay, -1.63.
21. Miami, -2.13.
22. New York Jets, -2.40.
23. New York Giants, -2.41.
24. Dallas, -3.44.
25. Washington, -3.78.
26. Minnesota, -3.95.
27. Tennessee, -4.60.
28. Houston, -5.56.
29. Jacksonville, -6.02.
30. Cleveland, -6.43.
31. Buffalo, -6.98.
32. Oakland, -8.69.

The rankings in the ensemble differed the most on Houston. The Texans have lots of question marks on offense with new QB Ryan Fitzpatrick. However, they had a decent defense last season and added top draft pick Jadeveon Clowney to a pass rush that already features J.J. Watt. Houston is 28th in the preseason rankings.

The rankings in the ensemble differed the least on Philadelphia. Judging from all the “Chip Kelly is a genius” articles out there, everyone thinks the Eagles are a solid top 10 team. Philadelphia is 7th in the preseason rankings.

However, the Eagles have issues to worry about. QB Nick Foles can’t possibly throw interceptions at a lower rate than he did last season. Moreover, the pass defense finished 23rd last season in my yards per pass attempt adjusted for schedule stat.

How well do the predictions compare with the line?

We can check how closely the predictions from these rankings compare with the line. Here are the predictions for week 1. The games are ranked by the strength of the two teams and expected closeness of the outcome.

1. Green Bay at Seattle. (0.61)
Seattle (1) will beat Green Bay (6) by 6.1 at home. Green Bay has a 34% chance of beating Seattle.

2. Indianapolis at Denver. (0.53)
Denver (2) will beat Indianapolis (8) by 7.1 at home. Indianapolis has a 31% chance of beating Denver.

3. San Diego at Arizona. (0.52)
Arizona (12) will beat San Diego (11) by 1.9 at home. San Diego has a 45% chance of beating Arizona.

4. Cincinnati at Baltimore. (0.50)
Baltimore (13) will beat Cincinnati (9) by 0.5 at home. Cincinnati has a 49% chance of beating Baltimore.

5. Carolina at Tampa Bay. (0.43)
Tampa Bay (20) will beat Carolina (15) by 0.6 at home. Carolina has a 48% chance of beating Tampa Bay.

6. New York Giants at Detroit. (0.40)
Detroit (17) will beat New York Giants (23) by 5.0 at home. New York Giants has a 36% chance of beating Detroit.

7. New Orleans at Atlanta. (0.38)
New Orleans (3) will beat Atlanta (18) by 5.0 on the road. Atlanta has a 36% chance of beating New Orleans.

8. Minnesota at St. Louis. (0.36)
St. Louis (19) will beat Minnesota (26) by 5.0 at home. Minnesota has a 36% chance of beating St. Louis.

9. New England at Miami. (0.34)
New England (4) will beat Miami (21) by 5.9 on the road. Miami has a 34% chance of beating New England.

10. Washington at Houston. (0.32)
Houston (28) will beat Washington (25) by 0.8 at home. Washington has a 48% chance of beating Houston.

11. Tennessee at Kansas City. (0.31)
Kansas City (16) will beat Tennessee (27) by 7.5 at home. Tennessee has a 30% chance of beating Kansas City.

12. San Francisco at Dallas. (0.29)
San Francisco (5) will beat Dallas (24) by 7.0 on the road. Dallas has a 31% chance of beating San Francisco.

13. Cleveland at Pittsburgh. (0.23)
Pittsburgh (14) will beat Cleveland (30) by 9.8 at home. Cleveland has a 25% chance of beating Pittsburgh.

14. Jacksonville at Philadelphia. (0.22)
Philadelphia (7) will beat Jacksonville (29) by 12.3 at home. Jacksonville has a 20% chance of beating Philadelphia.

15. Buffalo at Chicago. (0.20)
Chicago (10) will beat Buffalo (31) by 11.6 at home. Buffalo has a 22% chance of beating Chicago.

16. Oakland at New York Jets. (0.18)
New York Jets (22) will beat Oakland (32) by 8.9 at home. Oakland has a 27% chance of beating New York Jets.

The predictions differ the most from the line in games with really bad teams. For example, the preseason rankings predict a 10 point win for Pittsburgh over 30th ranked Cleveland. The line only favors Pittsburgh by 6.5.

There are similar differences for games with Oakland, Buffalo and Jacksonville. It seems like the markets do not want to put down the worst teams in the NFL.
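For a rough sense of how ratings turn into the week 1 numbers above, here is a sketch under stated assumptions. The predicted margin is the difference in ratings plus a home field value; the 2.56 points used here is not stated anywhere in the post but is inferred from the listed margins. The win probability assumes the final margin is normally distributed around the prediction with a standard deviation of 14.5 points, a value picked because it roughly reproduces the listed percentages. Neither assumption is confirmed as the actual method.

```python
from statistics import NormalDist

HOME_ADVANTAGE = 2.56  # assumption: inferred from the listed margins, not stated
MARGIN_STD = 14.5      # assumption: chosen to roughly match the listed percentages


def predict(home_rating, away_rating):
    """Return (predicted home margin, home win probability)."""
    margin = home_rating - away_rating + HOME_ADVANTAGE
    home_win_prob = NormalDist(0, MARGIN_STD).cdf(margin)
    return margin, home_win_prob


# Green Bay (6.07) at Seattle (9.61), ratings from the 2014 ensemble list.
seattle_margin, seattle_prob = predict(9.61, 6.07)
print(f"Seattle by {seattle_margin:.1f}, Green Bay win chance {1 - seattle_prob:.0%}")

# Cleveland (-6.43) at Pittsburgh (0.78), compared with the line of 6.5.
pittsburgh_margin, _ = predict(0.78, -6.43)
difference_from_line = pittsburgh_margin - 6.5
print(f"Pittsburgh by {pittsburgh_margin:.1f}, {difference_from_line:.1f} more than the line")
```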

Members of The Power Rank have access to these predictions for all 256 games of the NFL season. To learn more, click here.

Check out the new NFL team page

Yards per play is a great college football statistic. This efficiency metric is simple to calculate and mostly immune from the randomness of turnovers. Dr. Bob has been using it in his college football handicapping for years.

When I started to apply my methods to the NFL last season, I thought yards per play would play the same role. I was wrong.

Let me explain.

What matters in winning football games?

To determine the significance of a statistic, you can look at how well it correlates with winning. For typical efficiency stats such as yards per play, I take the quantity on offense minus the same quantity on defense. Since stronger defenses allow fewer yards per play, this difference describes overall team strength.

I looked at these correlations in the NFL from 2004 through 2013 regular seasons. Yards per play explains 50.4% of the variance in winning, making it an important statistic. However, yards per pass attempt is even more significant (56.2% of the variance in winning). In contrast, yards per carry contributes almost nothing to winning (5% of variance).
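One common way to get a "percent of variance explained" number is to square the Pearson correlation between the statistic and winning percentage. Here is a minimal sketch, assuming that is the calculation involved. The six team seasons below are invented, so the result says nothing about the real NFL figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical team seasons: yards per pass attempt on offense minus the
# same quantity allowed on defense, paired with winning percentage.
pass_differential = [1.2, 0.8, 0.3, 0.0, -0.4, -0.9]
win_pct = [0.75, 0.62, 0.56, 0.44, 0.50, 0.25]

r = pearson_r(pass_differential, win_pct)
variance_explained = r ** 2
print(f"r = {r:.3f}, variance explained = {variance_explained:.1%}")
```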


For these 3 efficiency metrics in college football, yards per play explains the most variance in winning.

Even though NFL teams rush on about 40% of plays, these plays are noise when it comes to winning. Passing dominates the NFL. Hence, I use yards per pass attempt as the primary statistic for offense and defense in the NFL. Check out this sample team page for Philadelphia.

All 32 interactive team pages will be available for members next week. To learn about becoming a member, click here.

By the way, those are actual preseason rankings for NFL teams, not the lame end of last season crap I usually have at this point of the NFL season. Look for more details about these rankings next week.

To make sure you hear about this content, sign up for my free email newsletter. Just enter your best email and click on “Sign up now.”