On April 23rd, 2015, I started posting a win probability for every MLB game. These predictions had two components:
- team rankings that take run differential and adjust for strength of schedule
- ensemble preseason rankings
From April 23rd to May 28th, 2015, the team with the higher win probability won 252 of 487 games for a win percentage of 51.7%. The predictions did well for a while, but they got weaker as the season progressed.
On May 29th, I changed the methodology behind the baseball predictions. Instead of actual run differential as an input to my ranking algorithm, I used an expected run differential according to the Base Runs formula.
Among teams that have hit 115 doubles and 58 home runs at this point in the season, some have scored 330 runs while others, like the Detroit Tigers, have scored 293. The higher-scoring teams tend to cluster their hits due to better clutch hitting.
However, my research shows that good and bad cluster luck is not sustainable. Teams with extremes in cluster luck tend to regress to the level of run production given by the Base Runs formula.
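To make the idea concrete, here is a minimal sketch of expected runs and cluster luck. It assumes one commonly cited simple version of the Base Runs formula; the post does not say which version or coefficients feed the rankings, so the variant, the function names, and the batting line below are all illustrative assumptions.

```python
def base_runs(ab, h, tb, hr, bb):
    """Expected runs from one simple Base Runs variant.

    Assumption: this is a widely quoted basic form, not necessarily
    the version behind the rankings in this post.
    """
    a = h + bb - hr                                       # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02   # advancement factor
    c = ab - h                                            # outs
    d = hr                                                # runs that always score
    return a * b / (b + c) + d


def cluster_luck(actual_runs, expected_runs):
    """Positive means a team scored more than its components suggest."""
    return actual_runs - expected_runs


# Hypothetical full-season batting line, made up for illustration.
expected = base_runs(ab=5500, h=1400, tb=2200, hr=150, bb=500)
print(round(expected, 1), round(cluster_luck(720, expected), 1))
```

Under this sketch, a team with a large positive gap between actual and expected runs would be a candidate to regress toward its Base Runs total.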
With these new team rankings based on expected runs, the predictions have performed much better. From May 29th through June 21st, the team with the higher win probability has won 187 of 335 games for a win percentage of 55.8%.
For comparison, the team favored in the betting markets has won 181 of 335 games for a win percentage of 54.0%. When the markets offer the same odds on both teams, I credit them with half a win for that game.
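The record-keeping above can be sketched in a few lines. The helper names and the (home win probability, home team won) game format are hypothetical, chosen only to illustrate the half-win rule for even odds and to check the percentages quoted in this post.

```python
def favorite_wins(games):
    """Count wins for the favored team.

    games: iterable of (home_win_prob, home_won) pairs (a hypothetical format).
    A game with even odds counts as half a win.
    """
    wins = 0.0
    for home_win_prob, home_won in games:
        if home_win_prob == 0.5:
            wins += 0.5                      # no favorite: split the credit
        elif (home_win_prob > 0.5) == home_won:
            wins += 1.0                      # the favorite won
    return wins


def win_pct(wins, games_played):
    """Win percentage on a 0-100 scale."""
    return 100.0 * wins / games_played


# Records quoted in the post.
print(round(win_pct(252, 487), 1))  # April 23rd to May 28th
print(round(win_pct(187, 335), 1))  # May 29th through June 21st
print(round(win_pct(181, 335), 1))  # betting markets, same span
```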
The markets will catch up soon. They always do. However, this is pretty good accuracy for predictions that do not account for injuries.
Check the predictions page for daily updates to this record.