This is a guest post from Nick Ceraso and Julian Frenkel, students at the University of Michigan.
How much should your team pay a free agent in baseball? Will your team strike the jackpot on a young, undervalued player or overpay for an aging star?
MLB free agency is particularly interesting, as baseball is the only one of the four major sports without a salary cap. Baseball’s offseason is an open market, with only a relatively small luxury tax for teams with the biggest payrolls.
Here, we take a data-driven approach to predicting free agent salaries based on WAR, or Wins Above Replacement. This article discusses the method and looks at how these predictions performed for the 2015-2016 offseason.
Finally, we look at the most interesting predictions for the 2016-2017 offseason. You can check out all the results on this Google spreadsheet.
Regression model based on WAR
Using regression, we developed a linear and a quadratic model for free agent salaries based on a player’s WAR for his three seasons prior to free agency. Regression provides the optimal coefficients for weighting each of these seasons.
For both models, the best-fitting blend of the three WARs weights the most recent year very heavily and the earliest of the three years almost not at all. While this makes sense conceptually, it can cause our model to miss on some players.
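To make the regression step concrete, here is a minimal sketch of how a linear and a quadratic fit on three seasons of WAR could be set up. It is not our actual code or data: the WAR values and salaries are synthetic, the "true" weights are hypothetical, and the exact form of the quadratic model (adding a squared term per season) is an assumption. It uses plain Python so it runs without extra libraries.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def least_squares(X, y):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def linear_features(war):
    """[1, WAR three years ago, WAR two years ago, WAR last year]."""
    return [1.0, *war]

def quadratic_features(war):
    """Linear features plus one squared term per season (assumed form)."""
    return [1.0, *war, *(w * w for w in war)]

# Synthetic data only: nothing below comes from real contracts.
random.seed(0)
TRUE_INTERCEPT = 1.0             # hypothetical, in $ millions
TRUE_WEIGHTS = (0.1, 0.9, 3.0)   # hypothetical $ millions per WAR, oldest season first
wars = [[random.uniform(0.0, 6.0) for _ in range(3)] for _ in range(200)]
salaries = [
    TRUE_INTERCEPT
    + sum(w * x for w, x in zip(TRUE_WEIGHTS, war))
    + random.gauss(0.0, 0.1)
    for war in wars
]

linear_coef = least_squares([linear_features(w) for w in wars], salaries)
quadratic_coef = least_squares([quadratic_features(w) for w in wars], salaries)
print("linear:", [round(c, 2) for c in linear_coef])
print("quadratic:", [round(c, 2) for c in quadratic_coef])
```

Because the synthetic salaries were generated with a heavy weight on the last season, the fitted coefficients recover that pattern, which mirrors the recency weighting described above.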
We also attempted to measure the impact of player availability at each position on the market. By dividing each player’s WAR by the total WAR available on the market at his position, we were able to gauge his relative strength on the market.
We then multiplied our WAR-weighted average by 1 + ((Player WAR) / (Total Position WAR)), a term that gives an extra boost to high-WAR players. This significantly reduced the sum of squared errors and improved the accuracy of both models.
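The position adjustment is simple enough to write out directly. The sketch below is illustrative only: the function names are hypothetical, the input numbers are made up, and which WAR figure feeds the multiplier is left to the caller.

```python
def scarcity_multiplier(player_war, total_position_war):
    """Return 1 + (Player WAR) / (Total Position WAR)."""
    if total_position_war <= 0:
        raise ValueError("total_position_war must be positive")
    return 1.0 + player_war / total_position_war

def adjusted_war(weighted_war, player_war, total_position_war):
    """Scale a player's WAR-weighted average by his share of the position market."""
    return weighted_war * scarcity_multiplier(player_war, total_position_war)

# Hypothetical example: a 4.0-WAR player at a position with 20.0 total WAR
# available on the market, and a WAR-weighted average of 3.0.
print(round(adjusted_war(3.0, 4.0, 20.0), 2))
```

A player who accounts for a larger share of the WAR available at his position gets a larger multiplier, while a player with zero WAR is left unchanged.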
The figure shows the results for the quadratic (nonlinear) and linear models.
The model is simple, and it doesn’t consider important factors that will affect a free agent contract, such as:
- a slow-developing market for a position
- a glaring need by a large market team
- an impatient owner who wants to win now
- age, as older players are often unwilling to take short-term deals, and teams are unwilling to sign long-term ones
Even so, as the next section shows, the model predicts many free agent contracts accurately.
The simplicity of our model also sets it apart from the “value metric” of Fangraphs (pitchers and hitters). That method seems to place a lot of value on “market intangibles,” the various factors that lead two players with equal productivity to be paid differently.
Success and failure from 2015 free agency
After the 2015 season, we had great success in predicting the contracts of some starting pitchers. Let’s take a look at a few examples.
- John Lackey signed a two-year, $32,000,000 deal with the Cubs. Model prediction: $16,000,000 per year.
- Hisashi Iwakuma signed a one-year, $12,000,000 deal with the Mariners. Model prediction: $11,925,600 per year.
- Rich Hill signed a one-year, $6,000,000 deal with Oakland. Model prediction: $6,043,300 per year.
Not only were these all starting pitchers, but they were starting pitchers who were not the best in their free agent class (David Price, Zack Greinke), so they were not subject to as many market intangibles. All three starters had an above-average season in 2015, but none is a franchise building block.
On the other hand, one of the largest misses last year was 2B Daniel Murphy, who signed a three-year, $37,500,000 contract with the Nationals, or $12,500,000 per year. Our model predicted $4,510,000 per year based on his performance.
More important than any other market intangible, however, was that Murphy changed his swing during the 2015 playoffs. This change helped him win the NLCS MVP and carry the Mets to the World Series.
Because it could not account for his new swing (and the resulting jump in performance), our model vastly undershot his salary on the open market. Such cases seem few and far between, and we do not expect many of them in the future.
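To put the hits and the miss on the same scale, the sketch below converts each contract to a per-year figure and compares it with the model prediction. The contract and prediction numbers are the ones quoted in this section; treating per-year salary as total value divided by years is a simplifying assumption.

```python
# (player, contract years, total contract value, model prediction per year)
CONTRACTS_2015 = [
    ("John Lackey", 2, 32_000_000, 16_000_000),
    ("Hisashi Iwakuma", 1, 12_000_000, 11_925_600),
    ("Rich Hill", 1, 6_000_000, 6_043_300),
    ("Daniel Murphy", 3, 37_500_000, 4_510_000),
]

def prediction_error(years, total_value, predicted_per_year):
    """Return (actual per-year salary, dollar error, percent error)."""
    actual = total_value / years
    error = predicted_per_year - actual
    return actual, error, 100.0 * error / actual

for player, years, total, predicted in CONTRACTS_2015:
    actual, error, pct = prediction_error(years, total, predicted)
    print(f"{player:16s} actual ${actual:>12,.0f}  "
          f"predicted ${predicted:>12,.0f}  error {pct:+.1f}%")
```

The three pitchers land within about one percent of their actual per-year salary, while the Murphy prediction falls short by well over half.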
Predictions for free agency in 2016
Our model does well with two types of players: the late bloomers and the models of consistency. This section looks at an example of each, as well as a player we don’t expect the model to predict accurately.
You can find all the predictions on this Google spreadsheet.
Rich Hill is the ultimate late bloomer. After bouncing around the majors, Hill found himself in the Red Sox organization as a reclamation project. Looking at his WAR from 2014-2016, it seems to have worked: he posted a WAR of 0.2, 1.6, and then 4.1 over the past three seasons.
In a unique case like this, it appears that his salary will be driven by his performance this year more than by his past years. We believe our model prediction of $16,540,000 is right about what he’ll end up taking home.
Another example of an ideal player for our model is third baseman Justin Turner, a model of consistency. Turner has been consistently good-to-great for the Dodgers, averaging a WAR of 4.33 since 2014. This past year, he fell right in line with that, being worth 4.9 wins.
Because his 2016 performance is indicative of the type of player he is, we believe his predicted salary of $20,000,000 will prove accurate.
Outfielder Yoenis Cespedes is the player we expect the model to miss. After defecting from Cuba and signing with the Oakland Athletics, Cespedes has enjoyed success during his time in the majors. Looking at his WAR from the past three seasons, he was worth 4.1 wins in 2014, 6.3 in 2015, and 2.9 this past season.
Because our model places heavy emphasis on the most recent year’s performance, his 2.9 WAR is the driving force behind his predicted salary. However, his talent level exceeds his 2016 WAR figure, and he will most likely be paid a higher salary than our model projects.
After two great years of contributing 4+ wins to his team, he will not be judged as heavily on his 2016 performance as the model suggests. With that in mind, we believe the model prediction of $12,390,000 is on the low side for Cespedes.
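The contrast between Hill and Cespedes comes down to how much weight the last season carries. The sketch below compares an equal-weight average of the three WAR figures quoted above with a recency-heavy blend. The recency weights are hypothetical stand-ins chosen only to illustrate the effect; they are not the fitted coefficients from our model.

```python
def weighted_war(wars, weights):
    """Weighted average of seasonal WAR values (oldest season first)."""
    if len(wars) != len(weights):
        raise ValueError("wars and weights must have the same length")
    return sum(w * x for w, x in zip(weights, wars)) / sum(weights)

EQUAL_WEIGHTS = (1.0, 1.0, 1.0)
RECENCY_WEIGHTS = (0.05, 0.25, 0.70)  # hypothetical, not the fitted values

# WAR for 2014, 2015, 2016 as quoted in this article.
PLAYERS = {
    "Rich Hill": (0.2, 1.6, 4.1),
    "Yoenis Cespedes": (4.1, 6.3, 2.9),
}

for name, wars in PLAYERS.items():
    equal = weighted_war(wars, EQUAL_WEIGHTS)
    recent = weighted_war(wars, RECENCY_WEIGHTS)
    print(f"{name:16s} equal-weight {equal:.2f}  recency-heavy {recent:.2f}")
```

Under any weighting that favors the last season, Hill looks better than his three-year average and Cespedes looks worse, which is why we expect the model to be close on the former and low on the latter.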