Preseason Big Ten college basketball rankings for 2015-16

Last month, I had the honor of being a guest on Assembly Call, the Indiana Hoosiers basketball podcast. Show runner Jerod Morris, Andy Bottoms and I discussed the upcoming college basketball season.

To prep for the show, I developed some preseason Big Ten rankings that combine calculations with subjective factors. It’s similar to what elite gamblers do in preparation for the season.

I wrote an article for Assembly Call that describes the methods and summarizes my thoughts on all 14 Big Ten teams.

My apologies in advance to Maryland fans.

And Bo Ryan’s announcement that he will retire after this season doesn’t change my mind on Wisconsin’s preseason rank, which you might find surprising.

To read the article on preseason Big Ten college basketball rankings for 2015-16, click here.

How the gambling markets view Tigers in the AL Central

In my latest Detroit News article, I use a rich source of information to evaluate the AL Central: the gambling markets.

Each day, sports books set a moneyline for each game, and investors around the world put money on both teams. This collective wisdom of crowds drives a final price that should accurately reflect the strength of both teams.

Based on this market data, I calculate expected records for each team over the season. The results for the AL Central are surprising.
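As a rough illustration of the idea, here is a minimal Python sketch that turns American moneylines into win probabilities and sums them into an expected win total. It is not the calculation behind the article, which is not described here. The function names, and the choice to remove the bookmaker's margin by rescaling both sides proportionally, are assumptions made for this example.

```python
def implied_probability(moneyline):
    """Break-even win probability for an American moneyline (e.g. -150, +130)."""
    if moneyline < 0:
        return -moneyline / (-moneyline + 100)
    return 100 / (moneyline + 100)


def no_vig_probabilities(team_line, opponent_line):
    """Rescale both implied probabilities so they sum to 1.

    Assumption: the bookmaker's margin is split proportionally between sides.
    """
    team = implied_probability(team_line)
    opponent = implied_probability(opponent_line)
    total = team + opponent
    return team / total, opponent / total


def expected_wins(games):
    """Sum a team's no-vig win probability over its games.

    games: list of (team_moneyline, opponent_moneyline) pairs.
    """
    return sum(no_vig_probabilities(team, opp)[0] for team, opp in games)
```

Adding up these probabilities over every game a team has played gives the record the market expected from that team, which can then be set against its actual record.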

To read my article on how the markets view the AL Central, click here.

Accuracy of The Power Rank’s baseball predictions in 2015

On April 23rd, 2015, I started posting a win probability for every MLB game. These predictions had two components:

From April 23rd to May 28th, 2015, the team with the higher win probability won 252 of 487 games for a win percentage of 51.7%. The predictions did well for a while, but they got weaker as the season progressed.

On May 29th, I changed the methodology behind the baseball predictions. Instead of actual run differential as an input to my ranking algorithm, I used an expected run differential according to the Base Runs formula.

Among teams with 115 doubles and 58 home runs at this point in the season, some have scored 330 runs while others, like the Detroit Tigers, have scored 293. The higher scoring teams tend to cluster their hits due to better clutch hitting.

However, my research shows that good and bad cluster luck is not sustainable. Teams with extremes in cluster luck tend to regress to the level of run production given by the Base Runs formula.
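For readers who have not seen Base Runs, the sketch below uses one widely circulated simple version of David Smyth's formula. The coefficients differ between published versions, and this post does not say which one feeds the rankings, so treat the numbers here as an assumption. The helper `cluster_luck` is a hypothetical name for the gap between actual and expected runs.

```python
def base_runs(ab, h, doubles, triples, hr, bb):
    """Expected runs from a simple Base Runs variant (coefficients assumed)."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    a = h + bb - hr  # baserunners who did not score themselves on a homer
    b = (1.4 * total_bases - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02  # advancement
    c = ab - h  # outs
    d = hr  # home runs always score
    return a * b / (b + c) + d


def cluster_luck(actual_runs, expected_runs):
    """Positive when a team has scored more than its hits would suggest."""
    return actual_runs - expected_runs
```

A team with a large positive or negative `cluster_luck` value is the kind of team expected to drift back toward its Base Runs total.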

With these new team rankings based on expected runs, the predictions have performed much better. From May 29th through June 21st, the team with the higher win probability has won 187 of 335 games for a win percentage of 55.8%.

For comparison, the team favored in the betting markets has won 181 of 335 games for a win percentage of 54.0%. The markets get half a win for offering the same odds on both teams.
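The bookkeeping behind records like these is simple. A minimal sketch, assuming each game is stored as a home win probability plus the result, and that a game with even odds earns half a win as described above (the function name and data layout are hypothetical):

```python
def favorite_record(games):
    """Count how often the favored side won.

    games: list of (home_win_probability, home_won) pairs.
    A 50-50 game counts as half a win.
    """
    wins = 0.0
    for p_home, home_won in games:
        if p_home == 0.5:
            wins += 0.5
        elif (p_home > 0.5) == home_won:
            wins += 1.0
    return wins, len(games)


# The percentages quoted above follow directly from the counts.
model_pct = round(100 * 187 / 335, 1)
market_pct = round(100 * 181 / 335, 1)
```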

The markets will catch up soon. They always do. However, this is pretty good accuracy for predictions that do not account for injuries.

Check the predictions page for daily updates to this record.

Are you fooled by the randomness of baseball?

A version of this article appeared in the Detroit News on Monday, June 15th, 2015. The ideas on how humans view randomness apply far more widely than just Tigers baseball.

Do you remember April 20th? The Tigers had an 11-2 record and looked like an offensive juggernaut.

There were many ways to rationalize the success of the Tigers. Miguel Cabrera was finally injury free and back to his first ballot Hall of Famer form. Shane Greene allowed one earned run through 3 starts and looked like a Cy Young contender.

Now let’s fast forward to June 5th. The Tigers had just dropped their 8th straight game and had a 28-28 record.

There were also many explanations for the losing streak. Ian Kinsler was cold as ice, and the Tigers couldn’t get a clutch hit if the fate of the world depended on it.

It also didn’t help that 7 of these 8 losses came against the A’s and Angels, two underrated top 10 teams in my baseball rankings.

As fans, we find all kinds of reasons for winning and losing streaks. However, there’s an additional reason for these streaks not often talked about: randomness.

Let me show you.

Random flipping of a coin

In baseball, any team can win any game, even the Phillies over the Dodgers in 2015. Because of this uncertainty, it’s useful to have a probabilistic model for game outcomes.

We’ll do a simple experiment to see the relationship between the randomness of this model and streaks. Suppose a baseball team has a 50% chance to win each game.

To see how this average team’s season plays out, I flipped a 50-50 coin on my computer 162 times. The visual shows the results through 63 games.

[Visual: wins and losses for the simulated average team through 63 games]

Team average catches fire on game 19 of the season and rips off a streak of 10 straight wins. They lose a game but then win another 4 games in a row. Two of their young pitchers look like Cy Young contenders.

However, the bottom falls out at game 34. Team average can’t buy a clutch hit as they go 6-13 over the next 19 games. Fights break out in the clubhouse because of the lack of team chemistry.

Despite the losing streak, team average still has a strong 38-25 record after 63 games.

For the record, I made zero effort to find a random sequence with streaks. I generated this sequence once for an article I wrote a few years back. Randomness looks streaky.
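The experiment is easy to repeat. Here is a minimal sketch using Python's standard `random` module. It is not the code that produced the visual, and the function names are made up for this example.

```python
import random


def simulate_season(n_games=162, p_win=0.5, seed=None):
    """Flip a weighted coin once per game; True means a win."""
    rng = random.Random(seed)
    return [rng.random() < p_win for _ in range(n_games)]


def longest_streak(results, outcome):
    """Length of the longest run of `outcome` (True for wins, False for losses)."""
    best = current = 0
    for result in results:
        current = current + 1 if result == outcome else 0
        best = max(best, current)
    return best


season = simulate_season(seed=2015)
print("wins:", sum(season))
print("longest winning streak:", longest_streak(season, True))
print("longest losing streak:", longest_streak(season, False))
```

Change the seed a few times and look at how long the longest winning and losing streaks run, even though every game is an independent 50-50 flip.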

The next visual shows the wins and losses for the Tigers through 63 games.

[Visual: wins and losses for the 2015 Tigers through 63 games]

Both visuals contain quite a few streaks.

What the Hot Hand paper says about streaks

To understand how humans view randomness, consider a famous paper by Amos Tversky, a Stanford professor. In 1985, he and his co-authors published a paper called The Hot Hand in Basketball: On the Misperception of Random Sequences.

In the study, Tversky asked participants to look at sequences of X’s and O’s that represented made and missed shots in basketball. Some of these sequences were generated at random and looked like my coin flipping visual. However, only 32% of participants called this random shooting, while 62% called it streak shooting.

People tend to see patterns or streaks in randomness, just like you most likely saw patterns in the random sequence above.

The authors of the study also generated sequences in which it was more likely to get an O after an X and an X after an O. Only with this increased tendency for an alternating sequence did more people call the sequence random shooting.
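To see what such sequences look like, here is a small sketch that generates X's and O's with an adjustable chance of switching symbols at each step. This is an illustration only, not the procedure from the paper. The parameter `p_alternate` is a name chosen for this example, and a value of 0.5 corresponds to a fair coin.

```python
import random


def shot_sequence(n, p_alternate, seed=None):
    """Build a string of X's and O's; each symbol flips with probability p_alternate."""
    rng = random.Random(seed)
    seq = [rng.choice("XO")]
    for _ in range(n - 1):
        previous = seq[-1]
        if rng.random() < p_alternate:
            seq.append("O" if previous == "X" else "X")
        else:
            seq.append(previous)
    return "".join(seq)


def alternation_rate(seq):
    """Fraction of adjacent pairs that differ."""
    changes = sum(a != b for a, b in zip(seq, seq[1:]))
    return changes / (len(seq) - 1)


print(shot_sequence(21, 0.5, seed=1))  # fair coin
print(shot_sequence(21, 0.7, seed=1))  # switches more often than chance
```

Sequences built with a higher `p_alternate` have fewer long runs, which is the property that made people in the study more willing to call them random.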

Perceptions of randomness from a young age

I’ve also done my own experiment on perceptions of randomness. At Summers-Knoll, a project-based school in Ann Arbor, I brought a bag with a white and black chess piece into the kindergarten class.

I told the kids they would take turns picking a piece from the bag, which replicated the random flipping of a coin I performed on my computer above. But first, I asked them what they expected the sequence to look like.

Most of these five- and six-year-olds wrote down an alternating sequence of X’s and O’s. A few children had a sequence of two X’s in a row. We tend to think of patterns in randomness at a very young age.

When the children picked out a chess piece from the bag, they saw that the sequence looked quite different from their expectation.

Another interesting thing happened during the experiment. After a few pulls from the bag, the children started chanting for the black piece to get picked. Cheering for randomness? That’s exactly what sports fans do. It’s ingrained in us from a very young age.

What this means for Tigers fans

From June 9th through the 14th, the Tigers played 5 games against the Cubs and Indians. They alternated wins and losses as you can see in the final 5 games of the Tigers visual above.

As humans, we view this alternating sequence as random. Win some, lose some, just hope the Tigers get to the 88 wins they need to win the division. (The coin flipping experiment I did above produced 91 wins in 162 games.)

This perception changes drastically when the Tigers win 11 of 13 or drop 8 in a row. Their play no longer seems random, and fans go in search of explanations for the streak.

It’s fine to rationalize the causes behind these streaks. Baseball is far more complicated than the flipping of a 50-50 coin. However, remember that randomness alone would generate these types of streaks.

Mailbag: Do bookmakers shade the under in MLB totals?

Thank you to everyone who submitted questions. You can read the first part of this mailbag here.

MLB totals for 2015

Why do you suppose the bookmakers shaded the unders in MLB for April 2015? I don’t follow baseball that closely, but there seems to be a lot of press about scoring being down and the games being too long. Does speeding up the game increase scoring?

Betting over on every game in April would have yielded +49 units in 2015.

Average team scoring is up (4.27 runs vs 4.21 runs) from April 2014, but the avg total line is down (7.63 vs 7.84).

— David Sone

Thanks for the analysis. I bet the bookmakers are a bit cautious about high numbers in April due to uncertainty in pitchers and the opposing offense.

I ran some numbers for May 1st through June 11th. This analysis considers the median closing total for each game.

The edge in taking the over is gone, as more games went under (285) than over (261). The runs scored matched the market total exactly 26 times in 572 games.

The average market total is back to 7.86 during this period, while there have been 8.09 runs scored per game.
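Tallies like these come from grading each game against its closing total. A minimal sketch, assuming each game is stored as a closing total paired with the combined runs scored (the function name and data layout are hypothetical):

```python
from collections import Counter


def grade_totals(games):
    """Count overs, unders and pushes.

    games: iterable of (closing_total, runs_scored) pairs.
    """
    tally = Counter()
    for total, runs in games:
        if runs > total:
            tally["over"] += 1
        elif runs < total:
            tally["under"] += 1
        else:
            tally["push"] += 1
    return tally


# The counts reported above add up to the full sample.
reported_games = 285 + 261 + 26
```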

The best efficiency metric for college football

If you had to single out one certain variable that is most important for college football betting/predictions, what would you say?

— Lance Stone

There are a lot of choices for college football statistics, but I personally like yards per play for predicting college football games. This stat is incredibly easy to calculate and is mostly immune from the randomness of turnovers.

In college football, you need to be careful in breaking down this statistic into rushing and passing. On all major (and minor) media sites, sacks count as rushes even though the offense intended to pass. At The Power Rank, I count sacks as pass attempts in my yards per play rankings.
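The adjustment is just a matter of moving sacks from one column to the other. Here is a minimal sketch, assuming a box score that folds sacks and sack yardage into the rushing line. The function name and argument names are made up for this example, and this is not the code behind the rankings.

```python
def yards_per_play_split(rush_att, rush_yds, pass_att, pass_yds, sacks, sack_yds_lost):
    """Reassign sacks from rushing to passing, then compute yards per play.

    Assumes rush_att and rush_yds already include sacks and lost sack yardage,
    as in a standard college box score.
    """
    adj_rush_att = rush_att - sacks
    adj_rush_yds = rush_yds + sack_yds_lost
    adj_pass_att = pass_att + sacks
    adj_pass_yds = pass_yds - sack_yds_lost
    return {
        "rush": adj_rush_yds / adj_rush_att,
        "pass": adj_pass_yds / adj_pass_att,
        # Overall yards per play is unchanged by moving sacks between columns.
        "overall": (rush_yds + pass_yds) / (rush_att + pass_att),
    }
```

With 40 rushes for 160 yards that include 4 sacks for 20 yards lost, the unadjusted rushing average is 4.0 yards per carry, while the adjusted figure is 180 yards on 36 carries, or 5.0.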

To make game predictions, I take yards per play and adjust for strength of schedule with my ranking algorithm. These rankings give one of the many predictions I use in the ensemble predictions available to my members.

There are other efficiency metrics, such as expected points added and success rate, that are useful for making college football predictions. I summarize these in my ultimate guide to college football analytics, which also discusses the randomness of turnovers.

What statistics matter most in picking a Super Bowl champion?

As a Super Bowl winner, in order, rank the aspects of teams that seem most key in determining Super Bowl champions: Passing, Rushing, Yards of Total Offense, Turnover Margin, Average Field Position, Penalties, Yards Allowed by Defense, Defense vs Run, and Defense vs Pass?

— Yoni Aharon, Member.

To determine the team most likely to win the Super Bowl, you need to find the best team. Hence, I came up with these rankings.

  • 1. Passing, Defense vs Pass. Sometimes cliches are true. The NFL is a quarterback’s league. This also implies that pass defense is important.
  • 2. Turnover margin, Average Field Position. These are clearly important, but teams have little control over these numbers. There is a wealth of research on the randomness of turnovers, while Bob Stoll has discussed how special teams performance in the past has little ability to predict future performance in the NFL. (I think I heard this on a Beating the Book podcast.)
  • 3. Yards of Total Offense, Yards Allowed by Defense. These are important because they reflect strength in passing and pass defense. It would be better to look at yards per play, but most NFL teams play at roughly the same pace.
  • 4. Rushing, Defense vs Run. There is little correlation between rush efficiency and winning in the NFL. This doesn’t imply that rushing doesn’t matter. It just matters much less than passing, which is why running backs no longer get the monster contracts.

I honestly don’t know about penalties. I imagine they don’t matter much.

Do defensive shifts in baseball work?

My question involves positional shifts in baseball. You see almost every team employing them on a pretty regular basis nowadays.

Many times a batter will hit right into the shift but I have also seen many instances where a double play grounder rolls right through a vacated infield spot. Pitchers then get very angry!

Is there a way for you to determine the success rate of defensive shifting? On the surface I think shifts give up just as many hits as they take away but I would like to get your take.

— Jim Winter

The data suggest that shifts work. This article claims that shifts saved 390 runs for all major league teams in 2014.

However, I think there’s a ton of randomness in these numbers from season to season. The table in the previous article suggests the Astros were great at saving hits with shifts while the Rays and Pirates were not.

However, all three of those teams have sophisticated analytics operations. The Rays inspired the Pirates, and the Pirates suddenly had a great defense in 2013. Check out the details in this article by Travis Sawchik. (I apologize for the annoying, unstoppable video ad.)

The randomness in predicting NFL and NBA games

Year after year, why is NFL scoring so unpredictable from one week to the next throughout each season? Maybe you have already done work related to this question and if so could you please direct me to a link?

— Chris Guy

Is predicting outcomes ATS (against the spread) most challenging in the NBA vs all other sports?

— Scott Shoultz

Predicting outcomes in professional sports is hard.

For the 2014 NFL season, 21 of 32 teams had a rating within 5 points of the league average in my team rankings. For the 2014-15 NBA, my team rankings had 22 of 30 teams within 5 points of the mean rating of 0.

This means that small events can change the number of points scored and tip the results of games. A dropped touchdown pass in football or a lay up that rims out in basketball can turn a winning team into a loser.

What’s the toughest sport to predict against the spread? I would guess the NFL just because it gets the most attention. However, that doesn’t mean the NBA is easy to bet.

How to construct an NBA team based on chemistry

One thing I find myself wanting to read more about is the analytics behind constructing a team. In the NBA we know shots around the rim and corner threes are the most efficient shots, but are there specific metrics to assess the synergy among players when compiling a roster, or should we take each player at face value based on their individual stats?

— Christopher Saik.

Team chemistry is certainly a holy grail for analytics.

This article looks at two papers presented at the Sloan Sports Analytics Conference. Both papers seem interesting, but neither is a huge breakthrough.

You can also look at the plus minus for a group of players on the floor. The teams probably have this data, although I can’t find a public source.
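If you did have that data, the lineup version of plus minus is a simple aggregation. A minimal sketch, assuming the game is already split into stints, each listing the players on the floor and the points scored by each side while they played together. The data layout and function name are hypothetical, since no public source is identified here.

```python
from collections import defaultdict


def lineup_plus_minus(stints):
    """Total point differential for each distinct group of players.

    stints: iterable of (players, points_for, points_against),
    where players is any collection of player names.
    """
    totals = defaultdict(int)
    for players, points_for, points_against in stints:
        # frozenset makes the lineup hashable and ignores the order of names.
        totals[frozenset(players)] += points_for - points_against
    return dict(totals)
```

Raw totals like this are noisy for lineups that share only a few minutes, so they work better as a starting point than as a verdict on chemistry.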

Team synergy is a tough one to get at with numbers, and that might always be the case. Sometimes, you just have to watch the games.

Tracking The Power Rank’s accuracy

What’s your record for ncaa football and pro football for the past 5 years?

— Anthony Cristiani

A full answer to this question is coming soon. I’ll go back and look at how the predictions I’ve posted have done. I’ll also backtest the model I’ll use for the upcoming season.

On the predictions page, I’ve done a better job tracking my baseball results. From May 29 to June 11, 2015, the team with the higher win probability has won 105 of 188 games for a win percentage of 55.9%.