HELLO NERDS!
So I’m in major baseball withdrawal and there’s not really an easy way to fix it, but let’s pretend we can by doing that little baseball analysis thingy that I’ve done for the 2016, 2017, and 2018 seasons, but now let’s do it for 2019!
A reminder of what this analysis is, since it’s been more than a year (I think):
At the end of the regular baseball season, you can see how many wins each team got out of the total number of games they played, and then rank the teams by their performance (who had the most wins, the second most wins, etc.).
What I want to do is see how this “real” data correlates with how many wins each team would get if they scored their average number of runs per game in every single game they played. For example, if the Phillies score an average of 4.78 runs per game, how many of their games would they have won by scoring 4.78 runs in each of those games?
(Yes, I know you can’t score a fraction of runs in a single game, but just pretend you can, huh?)
The process:
- Record each team’s average runs per game (I’ll call this “RPG”) (from here).
- Sort teams from highest to lowest RPG.
- Now, if Team A has a higher RPG than Team B, that would mean that A would win every game they play against B. So the next step was to figure this out for each pairing of Team A versus Team B.
- I used this logic for all pairings, then summed across the rows to get the “predicted” number of wins based on RPG alone. I ranked the teams by their predicted number of wins (“1” meaning the most, “30” meaning the fewest). Then I looked at how each team’s “predicted” ranking compared with their “actual” ranking.
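For the code-inclined, here’s a rough Python sketch of the steps above. It isn’t the actual spreadsheet I used: the team names, RPG values, and game counts are made up for illustration, and the function names (`predicted_wins`, `rank_desc`) are just ones I picked. It assumes that when two teams have the same RPG, neither gets credit for a win, and that tied teams share the same rank.

```python
def predicted_wins(rpg, games):
    """Predicted wins if every team scored its average RPG in every game.

    rpg:   team -> average runs per game
    games: (team_a, team_b) -> number of games between them (each pair once)
    """
    wins = {team: 0 for team in rpg}
    for (a, b), n in games.items():
        if rpg[a] > rpg[b]:
            wins[a] += n      # higher RPG wins every game of the pairing
        elif rpg[b] > rpg[a]:
            wins[b] += n
        # equal RPG: assumption here is that neither team is credited
    return wins


def rank_desc(values):
    """Rank 1 = highest value; tied teams share the same (best) rank."""
    return {
        team: 1 + sum(other > v for other in values.values())
        for team, v in values.items()
    }


# Made-up toy league: four teams, ten games per pairing.
rpg = {"A": 5.8, "B": 4.8, "C": 3.8, "D": 3.8}
games = {
    ("A", "B"): 10, ("A", "C"): 10, ("A", "D"): 10,
    ("B", "C"): 10, ("B", "D"): 10, ("C", "D"): 10,
}

wins = predicted_wins(rpg, games)
ranks = rank_desc(wins)
print(wins)   # {'A': 30, 'B': 20, 'C': 0, 'D': 0}
print(ranks)  # {'A': 1, 'B': 2, 'C': 3, 'D': 3}
```

In the toy league, C and D have the same RPG, so they both end up with zero predicted wins and share the bottom rank.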
How do they compare for the 2019 season?

The Yankees (highest RPG, “Predicted” rank of 1) would have won every game they played; Miami and Detroit both would have lost every game they played (lowest RPGs, “Predicted” ranks tied at 29). Poor Colorado did a lot worse in real life than they would have scoring their average number of runs per game. Coors Field effect, maybe?
The “Diff” column is calculated by taking the predicted rank minus the actual rank. Positive “Diff” numbers suggest teams did better than they would have had they scored their average number of runs in every single game. Negative “Diff” numbers suggest teams did worse than they would have had they scored their average number of runs in every single game.
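As a tiny worked example of the “Diff” arithmetic (the three teams and their ranks here are invented, not 2019 data):

```python
# Hypothetical ranks for three made-up teams (1 = best).
predicted_rank = {"A": 1, "B": 2, "C": 3}
actual_rank = {"A": 2, "B": 1, "C": 3}

# Diff = predicted rank minus actual rank.
diff = {team: predicted_rank[team] - actual_rank[team] for team in predicted_rank}
print(diff)  # {'A': -1, 'B': 1, 'C': 0}
```

Team A gets a negative Diff (they did worse in real life than their RPG predicted), Team B gets a positive one, and Team C lands exactly where predicted.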
Correlation of team rankings based on RPG-predicted games won and actual games won: 0.811
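If you want to try this kind of correlation yourself, here’s a minimal sketch. It assumes a plain Pearson correlation computed on the two rank columns, and the ranks below are made up for illustration, so don’t expect it to spit out 0.811.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical predicted and actual ranks for five made-up teams.
predicted = [1, 2, 3, 4, 5]
actual = [2, 1, 3, 5, 4]

r = pearson(predicted, actual)
print(round(r, 3))  # 0.8
```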
FUN!