‘Sup, y’all?
So once again it’s the off season and I need some sort of baseball in my life, so let’s do that thing I’ve done the past several years where I analyze a team’s average runs per game and use that to determine how many games they would have won if they’d scored that average in every game.
Here’s the copy/paste explanation:
At the end of the regular baseball season, you can see how many wins each team got out of the total number of games they played, and then rank the teams by their performance (who had the most wins, the second most wins, etc.).
What I want to do is see how this “real” data correlates with how many wins each team would get if they scored their average number of runs per game in every single game they played. For example, if the Dodgers score an average of 5.05 runs per game, how many of their games would they have won by scoring 5.05 runs in each of those games?
(Pretend you can score a fraction of runs in a single game for the sake of all this.)
The process:
- Record each team’s average runs per game (I’ll call this “RPG”) (from here).
- Sort teams from highest to lowest RPG.
- Now, if Team A has a higher RPG than Team B, that means A would win every game they play against B. So the next step was to figure this out for each pairing of Team A versus Team B.
- I used this logic for all pairings, then summed across the rows to get the “predicted” number of wins based on RPG alone. I ranked the teams by their predicted number of wins (“1” meaning the most, “30” meaning the fewest). Then I looked at how each team’s “predicted” ranking compared with their “actual” ranking (“Diff”).
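If you want to play along at home, here's a rough Python sketch of the steps above. It isn't the exact spreadsheet I used: the function names (`predicted_wins`, `competition_rank`), the four toy teams, their RPG values, and the 6-games-per-pairing schedule are all made up for illustration. I'm also assuming that two teams with the same RPG don't get a win off each other, and that tied teams share the same rank.

```python
from itertools import permutations


def predicted_wins(rpg, games):
    """Count the wins each team would get if every game were decided by RPG alone.

    rpg:   dict mapping team -> average runs per game
    games: dict mapping (team, opponent) -> number of games they played
    Assumption: equal RPGs mean neither team is credited with a win.
    """
    wins = {team: 0 for team in rpg}
    for (team, opponent), n_games in games.items():
        if rpg[team] > rpg[opponent]:
            wins[team] += n_games
    return wins


def competition_rank(scores):
    """Rank 1 = highest score; tied teams share a rank (1, 2, 2, 4 style)."""
    return {
        team: 1 + sum(other > score for other in scores.values())
        for team, score in scores.items()
    }


# Made-up numbers for illustration only (NOT real 2021 data).
rpg = {"A": 5.3, "B": 4.8, "C": 4.8, "D": 4.1}

# Hypothetical schedule: every pairing plays 6 games, keyed in both directions.
games = {pair: 6 for pair in permutations(rpg, 2)}

wins = predicted_wins(rpg, games)
predicted_rank = competition_rank(wins)

for team in sorted(rpg, key=lambda t: predicted_rank[t]):
    print(team, rpg[team], wins[team], predicted_rank[team])
```

For the real thing you'd swap in all 30 teams and the actual number of games each pairing played.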
Here’s how they compare for the 2021 season:

The Astros (highest RPG, “Predicted” rank of 1) would have won every game they played; the Rangers and the Pirates both would have lost every game they played (lowest RPGs, “Predicted” ranks tied at 29).
The “Diff” column is calculated by taking the predicted rank minus the actual rank. Positive “Diff” numbers mean a team ranked higher in real life than they would have had they scored their average number of runs in every single game. Negative “Diff” numbers mean a team ranked lower in real life than they would have had they scored their average number of runs in every single game. The Reds did much worse in real life than they would have if they’d scored their average number of runs in each game (“Diff” of -10); the Mariners did much better in real life than they would have if they’d scored their average number of runs in each game (“Diff” of +13).
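In code, the “Diff” column is just a subtraction per team. Here's a tiny sketch using three made-up teams and made-up ranks (the `rank_diff` name is mine, nothing official):

```python
def rank_diff(predicted_rank, actual_rank):
    """Diff = predicted rank minus actual rank, per team.

    Positive: the team ranked higher in real life than RPG alone predicts.
    Negative: the team ranked lower in real life than RPG alone predicts.
    """
    return {team: predicted_rank[team] - actual_rank[team] for team in predicted_rank}


# Made-up ranks for illustration only (NOT real 2021 data).
predicted = {"A": 1, "B": 5, "C": 3}
actual = {"A": 4, "B": 2, "C": 3}

diff = rank_diff(predicted, actual)
print(diff)
```

Here team A was predicted to rank 1st but actually ranked 4th, so its Diff is negative, while team B jumped from a predicted 5th to an actual 2nd, so its Diff is positive.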
Correlation of team rankings based on RPG-predicted games won and actual games won: 0.723
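For the curious, here's one way a correlation between two sets of rankings could be computed. I'm assuming a plain Pearson correlation applied to the rank numbers (which works out to a Spearman rank correlation); the `pearson` helper and the five made-up ranks below are just for illustration and aren't the 2021 numbers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Made-up ranks for five hypothetical teams (NOT real 2021 data).
predicted_ranks = [1, 2, 3, 4, 5]
actual_ranks = [2, 1, 3, 5, 4]

r = pearson(predicted_ranks, actual_ranks)
print(round(r, 3))
```

A value near 1 means the two rankings mostly agree; a value near 0 means RPG alone tells you very little about where a team actually finished.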
SUPAH COOL!
