Things I should be working on:
- Class notes
- Class review questions
- Answering emails
Things I worked on instead:
- This freaking blog
So remember that baseball thing I’ve done a couple times now, once for 2016 and once for 2017? I did it again for 2018 because why the hell not.
Let’s do the CopyPaste Dance from a previous blog:
At the end of the regular baseball season, you can see how many wins each team got out of the total number of games they played, and then rank the teams by their performance (who had the most wins, the second most wins, etc.).
What I want to do is see how this “real” data correlates with how many wins each team would get if they scored their average number of runs per game in every single game they played. For example, if the Athletics score an average of 5.02 runs per game, how many of their games would they have won by scoring 5.02 runs in each of those games?
(Yes, I know you can’t score 0.02 runs in a single game, but just work with me here.)
- Record each team’s average runs per game (I’ll call this “RPG”) (from here).
- Sort teams from highest to lowest RPG.
- Now, if Team A has a higher RPG than Team B, that would mean A wins every game they play against B. So the next step is to figure this out for each pairing of Team A versus Team B.
- I used this logic for all pairings (the number of games per pair was obtained from here), then summed across the rows to get the “predicted” number of wins based on RPG alone. Then I looked at how each team’s “predicted” number of wins compared with their “actual” number of wins, and ranked each team by both their “predicted” and “actual” values.
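For anyone who'd rather see the steps above as code, here's a minimal sketch in Python. It is not what I actually ran. The function name `predicted_wins`, the three teams, and all the numbers are made up for illustration, and it assumes each pairing is listed once with its number of games. If two teams had the exact same RPG, this sketch gives the win to nobody (the steps above don't say what to do there).

```python
def predicted_wins(rpg, games):
    """Predicted wins per team if the higher-RPG team wins every game.

    rpg:   dict mapping team -> average runs per game
    games: dict mapping (team_a, team_b) -> games played between them,
           with each pairing listed once (an assumption of this sketch)
    """
    wins = {team: 0 for team in rpg}
    for (a, b), n in games.items():
        if rpg[a] > rpg[b]:
            wins[a] += n
        elif rpg[b] > rpg[a]:
            wins[b] += n
        # Equal RPG: nobody is credited (assumption, not from the post).
    return wins


# Made-up three-team league, purely illustrative.
rpg = {"A": 5.02, "B": 4.50, "C": 3.90}
games = {("A", "B"): 10, ("A", "C"): 6, ("B", "C"): 8}

wins = predicted_wins(rpg, games)
for team in sorted(wins, key=wins.get, reverse=True):
    print(team, wins[team])
```

The top-RPG team sweeps everything and the bottom-RPG team gets nothing, which is exactly the all-or-nothing behavior you'd expect from this rule.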
How do they compare for the 2018 season?
Boston (highest RPG) would win every game they played; Miami (lowest RPG) would lose every game they played. Bummer.
Correlation of RPG-predicted games won and actual games won: 0.797
Correlation of team rankings based on RPG-predicted games won and actual games won: 0.832
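If you want to see how two correlations like these could be computed, here's a rough sketch. I'm assuming a plain Pearson correlation for the win totals, and the same formula applied to the ranks for the rankings (which is Spearman's correlation when there are no ties). The helper names `pearson` and `ranks` and the four-team numbers are made up for illustration; they are not the 2018 data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """Rank 1 = most wins. Ties are not handled (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


# Made-up win totals for four teams, purely illustrative.
predicted = [16, 8, 0, 12]
actual = [11, 9, 5, 7]

win_corr = pearson(predicted, actual)
rank_corr = pearson(ranks(predicted), ranks(actual))
print(round(win_corr, 3), round(rank_corr, 3))
```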
The biggest discrepancies, of course, are at the extremes. Based on RPG alone, Boston was predicted to win 66 more games than they did; Miami was predicted to lose 66 more than they did. The smallest discrepancy is for the Angels, who were predicted to win two more games than they did.