Baseball Stat Party Fun Time

A fun project!

At the end of the regular baseball season, you can see how many wins each team got out of the total number of games they played, and then rank the teams by their performance (who had the most wins, the second most wins, etc.).

What I want to do is see how this “real” data correlates with how many wins each team would get if they scored their average number of runs per game in every single game they played. For example, if the Mariners score an average of 4.74 runs per game, how many of their games would they have won by scoring 4.74 runs in each of those games?

The process:

  • Record each team’s average runs per game (I’ll call this “RPG”) (from here)
  • Sort teams from highest to lowest RPG

Now, if a team A has a higher RPG than team B, that would mean that A would win every game they play against B. So the next step was to make a grid like this and fill in the number of times each pair of teams played each other.


Boston has a higher RPG than the Rockies (5.42 and 5.22, respectively). That means Boston would score 5.42 runs and the Rockies would score 5.22 runs in every game they played against each other. So of the 7 games in which these two teams faced each other, Boston would win all of them.

I used this logic for all pairings (the number of games per pair was obtained from here), then summed across the rows to get the “predicted” number of wins based on RPG alone.
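Here is a minimal sketch of that grid-and-sum step in Python. It is not the spreadsheet I actually used, just one way the same logic could be written. The three RPG values and the 7 Boston vs. Rockies games come from this post; the other game counts, and the names `rpg`, `games_played`, and `predicted_wins`, are made up for illustration. Ties in RPG are not covered above, so the sketch simply credits no wins in that case.

```python
# RPG values quoted in this post.
rpg = {"Boston": 5.42, "Rockies": 5.22, "Mariners": 4.74}

# Games played between each pair of teams.
# Only the Boston vs. Rockies count (7) is from the post;
# the other two counts are placeholders for illustration.
games_played = {
    ("Boston", "Rockies"): 7,
    ("Boston", "Mariners"): 6,   # placeholder
    ("Rockies", "Mariners"): 3,  # placeholder
}


def predicted_wins(rpg, games_played):
    """Give every game in a pairing to the team with the higher RPG."""
    wins = {team: 0 for team in rpg}
    for (team_a, team_b), n_games in games_played.items():
        if rpg[team_a] > rpg[team_b]:
            wins[team_a] += n_games
        elif rpg[team_b] > rpg[team_a]:
            wins[team_b] += n_games
        # Equal RPG: assumption, nobody is credited with a win.
    return wins


wins = predicted_wins(rpg, games_played)

# Print teams from highest to lowest RPG with their predicted wins.
for team in sorted(rpg, key=rpg.get, reverse=True):
    print(team, wins[team])
```

With the real schedule, `games_played` would hold one entry per pair of teams that actually met during the season, and the sum per team is the same as summing across that team's row of the grid.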

How do they compare for the 2016 season?


Boston (highest RPG) would win every game they played; the Phillies (lowest RPG) and the Athletics (bad luck) would lose every game they played. Bummer.

Correlation of RPG-predicted games won and actual games won: 0.640

Correlation of team rankings based on RPG-predicted games won and actual games won: 0.683
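For anyone who wants to compute the same two kinds of numbers, here is a rough sketch in plain Python. It assumes both figures are Pearson correlations, with the second one taken over ranks (which is the same thing as a Spearman rank correlation); the post does not say which formula was used. The win totals below are placeholders, not the 2016 data, and `pearson` and `ranks` are helper names invented for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """Rank values so the largest gets rank 1; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


# Placeholder win totals for three teams (NOT real 2016 numbers).
predicted = [13, 3, 0]
actual = [90, 70, 80]

wins_corr = pearson(predicted, actual)
rank_corr = pearson(ranks(predicted), ranks(actual))

print(round(wins_corr, 3), round(rank_corr, 3))
```

Ranking first throws away how far apart the win totals are and keeps only the ordering, which is why the two correlations can differ.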