### Ah, it's close enough for government work

What correlates better with win%: the ratio of GF/GA or the goal differential?

To answer that question, I compared the correlations of each to the actual win% for every team over the last 15 seasons (not including '94-'95). The result was a dead heat: both had correlations of 0.97.

But what happens when you only look at the good clubs, the teams with a goal differential of at least +40? There were 83 teams that fit this criterion, and here are the resulting correlations to win%:

| Measure | Correlation to win% |
|---|---|
| Ratio of GF/GA | 0.77 |
| Pyth. W% | 0.76 |
| Goal differential | 0.68 |
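For anyone who wants to rerun this kind of comparison, here is a rough sketch of how the three correlations could be computed. It is not the exact procedure used for the numbers above: the function names, the `(gf, ga, win_pct)` row layout, and the `min_diff` filter are my own assumptions, and the Pythagorean exponent is assumed to be 2.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def measure_correlations(seasons, min_diff=None):
    """Correlate three goal-based measures with actual win%.

    seasons:  iterable of (gf, ga, win_pct) tuples, one per team-season
              (hypothetical layout, supply your own data).
    min_diff: if given, keep only team-seasons with GF - GA >= min_diff.
    """
    rows = [(gf, ga, w) for gf, ga, w in seasons
            if min_diff is None or gf - ga >= min_diff]
    win = [w for _, _, w in rows]
    return {
        "ratio": pearson([gf / ga for gf, ga, _ in rows], win),
        # Pythagorean win% with an assumed exponent of 2
        "pyth": pearson([gf**2 / (gf**2 + ga**2) for gf, ga, _ in rows], win),
        "diff": pearson([gf - ga for gf, ga, _ in rows], win),
    }
```

Calling `measure_correlations(seasons)` would give the all-teams comparison, and `measure_correlations(seasons, min_diff=40)` the good-clubs one.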

Scoring 40 more than your opposition means more for lower-scoring teams than high-scoring teams. Think of it this way: Over a 20-game stretch, one team scores 40 and allows zero. Another team scores 140 and allows 100. Both teams are clearly better than average, but it should be obvious which will have the better record: a team that allows nothing cannot lose. It should also be obvious that the further you get from zero goals-against, the less that +40 means. 140GF and 100GA is better than 240GF and 200GA.
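To put rough numbers on that, here is a small sketch that holds the differential at +40 and raises the scoring level. It assumes the usual Pythagorean form with an exponent of 2; the function name is mine.

```python
def pyth_win_pct(gf, ga, exponent=2):
    """Pythagorean win% estimate: GF^x / (GF^x + GA^x), x assumed to be 2."""
    return gf**exponent / (gf**exponent + ga**exponent)

# Same +40 differential at three scoring levels (the 20-game example above).
for gf, ga in [(40, 0), (140, 100), (240, 200)]:
    print(f"{gf}GF/{ga}GA  diff={gf - ga:+d}  pyth={pyth_win_pct(gf, ga):.3f}")
# 40GF/0GA  diff=+40  pyth=1.000
# 140GF/100GA  diff=+40  pyth=0.662
# 240GF/200GA  diff=+40  pyth=0.590
```

The differential never moves, but the estimate slides toward .500 as both totals grow.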

A real-world example: Both the '02 Red Wings and the '92 Red Wings had goal differentials of +64. With 251GF and 187GA, the '02 Wings had a win% of .707. With 320GF and 256GA, the '92 Wings were .613. The Pyth. win% for each was .643 and .610, respectively.
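The Red Wings figures can be checked with a few lines. This is only a sketch: it assumes a Pythagorean exponent of 2, and the dictionary layout is my own.

```python
def pyth_win_pct(gf, ga):
    """Pythagorean win% estimate with an assumed exponent of 2."""
    return gf**2 / (gf**2 + ga**2)

# (GF, GA, actual win%) as quoted in the post
wings = {"'02": (251, 187, 0.707), "'92": (320, 256, 0.613)}

for season, (gf, ga, actual) in wings.items():
    print(f"{season} Red Wings: diff={gf - ga:+d} ratio={gf / ga:.3f} "
          f"pyth={pyth_win_pct(gf, ga):.3f} actual={actual:.3f}")
# '02 Red Wings: diff=+64 ratio=1.342 pyth=0.643 actual=0.707
# '92 Red Wings: diff=+64 ratio=1.250 pyth=0.610 actual=0.613
```

Both the ratio and the Pythagorean estimate rank the '02 team ahead, which the identical +64 differential cannot do.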

My point here is to air a beef I have with Alan Ryder's player contribution system.

Ryder says that the goal differential can provide a reasonable approximation of team success. It's also much easier to use than pythagorean methods. Ryder is correct, as illustrated by the equal correlations of GF/GA and goal differential with win% across all teams. The flaw with the differential is exposed when you diverge from average, as the correlations for teams at +40 or better indicate. Ryder says this in his paper, but assumes the differential will be 'good enough' for most teams.

The problem with the Player Contribution is that it extends this linear team-level assumption to players. Teams, being aggregates of players, can't diverge as far from the norm as individual players can. The difference between the best and worst players will be wider than the difference between the best and worst teams. This amplifies the error in assuming the goal differential will be 'good enough' for analysis of individual players.

A player who contributes 40GF and 20GA is evaluated as equal to a player who contributes 20GF and *zero* GA. This is plain wrong. It is especially wrong if you're going to use this Player Contribution index to evaluate the best players in the league, who are the furthest from the norm.
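Here is a quick sketch of that comparison. To be clear, this is not Ryder's formula: I'm simply borrowing the Pythagorean form (exponent assumed to be 2) and applying it to the two hypothetical players above to show what the differential hides.

```python
def pyth_share(gf, ga):
    """Pythagorean-style share, GF^2 / (GF^2 + GA^2); an illustrative
    stand-in, not part of the Player Contribution system."""
    return gf**2 / (gf**2 + ga**2)

# The two hypothetical players from the paragraph above: (GF, GA)
players = {"Player A": (40, 20), "Player B": (20, 0)}

for name, (gf, ga) in players.items():
    print(f"{name}: diff={gf - ga:+d} pyth={pyth_share(gf, ga):.3f}")
# Player A: diff=+20 pyth=0.800
# Player B: diff=+20 pyth=1.000
```

Same +20 on the differential, but a very different picture once the goals-against are allowed to matter.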

It's certainly interesting to look at, and it's fun to come up with ways to compare players that go beyond the usual statistics. It may also be true that the goal differential is a good enough estimate for most purposes. I'm just pointing out a caveat.

## 3 Comments:

I was going to comment, but I don't think I can drink enough cups of coffee to say anything intelligent.

Good blog, sisu.

Not that I've made much of an effort to check into this, but isn't this "pythagorean" thing just an equation relating winning% to (GF/GA)^x ? Where "x" is solved by finding strongest sample correlation iteratively?

Frankly I've never really understood the thinking behind it, seems completely out of the blue to me. The value of "x" changes by the year as well, in the blurb that someone linked me to. All just working backwards from the end result.

If so, then of course when you calculate the sample correlation coefficient on the result, you're reversing the process. And you should get a strong result. It would be impossible to NOT get that. On the outliers especially.

University math is a hell of a long ways behind me, but I thought I'd poke at this anyways sisu. :-)

The first reference to the PW% that I read was at the Hockey Project. I was very skeptical too, until I compared it to other measures of GF vs GA.

I've only ever seen it expressed as: 1/(1+(GA/GF)^2)

Never really thought about changing the exponent. Turns out you're right - there are other values of 'x' that correlate better to win%. But doing that spoils the most useful trait of the PW% - it's remarkably close to the win%. If you want to use win%, you can just swap it for the PW% without having to worry about OTLs or shootout points or other nonsense.

Yeah, it's arbitrary. Lots of useful ideas have been discovered empirically. Take most of modern medicine, for instance.
