First of all, hi everyone- I’ve been a lurker at EOTP for some time now, but only just recently made an account. Looking forward to lots of great discussions with everyone about both serious and silly things alike.
Recently, we’ve had the distinct misfortune as Habs fans of seeing a very promising team go down quickly in the postseason. Had we squeaked into the playoffs as a seventh or eighth seed with middling possession stats, a first-round departure would have been somewhat more expected. Yet during the regular season we put up great numbers by more than a few measures. While many in the mainstream media attributed our resurgence to Team Toughness à la Brandon Prust, others knew better. We put up great possession numbers. We had great Fancy Stats. We were a Good Fenwick Team, and THAT was why we were playing well. (For a primer on Fenwick and other things, see the glossary.) As Chris Boyle’s fantastic circular infographic showed pretty clearly, putting up good Fenwick Close numbers is vital for making the postseason, and unlike previous years (2008, 2009, 2010) we actually had the possession numbers to justify a top seed in the playoffs. However, while Chris’s graph beautifully shows the difference between non-playoff teams and playoff teams, it isn’t quite as good at showing what happens once you make the playoffs.
I was curious to see to what extent having a good regular season Fenwick actually helps you once you reach the playoffs; in other words, whether there is a statistically significant relationship between RS Fenwick and playoff wins. I think it’s quite reasonable to start with a hypothesis that good regular season Fenwick numbers will predict playoff success. Of course, there are exceptions (Habs in 2010; San Jose in what feels like every year). It would be naïve to expect a strong, tightly predictive relationship between any metric from the regular season and postseason performance: injuries, reffing, coaching adjustments, ‘streakiness’ and just plain old luck have a disproportionate amount of influence in the playoffs (or, for that matter, in any small sample of regular season games). But given how good Fenwick was at differentiating between playoff-bound teams and those heading to the golf course early, I thought that regular season Fenwick Close would show some ability to predict deep cup runs as well as early first-round busts. So I ran a few quick calculations to see just how much predictive value we should place in our beloved fancy stats.
A quick primer on thinking about numbers in general: any time you look at the relationship between two variables, there are two main things you care about, the slope and the inherent variance. In this case, if we want to look at the relationship between RS Fenwick and playoff wins, there are four outcomes one could anticipate:
1) Fenwick predicts playoff wins, with low variance. This would indicate that a. good possession teams are more likely to go deep in the playoffs, and b. that they will reliably do so.
2) Fenwick predicts playoff wins, with high variance. This would indicate that a. good possession teams are more likely to go deep, but b. this relationship is not reliable, i.e. there are many other confounding factors involved.
3) RS Fenwick barely predicts playoff wins, or does not predict them at all, with low variance. This would indicate, with a relatively high degree of certainty, that prior evidence of good puck possession is not important once the playoffs begin. However, this is unlikely to ever occur in real life, because it would require the spread of Fenwick values among all of the playoff teams in a given data set to be very small, and this is not what we see when analyzing real data.
4) RS Fenwick barely predicts playoff wins, or does not predict them at all, with high variance. If this happens, all we can say is ‘there is no evidence supporting Regular Season Fenwick as a predictor of playoff performance’, but without much certainty. This is the likely alternative to outcome number 2.
My initial hypothesis was that the data would support outcome 2: that regular season Fenwick would predict postseason success, but that the relationship would be very noisy.
Using the numbers available on nhl.com and behindthenet.ca, I took every team that made the playoffs since the ’07-08 season (not including the current one) and compared its Fenwick Close during the regular season to its number of playoff wins. As a comparison marker, I also looked at each team’s regular season goal differential compared to its playoff wins. Since it appears that one full regular season is about the length of time it takes for goals to have the same or better evaluative power as shot-based metrics, this seems like an interesting comparison to make. I took two statistical approaches. The first was to group all of the teams by the round in which they exited the playoffs (1st, 2nd, conference finals, SC finals, or SC champions) in each year and use a one-way Analysis of Variance (ANOVA) test to see whether regular season Fenwick Close (or goal differential) differed significantly between those groups. The second was to compare regular season Fenwick and goal differential to the total number of postseason games won using linear regression, and look at r² values as an indicator of how much postseason variance each regular season statistic could explain. Finally, I looked at the best-fit linear regression equations to see how many ‘expected playoff victories’ you get for each point of RS Fenwick percentage above 50.0, and for each regular season goal above a zero goal differential. Here is what the numbers show:
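For readers who want to see what the first approach involves, here is a minimal Python sketch of the F statistic behind a one-way ANOVA. It is not the code used for this article: the function name is my own, and the Fenwick values are placeholders invented for illustration, not the nhl.com or behindthenet.ca numbers. Turning F into a p-value would still require an F-distribution table or a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (playoff exit rounds)
    n = len(all_values)    # total number of team-seasons

    # Variation of the group means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

    # Variation of individual teams around their own group's mean.
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        ss_within += sum((x - group_mean) ** 2 for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Placeholder regular season Fenwick Close values grouped by playoff exit.
# These are made-up numbers, NOT the data set analysed in the article.
toy_fenwick_by_exit = {
    "1st round": [49.0, 50.5, 51.0, 52.0, 48.5],
    "2nd round": [50.0, 51.5, 52.5],
    "conference finals": [51.0, 53.0],
    "SC finals": [52.0, 50.5],
    "SC champions": [54.0, 55.5],
}

f_stat = one_way_anova_f(list(toy_fenwick_by_exit.values()))
print(f"F statistic on the toy data: {f_stat:.2f}")
```

A large F means the group averages differ by more than the scatter within each group would suggest; an F near 1 means the groups look alike.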
So what’s going on here? We can see in graphs 1a and 1b that there is an upward trend in both RS Fenwick Close and RS goal differential the farther a team advances in the playoffs. However, Fenwick only seemed to matter when differentiating those who won the Cup from everyone else; it was pretty much useless at differentiating between those who exited in the first three rounds. Regular season goal differential, on the other hand, increased a bit more smoothly and showed more variation between those who exited early and those who went the distance. The ANOVA test showed this statistically: running a one-way ANOVA on the Fenwick Close data showed that the probability of this distribution arising by chance is approximately 18%, whereas for the goal differential data this probability is only 6%, very close to the p = .05 threshold set by most academics for statistical significance*. Graphs 1c and 1d show the same data, but plotted against number of playoff wins instead of playoff round exit. Linear regression shows that regular season Fenwick Close explains 4.4% of the variance in playoff wins. Goal differential, on the other hand, explains 10% of the variance. Neither of these is a very high predictive value by regression standards. However, we can conclude that by these measures, the relationship between RS goal differential and playoff wins is about twice as ‘tight’ as the relationship between RS Fenwick Close and playoff wins.
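To make the ‘percent of variance explained’ figure a little more concrete, here is a rough Python sketch of an ordinary least-squares fit that returns the slope, the intercept and r². The function name and the sample numbers are hypothetical and chosen only to show the mechanics; they are not the article’s data.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys).

    Returns (slope, intercept, r_squared), where r_squared is the share of
    the variance in ys that the straight line accounts for.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Placeholder values for illustration only (not real team data).
toy_fenwick = [48.5, 50.0, 51.5, 53.0, 55.0]
toy_playoff_wins = [2, 7, 1, 9, 4]

slope, intercept, r2 = linear_fit(toy_fenwick, toy_playoff_wins)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r^2={r2:.3f}")
```

With the r² values quoted above, 0.10 divided by 0.044 is roughly 2.3, which is where ‘about twice as tight’ comes from.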
Finally, let’s look at those two regression equations at the bottom. These equations do not indicate the variability within the relationship (you need to look at r² for that), but they will tell you the slope of the relationship. First off, for Fenwick Close, the equation says:
-If you are an exactly even Fenwick team during the regular season (50.0), you will win approximately 5 playoff games on average.
-For every additional three regular season Fenwick points (i.e. 53.0 vs. 50.0), you can expect to win one additional playoff game**.
That is a very, very weak slope. This equation means that if Fenwick has any bearing on playoff outcome, then in order to have an ‘expected average win total’ of 8 or more (i.e. to expect to make the conference finals, before luck, injuries and other random factors are accounted for), you would need three more wins than the five expected of an even team. At three Fenwick points per win, that means a regular season Fenwick Close of at least 59.0, which is pretty rarefied air (Detroit in 2008 territory). In fact, according to this relationship, the 2008 Red Wings are the only team in the last five years that would be expected to make the conference finals based on regular season Fenwick alone.
Now let’s compare this to the equation we get for goal differential:
-If you are an exactly even goal differential team during the regular season (a differential of exactly 0 goals), you can expect to win approximately 3 playoff games on average.
-For every additional +11 to your regular season goal differential, you can expect to win one additional playoff game.
This slope is also weak, but not as bad as it was for Fenwick. Looking at the same measure as before (how many teams in the last 5 years could be expected to make the conference finals based on goal differential alone, before accounting for luck and injuries), we get 8 teams: Detroit 2008, Boston and San Jose 2009, Washington and Chicago 2010, Vancouver 2011, Boston and Pittsburgh 2012.
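The two relationships described above are simple enough to write out as a short Python sketch. The coefficients below are the rounded figures quoted in the text (about 5 wins at 50.0 plus one win per three Fenwick points; about 3 wins at zero plus one win per +11 goals), not the exact fitted values, so the thresholds they produce are approximate. The function names are my own.

```python
def expected_wins_from_fenwick(fenwick_close):
    """Expected playoff wins using the rounded Fenwick Close relationship."""
    return 5.0 + (fenwick_close - 50.0) / 3.0


def expected_wins_from_goal_diff(goal_diff):
    """Expected playoff wins using the rounded goal differential relationship."""
    return 3.0 + goal_diff / 11.0


def fenwick_needed(target_wins):
    """Fenwick Close at which the rounded line reaches target_wins."""
    return 50.0 + 3.0 * (target_wins - 5.0)


def goal_diff_needed(target_wins):
    """Goal differential at which the rounded line reaches target_wins."""
    return 11.0 * (target_wins - 3.0)


# Eight wins is two series victories, i.e. a trip to the conference finals.
print(fenwick_needed(8))                  # 59.0
print(goal_diff_needed(8))                # 55.0
print(expected_wins_from_fenwick(53.0))   # 6.0
```

Because the inputs are rounded, a team sitting right at one of these thresholds could land on either side of it under the actual fitted equation.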
So based on these numbers, Outcome 2 that we hypothesized earlier (Fenwick predicts playoff wins, but with high variance) doesn’t look so great. The data suggest more of an Outcome 4 situation (Fenwick barely predicts playoff wins, or does not predict them at all, with high variance). We see a weak slope (three Fenwick points above 50 get you only one additional playoff win), and a very high variance (r² = .044). Interestingly, goal differential not only gave a slightly stronger slope, but also had less variance.
In order to consider the data from an alternate viewpoint, I converted each team’s raw Fenwick and goal differential numbers into ranks: that is, for each season, each of the 16 teams that made the playoffs was ranked from 1 to 16 for Fenwick Close and also from 1 to 16 for goal differential. I then ran the same tests as above, only using the season ranks instead of the raw numbers. My thinking here was that once you make the playoffs, it might not matter what your absolute numbers during the regular season were, only how close you were to being the best (or worst). I also figured this might help manage some of the outliers present in the data set. This approach does not allow you to generate a meaningful regression line, but you can compare the variance between this approach and the previous one to see if we learn anything***. Here is what came out:
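The rank conversion itself is straightforward. As a small Python sketch (the function name, the team labels and the numbers are placeholders, and ties are simply broken by input order, which may not match how the actual ranks were assigned):

```python
def rank_within_season(values_by_team):
    """Rank teams from 1 (highest value) downward within one season.

    Ties are broken by input order, an arbitrary choice for this sketch.
    """
    ordered = sorted(values_by_team, key=values_by_team.get, reverse=True)
    return {team: position for position, team in enumerate(ordered, start=1)}


# Hypothetical Fenwick Close values keyed by season, then by team label.
toy_seasons = {
    "season 1": {"Team A": 52.0, "Team B": 55.5, "Team C": 49.0},
    "season 2": {"Team A": 50.5, "Team B": 48.0, "Team C": 53.5},
}

# Each season is ranked separately, so every season has its own number 1.
toy_ranks = {
    season: rank_within_season(values) for season, values in toy_seasons.items()
}
print(toy_ranks)
```

A real season would have 16 entries per dictionary rather than three.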
First of all, note that a better ranking now sits lower on the Y-axis rather than higher (since ranking number 1 in Fenwick is better than ranking number 16), which is why the correlations initially appear to be in the opposite direction. But we can quickly see that using the regular season ranks instead of raw values does not help Fenwick’s case; it actually hurts it. In graph 2a, we can see a tiny downward trend near the cup finalists, but other than that, Fenwick ranks are essentially distributed randomly between exits at each round of the playoffs (the ANOVA calculates a 63% chance that this distribution could have occurred randomly). In contrast, the goal differential ranks predict playoff exit more nicely: the odds of this distribution happening by chance are approximately 3%. In graphs 2c and 2d, we see Fenwick and goal differential rank during the regular season compared to number of playoff wins, and get the same picture. While goal differential rank explains the same amount of variance as the raw goal differential values do (r² = .10, or 10%), the Fenwick ranks can only explain 1 percent of the variance. That is essentially no predictive value at all. We cannot generate a meaningful regression line from these data (because they are ranked), but a visual look at the slopes shows pretty clearly that once again, RS goal differential had a more meaningful slope than RS Fenwick Close.
So what does this mean? Well, the results certainly confirm the notion that playoff success is highly variable and quite random, compared to either regular season Fenwick or goal differential. However, although I had anticipated a lot of random noise in the data, I had not anticipated that the underlying slopes would be so weak (or in Fenwick’s case, mathematically pretty close to zero). This suggests (does not prove, but does suggest) that regular season puck possession is not only being overshadowed by other factors in the playoffs (which we have long suspected) but it may not even matter much at all.
An alternate view might be to say that no, possession-based statistics like Fenwick are still just as valuable as we thought, and the (very small) slope of the Fenwick-Playoff Wins relationship still indicates that puck possession is a valuable metric for evaluating team play, but playoffs are just so random that the ‘three Fenwick points for one playoff win’ relationship is the best we’re gonna get. The playoffs are a whole new season, and anything can happen. The recent article by Arik Parnass, and especially Ellen Etchingham’s article that he refers to, capture this view beautifully. As Ellen mentions, the hockey season (and that of other sports as well) is in many ways designed to ultimately reward a string of luck that may well be unsustainable but just has to happen at the exact right time. This would help explain why Fenwick is so good at predicting whether any given team will make the playoffs, but then immediately becomes crappy as soon as you get into best-of-sevens****. At the moment I’m leaning towards this view. But it’ll require some thought, and hopefully some more number crunching. Anyway, here’s to hoping that this prompts some good discussion.
-- Many thanks to Andrew and Arik for thoughts.
* P-values tend to be misunderstood in this sort of context. Consider it this way: suppose the regular season statistic truly had no relationship with playoff outcome, and you ran a computer program that simulated a very large number of NHL seasons and then repeatedly picked 80-team samples of playoff outcomes (as we have done here). The p-value is the share of those samples in which you would expect to see a pattern at least as strong as the one we got, purely by chance. So, if something is significant with p = .03, a pattern that strong would turn up by luck alone in only 3% of samples. In this particular case, a pattern as strong as the one in the Fenwick data would arise by chance about 18% of the time (which is not statistically significant), and one as strong as the goal differential pattern about 6% of the time (which is almost significant, if you set your cutoff at 5% as most academics do).
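One hands-on way to get a feel for a p-value is a permutation test: shuffle the playoff win totals among the teams many times and count how often the shuffled data look at least as strongly related as the real data. To be clear, this is a different technique from the ANOVA and regression p-values reported in the article, and the function names and numbers below are invented purely for illustration.

```python
import random


def r_squared(xs, ys):
    """Share of the variance in ys explained by a straight line in xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


def permutation_p_value(xs, ys, n_shuffles=10000, seed=0):
    """Fraction of random shuffles of ys whose r^2 matches or beats the observed r^2."""
    rng = random.Random(seed)      # fixed seed so the sketch is repeatable
    observed = r_squared(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)      # break any real link between xs and ys
        if r_squared(xs, shuffled) >= observed:
            hits += 1
    return hits / n_shuffles


# Placeholder values for illustration only (not real team data).
toy_goal_diff = [-5, 10, 22, 31, 40, 55, 63, 78]
toy_playoff_wins = [1, 3, 2, 7, 4, 9, 6, 16]

p = permutation_p_value(toy_goal_diff, toy_playoff_wins)
print(f"permutation p-value on the toy data: {p:.4f}")
```

A small result means that shuffled, relationship-free data rarely look as tidy as the real thing.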
** ‘Expect’, of course, is used here in a probabilistic sense and not by any means in a deterministic one. It’s like flipping ten coins and saying ‘we expect five heads’, since we know a priori that the probability of any individual coin coming up as heads is 50%. Here, we can say the following: if the slope of the Fenwick/Playoff Wins relationship that we have observed is in fact true, then each 3 Fenwick points over 50 will yield one additional expected win. Note here that ‘the unpredictable nature of the playoffs’ should not change the slope of this relationship, since each team’s regular season Fenwick is known a priori when the playoffs start. Unless, of course, there is something fundamentally different about the way hockey is played in the playoffs than in the regular season (which is opening a significant can of worms indeed).
*** Those with training in statistics will notice that I’ve done two cringe-worthy things here, which are 1) to run an ANOVA on ranked data, and 2) to use r² as a measure of correlation for this same data set. Strictly speaking, a better way to look at this relationship would be to use a Kruskal-Wallis test instead of the ANOVA, and to use a nonparametric Spearman rank correlation to look at the correlation. But for the purposes of this article, and to be able to more easily compare what we see here to the untransformed data, I’m cheating a little bit. Wherever you are, I humbly ask your forgiveness.
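For the curious, the two rank-based alternatives can be sketched in plain Python as follows. This is an illustration under simplifying assumptions rather than anything run for this article: the Kruskal-Wallis H below omits the usual correction for ties, neither function produces a p-value, and all names and numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1   # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Ordinary Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, WITHOUT the tie correction."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


# Placeholder season ranks and playoff wins (not real team data).
toy_rank = [1, 2, 3, 4, 5, 6]
toy_wins = [9, 16, 3, 5, 1, 2]
print(f"Spearman rho on the toy data: {spearman_rho(toy_rank, toy_wins):.3f}")
```

Since rank 1 is the best rank, a negative rho here would mean that better-ranked teams tended to win more playoff games.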
**** I have to say- it’s a bit ironic that in this current shortened season, which seemed especially vulnerable to fundamentally weak but ‘lucky’ teams riding percentages and making unsustainable runs, three of the four remaining conference finalists were the three top regular season Fenwick teams. (This has not happened before in the data set I’ve analyzed, but almost did in 2011, when San Jose, Tampa Bay, and Vancouver were all conference finalists and top-four Fenwick finishers. However, none of them won in the end, due to the Stanley Cup sadly going un-awarded that year.)