Introduction to Advanced Statistics

I have been invited to write some posts for Maple Leafs Hotstove on advanced statistics / hockey analytics and how they relate to the Maple Leafs.

I figured a good start would be to write up a brief (or maybe not-so-brief) introduction to hockey analytics, which is something I have been intending to do on my own website, HockeyAnalysis.com, but haven’t gotten around to yet. It will get more attention here anyway, so this is as good a time as any to get it written.

This is certainly not an exhaustive list of everything going on in hockey analytics, but it should be a good overview of most of the major terms and concepts.

Individual vs On Ice vs Team Statistics

Before we get into specific advanced statistics, let me mention the overriding concept of individual statistics vs on-ice statistics vs team statistics. Individual statistics are exactly what they sound like: statistics an individual puts up. These are the goals, assists, points and shots that the individual player produces. “Phil Kessel scored 37 goals last year, 27 of which came during five on five even strength play” is an example of individual statistics. While these are called ‘individual statistics,’ they can still be heavily influenced by one’s teammates, and an individual player can also influence his teammates’ individual statistics. As a result, it is often better to look at on-ice statistics.

On-ice advanced statistics are what the player and his teammates produce when the player is on the ice. “When Phil Kessel was on the ice during five on five even strength play, the Leafs scored 68 goals, a goals-for rate of 1.013 goals per 20 minutes of ice time” is an example of on-ice statistics. Both individual and on-ice statistics can be used for individual player evaluation.

Team advanced statistics are just what they sound like: statistics related to team performance. “The Leafs scored 147 goals during five on five even strength play last season” is an example of team statistics. Team statistics are, as you may have guessed, best used in team evaluation.

Possession Statistics: Corsi & Fenwick

This is probably the most commonly used concept in hockey analytics and yet often the most controversial. In many respects possession, Corsi and Fenwick have almost become synonymous with “hockey analytics,” although there is a lot more to hockey analytics than just these two metrics.

Possession is essentially defined as how much a team possesses or controls the puck. It could be represented as time (e.g. the Maple Leafs possessed the puck for 28:37 in last night’s game) or as a percentage (e.g. the Maple Leafs possessed the puck during 45.8% of the play in last night’s game). The idea behind possession is that the more you control the puck, the more opportunity you have to generate scoring chances and the less opportunity your opponents have. This is a good thing and something teams should want to do.

Unfortunately, the NHL doesn’t track possession time, which is where Corsi and Fenwick come in. Corsi and Fenwick are shot-based metrics. Corsi counts shots on goal + shots that missed the net + shots that were blocked. Fenwick is the same but does not count blocked shots. People generally use Corsi and Fenwick as a proxy for possession or puck control. Corsi can be presented as a counting stat (e.g. the Maple Leafs had 52 Corsi events for last night), but is more commonly presented as a percentage. If the Maple Leafs had a 52% Corsi percentage, it would mean that, of all the Corsi events that took place last night by either team, 52% were taken by the Leafs. Corsi percentage is often shortened to Corsi% or, as I frequently use, CF% (Corsi-for percentage). Fenwick percentage is the same but without the shots that were blocked. In theory one could also just look at shots on goal (ignoring both shots that missed the net and shots that were blocked), but doing so is far less common.
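
As a rough sketch of these definitions, here is how the counts and percentages could be computed in Python. The class name, function name and the single-game tallies are all made up for illustration; they are not taken from any stats site or real game.

```python
from dataclasses import dataclass


@dataclass
class ShotCounts:
    """Shot-attempt tallies for one team (hypothetical container)."""
    on_goal: int   # attempts that reached the goalie (saves + goals)
    missed: int    # attempts that missed the net
    blocked: int   # attempts blocked before reaching the net

    @property
    def corsi(self) -> int:
        # Corsi counts every attempt: on goal + missed + blocked
        return self.on_goal + self.missed + self.blocked

    @property
    def fenwick(self) -> int:
        # Fenwick is Corsi without the blocked attempts
        return self.on_goal + self.missed


def share(events_for: int, events_against: int) -> float:
    """Percentage of all events (both teams) credited to the 'for' side."""
    total = events_for + events_against
    if total == 0:
        raise ValueError("no events recorded")
    return 100.0 * events_for / total


# Made-up tallies for one game, for illustration only
leafs = ShotCounts(on_goal=30, missed=12, blocked=10)
opponent = ShotCounts(on_goal=28, missed=11, blocked=9)

cf_pct = share(leafs.corsi, opponent.corsi)        # Corsi-for percentage
ff_pct = share(leafs.fenwick, opponent.fenwick)    # Fenwick-for percentage
print(f"CF% = {cf_pct:.1f}, FF% = {ff_pct:.1f}")
```

With these invented numbers the Leafs have 52 of the 100 total Corsi events, which matches the 52% example in the text.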

These statistics can also be described as rate statistics split between offense and defense. For example, I use CF20 (or CF/20) to indicate Corsi for per 20 minutes of ice time. CA20 would be Corsi against per 20 minutes of ice time. CF/60 and CA/60 are also commonly used to indicate rates per 60 minutes of ice time.

Although some people prefer Corsi over Fenwick or vice versa depending on use, I for the most part consider them interchangeable, as they are extremely highly correlated. They measure essentially the same thing, and using one over the other is unimportant. That said, whenever I raise the idea of dropping one of them from my stats database, there is generally a group of people who want both to remain available.

Note: You may also see Corsi/Fenwick referred to as shot attempts, which is becoming a more user-friendly and intuitive way of describing them.
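
A minimal sketch of the rate calculation, assuming ice time is available in minutes. The function names and the 300-events-in-400-minutes figures are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def per_minutes(events: float, toi_minutes: float, per: float = 20.0) -> float:
    """Events per `per` minutes of ice time (e.g. CF20 when per=20)."""
    if toi_minutes <= 0:
        raise ValueError("ice time must be positive")
    return events * per / toi_minutes


def rescale(rate: float, from_per: float, to_per: float) -> float:
    """Convert a rate between bases, e.g. per 20 minutes to per 60 minutes."""
    return rate * to_per / from_per


# Hypothetical player: 300 Corsi events for in 400 minutes of 5v5 ice time
cf20 = per_minutes(300, 400, per=20)   # Corsi for per 20 minutes
cf60 = rescale(cf20, 20, 60)           # the same rate expressed per 60 minutes
print(cf20, cf60)
```

The per-60 figure is always three times the per-20 figure, so the choice between them is a matter of presentation only.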

Shot Quality, the Percentages (Save and Shooting) and PDO

One of the issues many people (predominantly those who do not support hockey analytics, but myself to some extent as well) have with Corsi and Fenwick is that they measure the number of shot attempts and not the quality of those attempts. There have been countless debates over the extent to which shot quality exists and its relative importance. It is unfortunate, really, because neither side is absolutely right.

Let me first define the notion of shot quality. For me, showing that shot quality is real and significant starts and ends with showing that a player or a team has the ability to maintain elevated shooting percentages. If a team can maintain an elevated shooting percentage year in and year out, shot quality exists. If a player can maintain an elevated shooting percentage year in and year out, shot quality exists. We know that some shots are more difficult to stop than others (e.g. a rebound shot from 8′ is far more difficult to stop than a point shot from 45′), but what we want to know is whether a team or player can take higher quality shots on average. Having shots that are, on average, more difficult to save and thus have a higher chance of resulting in a goal is essentially the definition of shot quality. Now, does this exist at either the player or team level?

Let’s start by looking at players. Over the past 7 seasons, the players with the highest on-ice shooting percentage (i.e. the shooting percentage on all of his team’s shots taken while the player was on the ice) during 5 on 5 play (minimum 4000 minutes of ice time) are Sidney Crosby, Steven Stamkos, Alex Tanguay, Marian Gaborik and Marty St. Louis, all with an on-ice shooting percentage above 10.2%. The five worst players, all with an on-ice shooting percentage below 6%, are Travis Moen, Nate Thompson, Samuel Pahlsson, Shawn Thornton, and Craig Adams. To not believe in shot quality at the player level, one must believe that there is little or no difference between those two groups and that they have achieved their elevated (or suppressed) shooting percentages by luck (good or bad) alone. Anyone who believes that is denying reality. Furthermore, anyone who believes that the difference between shooting 10% and shooting 6% is not significant is denying reality as well (shooting 10% rather than 6% means scoring roughly 67% more goals on the same number of shots). Shot quality exists and is an important consideration in player evaluation.
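
The arithmetic behind the 10% versus 6% comparison can be checked with a few lines of Python. The 500-shot total below is an arbitrary number picked for illustration, and the function names are my own.

```python
def expected_goals(shots: float, shooting_pct: float) -> float:
    """Goals produced by `shots` at a given shooting percentage (in %)."""
    return shots * shooting_pct / 100.0


def relative_increase(high_pct: float, low_pct: float) -> float:
    """How many percent more goals the higher shooting percentage yields
    on the same number of shots."""
    return 100.0 * (high_pct / low_pct - 1.0)


shots = 500                                  # arbitrary, same for both cases
goals_high = expected_goals(shots, 10.0)     # shooting 10%
goals_low = expected_goals(shots, 6.0)       # shooting 6%
extra = relative_increase(10.0, 6.0)         # independent of the shot total
print(goals_high, goals_low, round(extra, 1))
```

The relative gap does not depend on the shot total, which is why it can be quoted as a single percentage.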

At the team level, shot quality is a little more difficult to show because it has generally been difficult for teams to assemble a group of players that can drive shooting percentage up and down the lineup. High shooting percentage players are difficult to acquire and it would be cost prohibitive to assemble a full team of high shooting percentage players (in part because NHL teams have generally paid more for them). That said, there are some teams that have shown an ability to maintain elevated or suppressed shooting percentages. The Leafs are generally one of those teams, as they have finished 6th, 1st, 7th, and 5th in shooting percentage during 5 on 5 play over the past 4 seasons. Conversely, San Jose has finished 19th, 26th, 26th and 25th over the past four seasons. The differences at the team level are less significant than at the player level, though, and thus Corsi is more effective as a team evaluation tool. The analytics bear this out.

It is my belief that players have an ability to influence their team’s save percentage, although I do believe it is much more difficult to quantify this effect. Since in any given year a player only plays in front of a couple of goalies, it is extremely difficult to decouple the player’s impact on save percentage from his goalies’. That said, I believe the ability is there, although less so than the ability to drive shooting percentage. I’ll get into this further later on when I discuss score effects.

PDO is an interesting statistic; it is essentially on-ice save percentage plus on-ice shooting percentage. Across the league the mean would be 100%, but individual teams and players can fluctuate a little from that point. Some people use PDO as an indication of luck or good/bad fortune by looking at how much PDO deviates from 100%, but one must take into consideration the quality of goaltending the player plays in front of and the player’s ability to drive on-ice shooting percentage. A PDO of 102% does not necessarily mean the player is lucky. Gaborik’s PDO over the last 7 seasons is 103.1%, while Crosby’s is 102.8%. So, while PDO can provide some indication of good/bad fortune, one must still consider to what extent the player’s talent or the circumstances factor in.
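
To make the definition concrete, here is a small sketch of the PDO calculation from on-ice goal and shot totals. The function names and the totals are hypothetical and used only to show the mechanics.

```python
def shooting_pct(goals_for: int, shots_for: int) -> float:
    """On-ice shooting percentage: team goals / team shots on goal."""
    return 100.0 * goals_for / shots_for


def save_pct(goals_against: int, shots_against: int) -> float:
    """On-ice save percentage: share of opposing shots on goal stopped."""
    return 100.0 * (1.0 - goals_against / shots_against)


def pdo(goals_for: int, shots_for: int,
        goals_against: int, shots_against: int) -> float:
    """PDO = on-ice shooting percentage + on-ice save percentage."""
    return (shooting_pct(goals_for, shots_for)
            + save_pct(goals_against, shots_against))


# Hypothetical on-ice totals: 50 goals on 500 shots for,
# 40 goals on 500 shots against
pdo_value = pdo(goals_for=50, shots_for=500,
                goals_against=40, shots_against=500)
print(round(pdo_value, 1))
```

Here a 10.0% shooting percentage and a 92.0% save percentage add up to a PDO of 102.0%, the kind of value the text cautions against reading as pure luck.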

Sample Size

This is probably a good time to bring up the issue of sample size because it is an integral reason why people would choose to use Corsi or Fenwick over goals. I just told you that shot quality is real right after telling you that possession/Corsi and Fenwick are important and valuable tools in analytics. Here is the issue: Goals are a relatively infrequent event in hockey. A team will typically score 2-4 goals per game and 200-250 goals per season. They will take between 25 and 35 shots per game and 2200 to 2800 shots per season, and they can have nearly twice as many Corsi events per game. These differences have a major impact on how confident we can be in the conclusions we draw, and that has an impact on how we conduct analytics. Let me explain.

Since goals are so rare, a lucky bounce or two or a “hot streak” can have a huge impact on the results of a statistical analysis. If next season Matt Frattin starts with 5 goals in his first 8 games (he actually had 6 goals in the first 8 games he played in the 2012-13 season), we don’t immediately conclude he has a chance to lead the league at the end of the season with 50 goals. We don’t do this because we know odd things happen over small sample sizes and there is no evidence Frattin has that kind of ability. As the sample size grows, that “hot streak” will get averaged out by a “cold streak,” and eventually Frattin’s statistics will become more representative of his actual ability. Over the next 12 games he may go on a cold streak and only get 1 goal, giving him 6 in his first 20 games, which would put him on a 24 goal pace: still likely very high for Frattin, but much closer to what one might expect than the 50 goal pace he was on early on. It might take another 20 or 30 games or maybe the whole season for Frattin’s statistics to become representative of his true ability.
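
The “pace” figures in this example are simple linear projections. A minimal sketch, assuming an 82-game season (the function name is mine, and the season length is an assumption the text does not state):

```python
SEASON_GAMES = 82  # assumed length of a full NHL regular season


def goal_pace(goals: int, games_played: int,
              season_games: int = SEASON_GAMES) -> float:
    """Project a goal total over a full season at the current scoring rate."""
    if games_played <= 0:
        raise ValueError("need at least one game played")
    return goals * season_games / games_played


hot_start = goal_pace(5, 8)       # 5 goals in the first 8 games
after_cold = goal_pace(6, 20)     # 6 goals through 20 games
print(round(hot_start, 1), round(after_cold, 1))
```

The two projections come out at roughly 51 and 25 goals, in line with the rounded 50 and 24 goal paces mentioned above, and they show how quickly a small-sample projection can swing.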

The significantly greater number of Corsi events that occur means that we generate large sample sizes far more quickly and we get a better representation of talent far more quickly. With 20 games of data, goals are a very poor predictor of future performance but Corsi or Fenwick are far better predictors. This is true at both the team and player level. The greater number of events means we can draw conclusions far more quickly, which is a significant reason why people use Corsi and Fenwick.
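
One rough way to see why more events help is to treat each count as a Poisson variable, in which case the counting noise relative to the total shrinks as one over the square root of the expected count. This Poisson model is my own simplifying assumption for illustration, not something measured here, and the per-game rates below are just midpoints of the ranges quoted earlier (with Corsi taken as a little under twice the shot rate).

```python
import math


def relative_noise(expected_events: float) -> float:
    """Standard deviation as a percentage of the mean for a Poisson count.

    For a Poisson variable the variance equals the mean, so
    sd / mean = 1 / sqrt(mean).
    """
    if expected_events <= 0:
        raise ValueError("expected count must be positive")
    return 100.0 / math.sqrt(expected_events)


# Assumed per-team, per-game rates (illustrative midpoints, not measured data)
PER_GAME = {"goals": 3.0, "shots": 30.0, "corsi": 55.0}
GAMES = 20

noise = {name: relative_noise(rate * GAMES) for name, rate in PER_GAME.items()}
for name, pct in noise.items():
    print(f"{name}: about {pct:.1f}% counting noise after {GAMES} games")
```

Under these assumptions the goal total after 20 games carries about four times the relative counting noise of the Corsi total. This captures only sampling noise; it says nothing about how well either statistic reflects talent.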

Corsi/Fenwick vs Goals

To summarize the sections above we have the following:

  • Corsi/Fenwick have larger sample sizes and thus “stabilize” closer to true talent levels far faster than goals.
  • Shooting percentages do vary significantly across players (and to a lesser extent teams) and players likely have some impact on save percentage. As a result, Corsi/Fenwick will never be able to truly represent a player’s (or to a lesser extent a team’s) true offensive or defensive value (true value should always be measured in terms of ability to boost goals for and suppress goals against, because that is what truly matters in hockey).

As explained above, the people who tend to rail against analytics tend to do so on the idea that not all shots are created equal. The analytics people who fight back tend to argue that, at the team level, possession metrics like Corsi and Fenwick are the better predictor of future performance and thus it is fair to use Corsi and Fenwick as a primary talent evaluator (even if they don’t tell the whole story). Both sides have cases to be made, but as with most disputes the truth is somewhere in the middle. A team or (especially) a player evaluation that doesn’t include some consideration for the percentages is an incomplete and possibly incorrect evaluation, and it is vitally important to be aware of this. Conversely, a team or player evaluation based largely on goal-based statistics that doesn’t include some consideration for sample size related errors and uncertainty is an incomplete and possibly incorrect evaluation (and we need to be aware of this, too).

Game State and Score Effects

To state the obvious, teams score goals at a significantly higher rate on the power play than they do at even strength, and they score goals at a significantly higher rate at even strength than when killing penalties. Typically, hockey analytics is conducted using five on five even strength statistics unless one is analyzing power play or penalty kill play specifically. Unless otherwise specified, the statistics used in advanced statistical analysis, such as Corsi or Fenwick or even goals and goal rates, are five on five even strength statistics and usually exclude goalie pulled situations (Stats.hockeyanalysis.com and ExtraSkater.com exclude goalie pulled situations, while I believe they are included at behindthenet.ca, which can lead to important differences).

The score of the game can have a significant impact on a team’s and a player’s statistics. Generally speaking, a team that has a lead gives up shots at a higher rate than it does in other situations but also has a higher save percentage, indicating that the shots it gives up are of lower quality. Last season, only four teams had a Corsi percentage of 50% or better when leading (Los Angeles at 53.2%, Chicago at 50.7%, New Jersey at 50.1% and Boston at 50.0%). Conversely, only three teams (Maple Leafs at 48.0%, Buffalo at 47.7% and Edmonton at 47.5%) had Corsi percentages below 50% when trailing. Last season, the Maple Leafs’ shooting percentage when leading was 9.03%, but it was 7.87% when trailing. A lot of the time score effects aren’t important, but on some occasions and for some teams they can have an impact on a team’s overall 5v5 statistics; therefore, at times they should be taken into consideration. For an analysis of how score effects can impact a player’s performance, have a look at my analysis of Dion Phaneuf when protecting a lead vs when playing catch up hockey.

I mentioned score effects in the section above in reference to a player’s ability to impact his team’s save percentage. Score effects are evidence of this. Teams and individual players have a worse on-ice save percentage when playing catch up hockey than when protecting a lead. This can only happen if players have the ability to influence save percentage. The theory goes that when players play more aggressive offensive hockey while trying to catch up, they give up more odd-man rushes against, resulting in higher quality shots against and a lower save percentage. The opposite is true when a team plays more conservative defensive hockey when protecting a lead. This, to me, is clear evidence that players can and do influence save percentage at least based on style of play, if not by talent differences.

Usage – Zone Starts, QoC and QoT

Zone starts describe what zone a player starts his shifts in (not including on-the-fly line changes) and refer to the percentage of face offs the player was on the ice for in the offensive, defensive and neutral zones. These can be represented by Ozone%, which is the percentage of the player’s offensive and defensive zone face offs that took place in the offensive zone (i.e. Ozone% = offensive zone starts / (offensive zone starts + defensive zone starts)). My preference is to also consider neutral zone starts by looking at each zone separately. I do this by looking at OZFO% for percentage of face offs in the offensive zone, DZFO% for percentage of face offs in the defensive zone, and NZFO% for percentage of face offs in the neutral zone.
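
The two ways of expressing zone starts can be sketched as follows. The function names and the face off counts are hypothetical and only meant to show how the denominators differ.

```python
def ozone_pct(oz_starts: int, dz_starts: int) -> float:
    """Ozone%: offensive zone share of offensive + defensive zone starts.

    Neutral zone face offs are ignored entirely.
    """
    return 100.0 * oz_starts / (oz_starts + dz_starts)


def zone_faceoff_pcts(oz: int, dz: int, nz: int) -> dict:
    """OZFO%, DZFO% and NZFO%: each zone's share of all on-ice face offs."""
    total = oz + dz + nz
    if total == 0:
        raise ValueError("no face offs recorded")
    return {
        "OZFO%": 100.0 * oz / total,
        "DZFO%": 100.0 * dz / total,
        "NZFO%": 100.0 * nz / total,
    }


# Hypothetical player: 300 offensive, 200 defensive, 500 neutral zone face offs
oz, dz, nz = 300, 200, 500
oz_share = ozone_pct(oz, dz)
by_zone = zone_faceoff_pcts(oz, dz, nz)
print(oz_share, by_zone)
```

The same player shows a 60% Ozone% but only a 30% OZFO%, because half of his face offs were in the neutral zone; looking at the three zones separately keeps that information visible.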

There has been much investigation into the impact of zone starts on a player’s individual statistics. Early research found that zone starts had a significant impact on a player’s overall statistics. While this sentiment is still floating around, it has largely been dismissed and most now accept that zone starts have minimal impact on most players’ overall statistics. Even the most extreme zone start usage will at most have a 1-2 percentage point impact on Corsi% (i.e. a 52% Corsi player with extreme offensive zone start usage is almost certainly still a 50+% Corsi player under neutral/average zone start usage). For most players it is not a significant factor in on-ice performance.

Like zone starts, Quality of Competition (QoC) is largely overstated when it comes to the impact it has on a player’s overall statistics. While a player playing against Sidney Crosby will have worse statistics than when playing against a typical third or fourth liner, the reality is that there are no players so consistently playing against high end players (or low end players) that their statistics will be impacted in a significant way.

The reality is that zone starts and QoC metrics are of minimal importance in player evaluation and are best used solely as an indication of how a player’s coach views his skill set.

Conversely, Quality of Teammates (QoT) can have a significant impact on a player’s statistics. David Clarkson in 2012-13 had a Corsi% of 61.1%. He dropped to 42.3% last season. He did get significantly more defensive zone starts, but a deeper statistical analysis indicates that the main reason for this massive drop in Corsi% is the quality of his teammates. He went from playing on a very good Corsi team with some very good line mates (Patrik Elias and Travis Zajac) to a very poor Corsi team and playing on the second or third line (with Mason Raymond, Nazem Kadri, Jay McClement and Nikolai Kulemin). Quality of teammates is the only usage statistic that really needs significant consideration in player evaluation.

WOWY

That brings us to what I consider the most important concept in hockey analytics: WOWYs. WOWY stands for With Or Without You and looks at how players perform when playing on the ice together and when playing apart from each other. The value of WOWYs is that they tell us who the more important player is and who is making whom better. Nothing shows this better than looking at Tyler Bozak’s statistics. Due to the much smaller sample sizes when looking at WOWYs, I’ll look at the last three seasons of Bozak with and without Phil Kessel.

  • Bozak when playing with Kessel had a 46.8 CF% and 51.9 GF%.
  • Bozak when not playing with Kessel had a 32.9 CF% and 38.2 GF%.
  • Kessel when not playing with Bozak had a 46.1 CF% and 50.0 GF%.

In short, Bozak was terrible when not playing with Kessel while Kessel performed about the same when not playing with Bozak. This is clear evidence that Bozak was dependent on Kessel (along with Lupul and/or van Riemsdyk) and not the other way around.
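
For readers who want to see the mechanics, here is a minimal sketch of how a WOWY split could be computed from a log of shot attempts. The data layout (one record per attempt with the set of skaters on the ice) is an assumption of mine, the players are anonymous placeholders, and the tiny log is invented; this is not how any particular stats site stores or computes its numbers.

```python
from typing import List, NamedTuple, Optional


class Attempt(NamedTuple):
    on_ice: frozenset   # skaters from the team of interest on the ice
    is_for: bool        # True if the attempt was taken by that team


def cf_pct(attempts: List[Attempt]) -> Optional[float]:
    """Corsi-for percentage over a list of attempts (None if empty)."""
    if not attempts:
        return None
    attempts_for = sum(1 for a in attempts if a.is_for)
    return 100.0 * attempts_for / len(attempts)


def wowy(attempts: List[Attempt], a: str, b: str) -> dict:
    """Split CF% into: a and b together, a without b, b without a."""
    together = [x for x in attempts if a in x.on_ice and b in x.on_ice]
    a_alone = [x for x in attempts if a in x.on_ice and b not in x.on_ice]
    b_alone = [x for x in attempts if b in x.on_ice and a not in x.on_ice]
    return {
        "together": cf_pct(together),
        "a_without_b": cf_pct(a_alone),
        "b_without_a": cf_pct(b_alone),
    }


def make(players: set, n_for: int, n_against: int) -> List[Attempt]:
    """Helper that builds a block of invented attempts for one line."""
    on_ice = frozenset(players)
    return [Attempt(on_ice, True)] * n_for + [Attempt(on_ice, False)] * n_against


# Invented log: A and B together, A on another line, B without A
log = (make({"A", "B", "C"}, 5, 5)
       + make({"A", "D"}, 1, 3)
       + make({"B", "C"}, 2, 2))

result = wowy(log, "A", "B")
print(result)
```

In this toy log, A drops from 50% to 25% without B while B stays at 50% without A, the same pattern as the Bozak and Kessel numbers above.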

WOWYs help show who the production drivers are and who are not. To me, that answers the most important question in hockey. You want players who drive results, not those who depend on others to drive results. As an example of WOWY analysis, have a look at my analysis of the Hartnell/Umberger trade.

IPP, IGP and IAP

IPP stands for individual points percentage and is calculated by dividing the number of points a player has produced by the number of goals his team scored while he was on the ice. This statistic tells us who is most involved in the team’s offensive production when they are on the ice. Bringing this back to Kessel and Bozak, Kessel has an IPP of 78.9% over the last three seasons compared to Bozak’s 60.7%. This means that, of all the goals Kessel was on the ice for over the past 3 seasons (during 5 on 5 play), he had a point on 78.9% of them, which is definitely among the league leaders. Conversely, Bozak had a point on only 60.7% of the goals scored while he was on the ice, which is near the bottom of the league. Like WOWYs, IPP can help us determine which players are integral to their team’s offense when they are on the ice and which players are more bystanders when it comes to offensive production.

IGP stands for individual goals percentage and is calculated almost exactly the same as IPP, but instead of using the points the player has we use the goals the player has scored. Phil Kessel has an IGP of 36.8%, which means he has scored 36.8% of all the goals scored when he was on the ice. This is also among the league leaders, although well below Stamkos’ 46.3% IGP.

IAP is the same as IGP except that it uses assists instead of goals and can be used to identify play makers rather than goal scorers. Henrik Sedin leads the league with an IAP of 61.4% over the past three seasons. Joe Thornton is right near the top as well with an IAP of 59.0%. These are probably the two best pure play makers in the league over the last several seasons (they have very low IGPs of 14.3% and 16.7%, respectively).
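
The three percentages share one denominator, which a short sketch makes clear. The function name and the totals below are hypothetical round numbers, not any real player’s statistics.

```python
def individual_pcts(goals: int, assists: int, on_ice_goals_for: int) -> dict:
    """IGP, IAP and IPP as percentages of the goals scored by the
    player's team while he was on the ice."""
    if on_ice_goals_for <= 0:
        raise ValueError("need at least one on-ice goal for")
    return {
        "IGP": 100.0 * goals / on_ice_goals_for,
        "IAP": 100.0 * assists / on_ice_goals_for,
        "IPP": 100.0 * (goals + assists) / on_ice_goals_for,
    }


# Hypothetical player: 30 goals and 40 assists on 100 on-ice goals for
pcts = individual_pcts(goals=30, assists=40, on_ice_goals_for=100)
print(pcts)
```

Because points are goals plus assists, IPP is always the sum of IGP and IAP, so a pure play maker shows up as a high IAP paired with a low IGP.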

Looking Forward

There are a lot of other things happening in the world of hockey analytics, from projects tracking zone entries and exits to some of my recent work on “rush shots.” I hope to explore more of these in my future posts. There is still a lot to learn and explore in hockey analytics and I plan on using this as an outlet for bringing some of it to my fellow Leaf fans. For the time being, I hope that the above acts as a good introduction to hockey analytics and a general description of where hockey analytics stands right now.

I do want my experience here at MapleLeafsHotStove.com to be an interesting and informative one for everyone involved, and that means I want it to be a bi-directional experience. I want to hear from you and want to know what interests you in the area of hockey analytics and how they relate to the Leafs. Whether you are an avid supporter of hockey analytics or a skeptic, I look forward to your feedback. If you have any questions you want answered, or players you want analyzed or want anything clarified, let me know. I have some ideas for future posts, but I am certainly leaving it open for input from all of you as to what I write about; definitely let me know either in the comments below or via e-mail. I will definitely read all of your comments but cannot guarantee that I will respond to them all. I will do my best as time permits and some responses may be in the form of a future post as opposed to an immediate and direct reply.

Hockey analytics can be interesting and informative and that is what I am hoping I can bring to MapleLeafsHotStove.com.

—–

For more information on specific statistics and terminology, you can have a look at my glossary at stats.hockeyanalysis.com or the glossary at ExtraSkater.com. Much of the terminology and the way the statistics are calculated is consistent between these two sites, which makes things easier, though each site offers some statistics that the other does not.

66 comments
Kevin Flynn

"Even the most extreme zone start usage will at most have a 1-2% impact on Corsi% "

The analysis presented in the linked article, which is meant to support this point, seems to have a serious flaw.
"It should be noted that corsi rates are about 7.5% higher during the f45 play (goal rates are ~15% higher!) so I will reduce the f45 corsi rates by 7.5% to account for this and conduct a fair comparison". Isn't this the whole point? If Corsi rates are 7.5% higher within the 45 seconds right after a face-off, this would seem to SUPPORT the notion that Corsi stats for players taking excess defensive or offensive zone starts will be skewed in the direction of that particular zone. Instead David Johnson just adjusts the numbers to account for the skewing, and then claims, surprise, surprise, that zone starts have no impact on Corsi! This really needs an explanation.

Logan in TO

Thanks alot for the intro, this summer has had many posts trying to explain the basics, but this one was perfect. i feel like since theres no standard its hard to remember what people use in their analyses. im probably gonna need to read this article like 10 more times to really get to understanding this better. looking forward to your future posts. 

Hmmm

Nice work and very interesting indeed, but like many of us, I have some questions.

Given your WOWY numbers for A, B, and C, do we have stats for D?

A) Bozak when playing with Kessel had a 46.8 CF% and 51.9 GF%.

B) Bozak when not playing with Kessel had a 32.9 CF% and 38.2 GF%

C) Kessel when not playing with Bozak had a 46.1 CF% and 50.0 GF%

D) Kessel when playing with Bozak had a ? CF% and ? GF%

Reason I ask is I don't see how we can safely conclude that Kessel helps Bozak with just those numbers.


When Bozak was not playing with Kessel, who were his line mates? If he was centering a defensive line say between two 4th liners, it could really drag his numbers down.  If Kadri (one player) was filling in for Bozak, it might not drag Kessel down as much.  It also implies that Kadri is worse than Bozak.

Does this stat take into account for the possibility of synergy, the sum being greater than the parts? For example, putting a line of the 3 best shooters in the NHL might be worse than 2 shooters with one good passer.

Matt Wall

I would really enjoy if you did more posts featuring individual players or line pairings where you find something interesting or if something stands out.


I'm still getting into advanced stats so the more you can simplify it, the better.


Great post! Thanks!

WendelGilmour

Great job, David. Well explained, keep it up. Great to see much of what I see verified statistically. This should be required reading for those who still state Kadri needs to step up.

Also affirms my stance that PP2 is misnamed, it is far more dangerous and productive. PP1 should be last years "PP2" plus Kessel, with Kadri running it, IPP at 94%. Kessel, Kadri, Lupul, Rielly, Franson.


Dubs first task will be to get RC to look at these and actually use the information.

Jay31

Fantastic article here, look forward to your future posts.

Burtonboy

The Raptors have one of the best analytics departments in the NBA. Considered to be out in front of most other teams in the NBA. It's headed up by a guy by the name of Alex Rucker who is actually a consultant but working full time for the Raps. I can see the Leafs heading in this direction rather than, say, hiring someone like Dellow. Not to diminish what Dellow does but the Leafs have experts in the field right in the same building.


http://sports.nationalpost.com/2012/09/22/unplugged-raptors-analytics-consultant-alex-rucker-on-advanced-statistics-part-1/

StanSmith

I'm confused. If Corsi and Fenwick are tracking shot totals for and against, isn't that exactly the information you get, is shot totals, not possession? I understand that you have to possess the puck before shooting it but in some cases you only have to possess it for a split second.  A player, or a team can possess the puck for long lengths of time and never even attempt a shot on net.  In terms of possession I would rather see the actual possession times. Despite the fact that the NHL doesn't track it pretty much every game is televised and can be recorded so the only thing it takes to accumulate pure possession times is the time to do it. Even more importantly, at least team-wise I would like to see zone time stats, regardless of who actually possesses the puck in the zone.

TML__fan

I'm pleased you didn't skim over the debate about shot quality, as this is a topic that sometimes gets downplayed when looking at analytics.  It's unfortunate that shot quality is not tracked in a better manner and statistics readily available.  Although shooting averages may help to justify the importance of shot quality, they fail to fully quantify the influence of an individual player, and certainly when it comes to on-ice influence at both ends of the rink.   Even if a player consistently has an above average SH% (and their on-ice shot quality is high), that could be offset if their on-ice shot quality against is even higher.

Similarly we generally measure a goaltender's contribution by their save percentage, with little or no attempt to consider shot quality.  Yet we often marginalize the numbers based on the number of shots they face.  Can we assume all goaltenders face the same percentage of shot quality regardless of the number of shots?

Certainly hockey analytics are useful, but until they are somehow weighted by shot quality, they are only showing part of the picture. 

Dan39

Hi David, thanks for the article. Can you address this question, might have gotten lost below?


Re Bozak, don't the CF% and IPP% just show that he's the one on the line responsible for defense? With IPP%, it's obvious what I mean. With CF%, it seems to me his with-numbers may indicate more defensive assignments and the without% perhaps when Kessel is deployed on a rush or prolonged offensive zone period to add more offense.

vinoa

So Ph9's zone starts at QoC had nothing to do with his performance down the stretch? I guess there goes that defense.

Edit: Great article, but I've come to expect no less from MLHS. A bit of a long read, could have probably been a 2 parter. I'll have to come back to this one again soon enough.

Armchair GM

A bit of information overload but I'll keep it as a reference.  What comes to mind is the Russian hockey coach Anatoly Tarasov and his approach to posession and shots - shoot only when you have a good opportunity to score... a very high scoring percentage will deflate an opponent's confidence...

MaxwellHowe

Thanks for this.  I was in a "discussion" yesterday about Kadri and Lupul's performance when playing together.  Is there a WOWY available for that combo?

deedrag

Thank you for your time explaining this David. I've been curious about Advanced Stats for awhile and how teams might use them. From these stats, I'm seeing that it is easier to build a team around a good defence first. Seems there are more variables when trying to create scoring than when trying to prevent it. Like the wowy stat. This can answer a lot of questions regarding who really is worth keeping on a team and who isn't. Curious to know Kadri's wowy? I bet it's good.

BigTO

Welcome, David. I've checked your site for years, and find these numbers very interesting.

I re-read this intro to AS just because everyone can use a refresher course. 

It's been interesting to see the League follow in lockstep once a few teams decide there's value in having experts in these numbers on staff. It's been a big summer for hockey's numbers game.

I'm always left wondering: why can't the NHL just begin recording time of possession? It would render much of the Corsi and Fenwick proxies a thing of the past and we could stop debating over them. The EA NHL game does it FFS. Corsi and Fenwick correlate nicely to it when it's been tracked by bloggers, but knowing definitively who has the puck most of the time, by the second, would put an end to so much of the debate. No one will argue with a stat like that, nor does it really leave any room for misinterpretation.

hockeyanalysis

@Kevin Flynn I can see how you can see that as being an issue and it probably wasn't my most ideal post to link to. But, with that said, I want to test if a player who gets significantly more offensive zone starts than defensive zone starts (or vice versa) sees their statistics impacted due to that imbalance and that imbalance alone. To do so while looking at FF20 and FA20 I have to make sure I am comparing apples to apples. If I didn't do this the numbers in the FF20 and FA20 columns would be, on average, 7.5% higher.


Unadjusted %Difference in FF20 = 1 - F45FF20/5v5FF20


Adjusted %Difference in FF20 = 1 - F45FF20*0.925/5v5FF20


Values in the FF% column should be about the same.
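
As a minimal sketch, the two formulas above could be computed like this in Python. The function name and the sample rates are hypothetical and only there for illustration; the 0.925 factor and the F45FF20 and 5v5FF20 inputs are the ones named in the formulas.

```python
def pct_difference(f45_ff20: float, ev_ff20: float, adjustment: float = 1.0) -> float:
    """Return 1 - (f45_ff20 * adjustment) / ev_ff20.

    adjustment=1.0 gives the unadjusted figure; adjustment=0.925 applies
    the 7.5% correction described in the comment above.
    """
    if ev_ff20 == 0:
        raise ValueError("5v5 FF20 must be non-zero")
    return 1 - (f45_ff20 * adjustment) / ev_ff20


# Hypothetical rates, not taken from any real player
f45_ff20 = 15.0   # F45FF20
ev_ff20 = 14.0    # 5v5FF20

unadjusted = pct_difference(f45_ff20, ev_ff20)
adjusted = pct_difference(f45_ff20, ev_ff20, adjustment=0.925)

print(f"Unadjusted %Difference in FF20: {unadjusted:.3f}")
print(f"Adjusted %Difference in FF20:   {adjusted:.3f}")
```

With these made-up inputs the unadjusted difference is negative and the adjusted one is close to zero, which is the apples-to-apples effect the adjustment is meant to produce.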

hockeyanalysis

@Hmmm As Freshmintsauce said, CF% and GF% are "on-ice" stats and represent what the team does when a player is on the ice. Thus when Bozak and Kessel are on the ice together their stats are the same so A=D.

Your comments about Bozak apart from Kessel and Kessel with Kadri could be true. More digging would be necessary to determine that. I may do this in a future post.

You could use WOWY analysis to attempt to investigate synergy. There are definitely lots of examples where two players both put up better stats when together than when apart which could be a sign of synergy.


sepatown

@Hmmm Bozak when playing with Kessel and Kessel playing with Bozak are the same thing :) 

hockeyanalysis

@Bobsyouruncle 

1. Nothing is changing. All your other favourite articles will still get written and you can still talk and read about hockey as you wish.

2. Nobody is forcing you or anyone to read these new analytics articles. Don't worry, I won't be offended.

3. Nothing against the PPP people but I think you will find my analytics posts a little different and hopefully more balanced. Many of my analytics posts in the past have been heavily criticized, if not ostracized, by the PPP crowd. I hope that you will read my posts with an open mind, but if you don't that is fine too.

DeclanK moderator

@Bobsyouruncle

We've been posting about advanced statistics for 2-3 years. We included Fenwick Close in game reviews all year. You haven't been paying attention.

hockeyanalysis

@StanSmith Yes, technically you are correct. Some people have done a comparison between actual possession time (watching games using a stop watch as you suggest) and shot totals and found good correlation. So, in addition to telling you shot attempts, fenwick/corsi can act as a proxy for possession time. It'll never be a perfect replacement but can act as a proxy.


In all honesty though, I am not convinced that if we had possession time it would be more useful than corsi/fenwick. The primary objective in hockey is to outscore the opposition. Corsi/fenwick measures shots, which is really one step back from scoring goals (i.e. you need a shot before you can have a goal). Possession time may be one more step back (i.e. you need the puck before you can get a shot, which must happen before you can get a goal), which may in fact make it a worse metric. I don't think anyone has studied this to be sure either way though.


hockeyanalysis

@Dan39 

IPP is a purely offensive stat: it is his point production relative to the number of goals scored when he is on the ice. Getting more defensive minutes playing with guys with poor offensive abilities won't hurt his IPP. In fact it might help him, as any offense generated is more likely to go through him when he is with defensive-minded line mates (e.g. McClement) than when he is with offensive-minded line mates (e.g. Kessel).


Bozak's CF% could get hurt though by playing defensive minutes particularly when he is put out solely for a defensive face off. If he loses the face off he faces the likelihood of a shot against. If he wins the face off the team clears the zone and he goes to the bench. He was only there to win the face off. In this circumstance he is unfairly punished for being put out in a defense only situation. This would definitely have an impact in his "away from Kessel" stats though probably not enough to account for the difference.

It is arguable whether Bozak is on the line to be the one guy responsible for defense. It may be the case but one would also need to evaluate whether he is effective in that role. There is certainly some doubt as to his effectiveness (exploring this might be well suited to a future post).
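
To make the IPP definition above concrete, here is a minimal sketch in Python. The function name and the sample numbers are hypothetical and are not any player's real totals; the calculation is just individual points divided by on-ice goals for, expressed as a percentage.

```python
from typing import Optional


def ipp(points: int, on_ice_goals_for: int) -> Optional[float]:
    """Individual Points Percentage: the share of on-ice goals for
    on which the player recorded a point (goal or assist)."""
    if on_ice_goals_for == 0:
        # No on-ice goals, so the percentage is undefined
        return None
    return 100.0 * points / on_ice_goals_for


# Hypothetical example: a player with 30 points while on the ice for 50 goals
example = ipp(30, 50)
print(f"IPP: {example:.1f}%")
```

Because the denominator is only the goals scored while the player is on the ice, weaker offensive line mates shrink the denominator as well as the numerator, which is why defensive minutes alone would not be expected to drag IPP down.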


MaxwellHowe

I just looked at Stats.HockeyAnalysis.com.  The way I read it, Kadri is slightly better without Lupul.  Lupul is a little better when he is with Kadri.  They have a CF% of 44.2 together.  Kadri has a 46.6 without Lupul, Lupul has a  40.6 without Kadri.  
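
A rough sketch of how a WOWY comparison like this one could be tabulated in Python. The three CF% values are the ones quoted in the comment above; the function names are hypothetical, and `cf_pct` simply shows the usual CF% calculation from shot attempt counts.

```python
def cf_pct(corsi_for: int, corsi_against: int) -> float:
    """CF% = shot attempts for / (shot attempts for + against), as a percentage."""
    total = corsi_for + corsi_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100.0 * corsi_for / total


def wowy_delta(together: float, apart: float) -> float:
    """Percentage-point change in a player's CF% when playing with the
    partner compared with playing without him."""
    return round(together - apart, 1)


# CF% values quoted in the comment above
together = 44.2
kadri_without_lupul = 46.6
lupul_without_kadri = 40.6

kadri_delta = wowy_delta(together, kadri_without_lupul)
lupul_delta = wowy_delta(together, lupul_without_kadri)

print(f"Kadri with Lupul vs without: {kadri_delta:+.1f} points")
print(f"Lupul with Kadri vs without: {lupul_delta:+.1f} points")
```

A negative delta means the player's CF% is lower with the partner than without him. This says nothing about sample size or who else was on the ice, so it is only a starting point.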

wiski

@Gilbey93 They are testing a few data collection systems this year: SportVU and Sportvision.

Tim Horton

@Gilbey93 Most of the advanced stats stem from a combination of pretty simple stats like time on ice, shots or missed shots. 

hockeyanalysis

@deedrag I'll do some WOWY analysis of Leaf players in future posts but yes, Kadri generally does pretty well, especially relative to other Leaf centers.

hockeyanalysis

@dlb Mad There has been difficulty in identifying shot location as a significant contributing factor to shot quality though it may be a minor factor (i.e. it can't be reliably shown that some players have an ability to consistently have a higher percentage of shots taken from more dangerous areas when they are on the ice).


For me, I think more significant factors are the quality of the shooter and the quality of the play makers that set up the shot. Stamkos has a good deceptive and accurate shot and that is a significant factor in his shooting percentage. He also plays with really skilled teammates who can set him up perfectly for those shots. The quality of the shooters and the puck skills of the guys on the ice drive shooting percentage more than ability to generate more shots from dangerous areas.

billtech

@hockeyanalysis my concern would also be who was Bozak playing with? My other concern would be sample size... how often has Bozak not been playing with Kessel? And did the team play a different 'game' when Bozak was out? My take on WOWY... who does the coach put out?

Hmmm

@sepatown  No, what was Kessel's CF% and GF%, not Bozak's, when playing together?

StanSmith

@hockeyanalysis @Bobsyouruncle Just reading this article tells me you have a different take on analytics than PPP.  They talk as if their numbers are absolutes and if you disagree with them you are a moron.  You obviously feel they are open to interpretation and discussion.

StanSmith

@hockeyanalysis @StanSmith The goal of the game is to score goals while preventing the other team from scoring.  In terms of possession as long as you have it, regardless of how many shots you get, the other team doesn't.  Similarly with zone time, regardless of who possesses the puck.

hockeyanalysis

@CanuckUKinToronto @StanSmith This will be the future. It's just a matter of when. Could be in the next year or two as this is already being done in basketball and several companies are trying to sell the technology to the NHL.

Dan39

@hockeyanalysis @Dan39 No, I'm saying on a line with Kessel and JVR, if his responsibility was defense for the line, presumably his IPP% would be expected to be lower than that of his team mates. Does that make sense? 


As to the WOWYs, I think you explained why his CF% would be lower without Kessel and I explained why Kessel's would be higher without him. Of course, Kessel is also a great player. 


I don't know for certain, it just seems like you can explain at least part of why Bozak's advanced stats are so terrible in these ways. I don't know to what extent, though. 


maximus_asinus

@MaxwellHowe but what does that mean? I interpret that as Kadri is more likely to pass off to Lupul when they're together, and when they're not together Lupul is "looked off" because there is a better shooter (Kessel) on the ice.

JWaterdrager

Shots are tracked by those guys too. I could see a difference there too.

Dan39

@hockeyanalysis @deedrag Re Bozak, don't the CF% and IPP% just show that he's the one on the line responsible for defense? With IPP%, it's obvious what I mean. With CF%, it seems to me his with-numbers may indicate more defensive assignments and the without% perhaps when Kessel is deployed on a rush or prolonged offensive zone period to add more offense.

hockeyanalysis

@dlb Mad Both Stamkos and Kessel are high end players but it is probably not too difficult to show that Stamkos is better. I also believe that teams are far better with an elite center than an elite winger which I think knocks Kessel down a notch when discussing overall value. Center is just a more important position overall.

hockeyanalysis

@billtech Your concerns are valid and a complete player analysis would look at all these factors. For example, if you were evaluating Bozak you wouldn't just look at how he does with and apart from Kessel and Kessel's stats with and apart from Bozak. You'd also want to look at Bozak's stats with and apart from McClement and McClement's stats with and apart from Bozak, as well as every other player Bozak has played with. You start looking for trends. Does Bozak consistently make his teammates better or worse, or are there some that perform better and others that perform worse?


I'll do some of these player evaluation analyses in the future so you can see how one might go about doing it but unfortunately hockey analytics is hard. There is no one statistic that provides all the answers.

hockeyanalysis

@StanSmith @hockeyanalysis Right. In that sense possession is definitely more of a defensive stat. If you can hold on to the puck and never give it up, you will never give up a goal.


The nature of offense, though, is that it requires you to give up the puck. You are required to shoot the puck, which opens the door for the other team to regain control.


hockeyanalysis

@Dan39 

"No, I'm saying on a line with Kessel and JVR, if his responsibility was defense for the line, presumably his IPP% would be expected to be lower than that of his team mates. Does that make sense? "

Yes, that could be the case if he were holding back and playing defense first. If this were the case, one would expect Kessel's defensive stats to be better with Bozak than without. I am not sure this is the case. I could look into it in a future post.

Maxpower417

@JWaterdrager Shot attempts (whether they miss, are blocked or not), who is on the ice and where faceoffs start are some of the easiest things to track and, not surprisingly, some of the most reliable and robust statistics from arena to arena.

vinoa

@hockeyanalysis Is Stamkos a traditional center though? Isn't he more of a winger if he's waiting for the set up?

ToasterStrudel

@hockeyanalysis @Dan39

Bozak's the defensive conscience of his line in the ozone, and the main guy likely to take dzone faceoffs at crucial points in the game, most likely without his normal wingers. 

I know this is old so no need to reply - just figuring stuff out.



Burtonboy

@hockeyanalysis @Burtonboy I heard about Sports Vu well over a yr. ago and have kinda kept an eye on it since thinking it might show up in the Hockey world in due time. Read a couple of articles about what the Raps were doing with this technology and found it to be extremely interesting.