Embracing numbers or: Why I learned to watch the game AND use statistics

Anyone who follows me on Twitter or who has read any of my pieces here at MLHS knows that I enjoy using possession statistics alongside production statistics to examine and evaluate players. After recent events, like Lupul’s tweets and Alec’s interview with Greg Cronin, that have stirred up the tension between those who use these statistics and those who don’t, I thought I’d dig into why the use of statistics should be embraced.

A few common responses that I’m seeing to the use of statistics in general, but especially possession statistics, are “watch the game, statistics don’t tell you everything,” or “watching the game tells me everything I need to know” or even “statistics are basically magic.”

The first response isn’t incorrect. In fact, there are very few (if any) bloggers that would tell you statistics are all-encompassing measures of a player. Many bloggers (and fans) use a combination of video evidence and collected statistics to evaluate and compare players, teams and coaches. Disgraced journalist Jonah Lehrer wrote an article about the problems with sabermetrics in baseball (mainly that people are fixating on these numbers as the be-all and end-all) and Bill Petti wrote a critical response that has a great (and relevant) quote:

“… Lehrer’s main argument shouldn’t be that teams are assembling bad teams because of a narrow-minded focus on things they can quantify. The argument should be that teams that don’t think deeply about what are the right metrics and how much variance they account for in player achievement will fail just as much as those teams that used to generally ignore analytical approaches to the game.

Data and statistics are not to blame for bad decisions–their misapplication is.”

What I love about this quote is that it echoes something that I’ve said in response to Cronin’s interview with Alec: just because an NHL team is using certain statistics to make decisions doesn’t mean that method is the right way to proceed. In other words, I’m warning people about confusing what is with what should be.

The third response seems pretty silly to me and since there really aren’t that many people forwarding that view I’ll leave it at that.

The second response isn’t necessarily incorrect, but it’s highly unlikely and that’s really what I want to focus on in this piece.

Why is it highly unlikely that watching the game tells you everything you need to know? It’s a great question and there are multiple challenges.

Let’s start with the old adage “practice makes perfect,” and apply it to responses about how much hockey you’ve watched. Well, as it turns out, that adage should really go “proper/perfect practice makes perfect.” The idea that I’m trying to convey here is that just watching the games isn’t enough to make you excel at understanding hockey. You’ll need to approach watching hockey in a systematic way that pushes you to your limits and that aims to improve your ability to understand hockey. This, in part, brings us back to the idea of needing to know what we should be focusing on and which events are important to evaluating players, coaches and teams. Cue the next challenge.

Let’s say that you have a systematic approach to watching hockey and now you want to know what events you need to focus on. People are influenced by conspicuity (how eye catching something is), and so when deciding what to focus on, that’s likely where you’ll start. This is where you’ll run into the challenge of inattentional blindness. Inattentional blindness occurs when you don’t notice an unexpected event in your field of vision because you’re focusing on another event.

Why does this matter when watching hockey? If you’re focusing on the conspicuous events (possibilities: goals, hits, big saves) then you’re less likely to notice less conspicuous events (possibilities: set plays, zone exits/entries, pre-shot positioning) and you’re more likely to see what you expect to see (perhaps why Cronin thought he saw Grabovski turn the puck over in game 7). Inattentional blindness shows that even if the event is conspicuous, like a player entering the attacking zone, if you don’t know to look for that event and you’re focused on something else, like a hit that just happened, you’d likely miss the zone entry. Good news though: once you know to look for certain things they become easier to spot. The problem is that you can’t focus on everything.

Now, let’s assume that you have a systematic approach to watching hockey and you know what you want to focus on. After a game, or a season, is over and you go to evaluate a player (or coach or team), you first have to recall everything you’ve seen, and despite your system and focus, your mind will play tricks on you. Most people think the brain works like a video camera that stores your memories perfectly, but the reality is that your memory is malleable, stored in pieces, and reconstructed every time you call on it. When you listen to the commentators describe the game, the way you remember the events they are talking about is influenced by the terminology used. For example, stronger phrasing, like “bulldozed” over “hit”, can lead to altered memories to support the phrasing (meaning you’ll remember the hit as bigger than it was).

On top of that, negative memories are more accurate than positive memories and studies lead to the suggestion that “individuals in a negative mood process information in an analytical and detailed fashion, whereas people in a positive mood rely on broader schematic or thematic information and ignore the details.” (Note: This is probably an important piece of information to keep in mind for Leafs fans as we head into next season after making the playoffs for the first time since 2004.)

Despite your system and your focus, not only is your memory unreliable and malleable but your mind is also subject to cognitive biases and heuristics (mental short-cuts that make problem solving simpler) that impact your judgment and decision-making. There are many heuristics and cognitive biases but I’m going to narrow the focus to just a few big ones (though I strongly encourage you to look at as many as you can).

Let’s take a look at the availability heuristic first. This heuristic occurs when you judge the probability of an event based on the ease with which you can call a similar event to mind. So, you see a player turn the puck over during the season and it leads to a goal against (and because that may be a negative event, you may be more likely to remember it). When you try to recall how this player is defensively and that image jumps into your head, you’re more likely to think that it’s a frequent occurrence even though it may not be. It’s worth noting that the images called to mind don’t have to be memories; imagined scenarios work just as well.

Next on the list is the representativeness heuristic. This heuristic states that when slotting objects into classes, the judged probability that an object belongs to a class depends on the degree to which that object is representative of the stereotype of that class. Very wordy, but a simple example is the stereotype of defensive defensemen being big, mean, crease-clearing machines and then asking you whether a 5’10” defenseman who looks honest and plays cleanly is a defensive defenseman.

Confirmation bias is a tendency to favour information that supports your preconceived notion, irrespective of the validity of that information. This leads to the selective use of memory and information gathering skills. For example, you are more likely to remember or notice giveaways by a player that you think (rightly or wrongly) is a turnover machine.

The halo effect is another bias that can have a large impact on decision-making. This bias occurs when the perception of one element of an object spills over into other elements of that object. This can result in the evaluation of a player who exhibits a generally likeable aspect to their game, like blocking shots (*cough* Tim Brent *cough*) or physical play, being more favourable than would otherwise be the case. The opposite bias (less favourable evaluation due to spillovers from a negative aspect) is the horn effect.

So, even if you are observing the game properly and know what to look for, your observation and interpretation of the game and the results are inherently biased and your memory unreliable. Hopefully, you are beginning to see the potential interaction between some of these phenomena and why they complicate observation, recollection and evaluation.

What’s the good news? Using statistics can help overcome some of these challenges (though they also present some different ones) by attempting to provide objective accounts of transpired events that can act as fact checking measures on what we perceive, observe and remember.
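To make “possession statistics” a bit more concrete, here is a minimal sketch of the kind of tally that can act as a fact check on what you remember. It assumes the commonly used definitions: Corsi counts all shot attempts (on goal, missed and blocked), while Fenwick counts only unblocked attempts. The class and function names are my own inventions for illustration and the numbers are made up, not taken from any real game.

```python
from dataclasses import dataclass


@dataclass
class ShotCounts:
    """Shot-attempt tallies for one team (hypothetical container)."""
    on_goal: int   # attempts that reached the net (saves plus goals)
    missed: int    # attempts that missed the net
    blocked: int   # attempts blocked by the opposing team

    def corsi(self) -> int:
        # Corsi: every shot attempt, blocked or not.
        return self.on_goal + self.missed + self.blocked

    def fenwick(self) -> int:
        # Fenwick: unblocked shot attempts only.
        return self.on_goal + self.missed


def share(for_count: int, against_count: int) -> float:
    """Percentage of the combined attempts that belong to 'for'."""
    total = for_count + against_count
    if total == 0:
        return 50.0  # no events either way: call it even
    return 100.0 * for_count / total


# Made-up tallies for two teams in one game.
team_a = ShotCounts(on_goal=25, missed=10, blocked=15)
team_b = ShotCounts(on_goal=30, missed=12, blocked=8)

corsi_pct = share(team_a.corsi(), team_b.corsi())
fenwick_pct = share(team_a.fenwick(), team_b.fenwick())

print(f"Team A Corsi share:   {corsi_pct:.1f}%")
print(f"Team A Fenwick share: {fenwick_pct:.1f}%")
```

In this made-up game the two teams attempt the same number of shots overall, yet Team A has the smaller share of unblocked attempts because more of its attempts were blocked. That is the sort of detail that is easy to miss or misremember when you are relying on recall alone.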

I’ll leave this monstrosity of an article (though there is a lot unsaid and I wish I could have touched on Bayesian thinking) with a question: is it statistics that need to be confirmed by observation, or the other way around?

266 comments
TML__fan

It certainly makes sense to use statistics to help verify/quantify what you're seeing (or not seeing) on the ice. You have to be somewhat foolish to ignore one or the other. Unfortunately, one of the most meaningful statistics is derived/inferred from a variety of other collected data. I'm speaking of "quality scoring chances". We nicely collect data such as shots on goal, blocked shots, even zones from which shots are taken. But what we fail to quantify, other than through assumptions, is what a true scoring chance is. For example, a missed shot with a wide open net is treated as less of a scoring chance than a shot from the point through traffic that happens to be deflected into the net off a skate. Is anyone actually maintaining stats of "quality scoring chances"? Theoretically, if a team does well statistically with puck possession and offensive zone time, they should do well with scoring chances. But we still lack the detail to differentiate what % of the shots taken are "quality scoring chances".

Dan39

Taylor, what I don't understand is why you guys are so opposed to using scoring chances when you are so willing to use other stats. Cronin said that's the metric the Leafs use, why not at least try that for a season? 

If you counted the chances every game for the season surely the errors in subjectivity would balance out over the large sample size (e.g. ~20 x 82)?

If you listen to Cronin's interview, he said that after game 1 the Leafs won the possession game yet the 'advanced stats' you guys use suggest the opposite. If you use Cam's scoring chance count, Cronin is actually right. From game 2-7, the Leafs out chanced the Bruins.

If you extrapolate from that, i.e., that the Leafs generate more scoring chances than the 'advanced stats' suggest is possible, then surely the suggestion is that the Leafs may be able to defy the correlation of possession to winning next season. 

I think you guys who are advocating for a new approach should be willing to at least consider the possibility that this approach works. After all, it's employed by top level professional coaches...

mORRganRielly

I just realized.  You can see me (or the back of my head) in the article's header picture.

EricWarren

Fantastic article and great lesson in perception that can easily be extended well beyond the scope of a hockey game! Well done!

Jordan29

Jack Edwards is the biggest idiot I've ever seen in my life. I'm watching some of his shit now. What an idiot.

Xxxxxnew

Again, with the stats, I appreciate the hard work put in on these but for the most part, particularly because they take such long and detailed explanation of how they are arrived at, they make me feel like I'm in a high school math class I'm sorry I signed up for. What interesting fact I actually learn isn't worth the reading time invested to find the tiny nugget. Hockey's not like baseball, where the ball is in play only an average of 7 minutes a game and announcers need to fill 2 1/2 hours of dead air.

mORRganRielly

Really nice work, Taylor.  

I've gotten a few questions over the last month about my take on advanced statistics.  Cam Charron actually wanted to just call it "statistics" when I feel some who apply them exclusively to hockey blogs and articles do so disingenuously (i.e., they foist a concept with less than 35% correlation to winning on others because it's essentially a giant open-source project anyone can partake in).  

Basically, confirmation bias at its finest.

I actually asked @mlse whether NHL teams use Corsi or Fenwick and he refused to give me a straight answer.  Why drum up sarcasm and trolling if you don't know exactly what NHL teams track?  Baseball has supported Sabermetrics for more than a decade now -- on-base percentage, slugging percentage, and fielding are three guidelines in baseball's advanced statistics community.  I have observed (but not tested) that Sabermetrics has caused a rise in strike-outs, no-hitters, and relief pitching effectiveness.  

Clearly, advanced statistics CAN be useful when applied responsibly.  It took several decades before Sabermetrics and its variants became mainstream.  This goes all the way back to the 1980s with the Oakland Athletics and Billy Martin.  

Several decades of work is what we have now in Sabermetrics.  And baseball is much better for it.

So overall, my dislike for the two main components of statistics on social media roots with the persistently negative personalities behind them.  In addition, I don't find them useful at the individual level.  I dislike any attempts to ignore coaching, system, and player decision-making -- otherwise, we'd have assumed that Grabovski is god and MacArthur is the most underrated player in hockey.  

That said, I find Corsi and Fenwick to be fascinating.  I don't think anyone can dispute that possession is a necessity in order to win games.  A shot has a chance of turning into a goal and many shots means more goals.  So there IS a correlation there.  You need to have the puck to score and you need to score to win hockey games.  

In the years I have been a hockey fan, I have always assumed that hitting and blocking shots means that the team is sacrificing body and showing heart to win hockey games.  Today, I'm learning that those two resultants and / or events might not correlate to winning at all -- since hitting and blocking shots means you don't have the puck, you can't shoot the puck.  Higher blocked shots and hits means you're consistently defending rather than attacking.  I have been working on a Reimer piece for awhile now and I strongly suspect his lower than league average save percentage from outside the home-plate area is a result of his team's persistent shot-blocking.

That said, Fenwick already accounts for blocked shots as a skill, so we're getting somewhere.  

Fenwick has shown that there's an exceptionally strong correlation to winning in the playoffs -- especially in the finals.  Technically, the correlation is roughly 55%, but I believe 9 of the last 10 teams that have made the finals had a top-five Fenwick Close.  So 90% correlation is pretty strong, while winning it all is roughly half that.  Either way, the correlation is there and we're better off knowing why possession matters at the playoff level. 

I care about learning.  I want to understand the activities and results that I can't observe in real-time.  I need to know why things are happening negatively or positively.  I believe traditional scouting is as integral to team success as following the statistic trail is -- if not more.  Scouting begets statistics, statistics begets scouting.  Both should be synchronized with each other -- otherwise, scouting can't function linearly with league trends.  

So while I dislike the common personalities that hide behind Corsi and Fenwick, I appreciate the work that comes with them because it means that there's context where context didn't exist before.  Without Corsi and Fenwick, we wouldn't have known that Phaneuf has taken on the league's STIFFEST competition for a defenseman -- all while producing points at an elite level.  Without Corsi and Fenwick, we wouldn't have known just how amazing Kulemin is as a shutdown winger.  Without Corsi and Fenwick, I wouldn't have laughed so hard watching Oiler fans freak out over Eberle's 'unexpected' regression -- he shot far above the league average's on-ice shooting percentage two seasons ago.  

Statistics in hockey has a long way to go.  There's no fluidity or isolation.  Events are happening several times a second.  It's almost impossible to compartmentalize events in hockey like in baseball where you can just focus on pitch by pitch.  

So I say treat Corsi and Fenwick lightly, but constantly monitor its growth.  That's how I'm doing it.  Although I think I'll always be a dick.

DeclanK moderator

I know it's pretty polarizing with regards to fancy stats around here versus other sites, which are completely focused on them, but it's a good idea to get your head around them now, because some of them will become common place and used frequently in telecasts, MSM reporting, etc. 

I like most of them and they can be tremendously useful; I don't like some of the evangelists that are currently trying to use them to tout a superior knowledge of the game. They're here to stay, and they will get more in depth and more refined. Out of all the major sports, hockey is a laggard with statistics due to its dynamic nature, the speed, the infinite number of variables and the fact that the game is much, much harder to play than all other sports (IMHO).

B_Leaf

So I ask you Leafers:

Where do you think our team stands going into this season? Are we contenders? Are we favorites to reach the final four? Are we a dark horse? How do you guys view this team? 

Dan39

@TML__fan No, that's not true. Cronin specifically said that the Leafs track scoring chances. The Leafs therefore _can_ track scoring chances. 

It's not surprising to me at all - I imagine they simply have the same man doing it based on the same criteria every game.

taylor_wright

@Dan39 I'm not particularly opposed to using scoring chance data. I don't have that data to examine and am preoccupied with a zone exit tracking project at the moment.

Based on the research that has currently been done though, scoring chances don't readily contribute a whole lot more than the much more easily available shot data. Not only that, but scoring chance data appears to have less stability on an annual basis. 

I'm not saying that the approach taken by the Leafs could never work out, just that it would be very unlikely to. There really hasn't been a team that has been able to consistently produce a high sh%, which is apparently what Cronin means to do through quality possession. If the Leafs do continue to have excellent luck via very high sh% and sv%, that doesn't mean that it's likely to continue, and the available data suggests that it isn't. 
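A rough sketch of the luck argument, under loudly stated assumptions: every team below is given exactly the same "true" shooting percentage, and each shot is an independent coin flip. The 8% rate, the 1,400 shots per season and the 30 teams are round numbers picked for illustration, not measured values, and the function name is hypothetical.

```python
import random


def simulate_season_sh_pct(true_sh_pct: float, shots: int, rng: random.Random) -> float:
    """Observed sh% for one simulated season with a fixed true rate."""
    goals = sum(1 for _ in range(shots) if rng.random() < true_sh_pct)
    return goals / shots


rng = random.Random(2013)   # fixed seed so the run is repeatable

TRUE_SH_PCT = 0.08          # assumed identical "talent" for every team
SHOTS_PER_SEASON = 1400     # assumed round number of shots
N_TEAMS = 30                # assumed league size

observed = [
    simulate_season_sh_pct(TRUE_SH_PCT, SHOTS_PER_SEASON, rng)
    for _ in range(N_TEAMS)
]

best, worst = max(observed), min(observed)
print(f"True sh% for every team: {TRUE_SH_PCT:.1%}")
print(f"Best observed sh%:       {best:.1%}")
print(f"Worst observed sh%:      {worst:.1%}")
```

Even though no team here is any better at shooting than another, the best and worst observed percentages differ. In a toy model like this, the team at the top one season has no reason to stay there the next, which is the sense in which a high sh% can be driven by luck.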

I would hope that management isn't so myopic that their strategy is to bank on riding high (and very likely unsustainable) percentages to another playoff berth. However, recent work suggests that management from all teams is more likely to make roster decisions and ice time allocations based on current sh%, despite its instability (which could be driven purely by luck). 

As an aside, by Cam's scoring chance data, the Leafs had more chances than Boston in games 2, 4, and 7 while tying game 6. 

Again, I'm not against scoring chance data but as of right now it doesn't look like a panacea for this team or for hockey analytics. 

Uncle Otis

@Jordan29 

You didn't know about him eh?

Never called a fight a Bruin lost either

ingy56

@Jordan29 He actually compared Campbell's broken leg to the soldiers on D-Day. 

Dan39

@ingy56  

Heh I just saw Gunnar walking down Bay Street, he looked stoned/out of it like he normally does. And the other thing: he's too damned skinny. My prediction is he just gets injured again.

NotInsane

@mORRganRielly @mlse 

This is a very long but well thought out post.  I'm not sure where to begin.   You're right to observe them because more information can only help.  I don't think Corsi and Fenwick were required to know Phaneuf played against the toughest opposition.  His ATOI would hint toward that and so would counting the minutes he plays against the oppositions best players.  

I worry that these shot stats might work for teams that play a "shoot, shoot, shoot" style but that's not how the Leafs play.  So I wonder if you can compare these across different coaching styles.

DeclanK moderator

@Jordan29 HAHA - I downloaded the entire series from NESN. SOOOOO funny to hear how much of a homer he is. Bowen would be the same, I suppose.

Xxxxxnew

I appreciate the hard work put in on these but for the most part, particularly because they take such long and detailed explanation of how they are arrived at, they make me feel like I'm in a high school math class I'm sorry I signed up for. The level of dissection needed to get to the point would kill most broadcasts.

Uncle Otis

@DeclanK

And your last sentence explains why the hipster geek inspired movement to cram them down our throats is folly..

You wont see too many NHL coaches make line-up decisions based on them,but the stat geeks know better.

lalalalalala

B_Leaf

@DeclanK 

Hockey leaves so much open to interpretation. Each person interprets a hit differently. And the value of a hit is different depending on several factors. Same can be said for scoring chances...a shot from 25 feet is a scoring chance but one of them the goalie has a clear look and is set, another has JVR parked in front.  So much with hockey is more art than science. 

NotInsane

@DeclanK 

Depends on the stat, DeclanK.  Defensive zone faceoff start percentage is useful and I enjoy using that to better understand how a player is utilized by his coach.  Corsi and Fenwick I don't see becoming all the rage.  They're like offensive zone possession time.  You rarely see that outside of NHL13.

LeafsForLife

@DeclanK Evangelists? I would add in Fanatics! Add fanatical to the list of reasons i wont try to grasp them.

Xxxxxnew

If everything falls into place I'd say final four isn't unrealistic. So much goes into it besides being a decent team. Injuries. Goaltending. Conditioning. Breaks. Getting on a roll. Home ice.

wendelsway1

@B_Leaf I think our additions make us a tougher team to play against and we will give any team a go on any given night.  I see us as a playoff team and once you're in anything can happen.  :)

Mind Bomb

@B_Leaf  Dude plan the parade, we are winning the Cup this year !


LeafsForLife

@B_Leaf I think we need to wait till near the beginning of the season to really call that more accurately as there may still be trades etc. Not to mention we could have some real surprises in camp that jump out at us. But I will say B_Leaf, the time is long overdue to take the bag off your head.

TML__fan

@Dan39 @TML__fan Yes, I know many teams track the scoring chances that they get and that they take against them.  What I'm saying is that sites that maintain data and generate statistics, generally lack any data/stats on "quality scoring chances".  They think of it as noise.  I'm not convinced it is noise, nor do many NHL teams who track these stats themselves.

Dan39

It will certainly be interesting to watch this season. Do you know if other teams have attempted the 'quality possession' / advanced stats possession-poor strategy that the Leafs seem to be employing?

NotInsane

@mORRganRielly 

Everyone is moving to the new post.  You should re-post this... but yes it's long!  haha, but I read through it.

Xxxxxnew

Also, a scoring chance from 40 feet is relevant to Kessel because he can actually score from there. A scoring chance from 40 feet by Kulie probably isn't a scoring chance at all.

DeclanK moderator

@NotInsane @DeclanK Absolutely. I think some are only useful for navel gazing exercises. Others are fantastic.

LeafsForLife

Yes I can see what you're saying but I don't think I'm changing my mind. It's more fun to talk Leafs than it is to do so much math. Don't get me wrong, I still read what was written but I didn't try to grasp the concept. Don't care for it. Probably never will.

DeclanK moderator

@LeafsForLife @DeclanK Not all. Some do a fantastic job of not being patronizing and actually talking about the game rather than just talking about the numbers. 

LeafsForLife

@DeclanK err grasp the statistics rather than grasping the fanatics. Just to be clear!

wendelsway1

@Xxxxxnew Absolutely....far too many variables....but if we can stay healthy, we have a chance to make some noise and can/will give any team a go on any given night  :)

B_Leaf

@Mind Bomb @B_Leaf 

I hope you're right bomber. I do feel like we have a chance, but I gotta see how we mesh first.

Jordan29

@LeafsForLife @B_Leaf 100% contenders. I made so many bets with people about it too I have so much confidence. Every year including last I expected nothing but this is it. Long playoff run Im hoping

B_Leaf

@LeafsForLife @B_Leaf 

If you saw my face...

But yes I agree, it is really hard to know. I really thought we might come back with the same team add Clarkson minus MacA, Liles, and Komi. 

TML__fan

@Dan39 @TML__fan True.  It is somewhat ironic that those same people will explain how a goalie's performance (SV%) made a team look better or worse than what Corsi or Fenwick would imply, yet what type of scoring chances the goalie faced is considered "noise".

Dan39

@TML__fan @Dan39 Yeah, that's the issue for the 'advanced stats' bloggers - they don't have the data so they use Corsi and Fenwick. The issue is, at least in the Bruins series, the two approaches suggest opposite outcomes, i.e. the Leafs were outpossessed using Corsi but 'outpossessed' the Bruins by scoring chances post Game 1. I personally think the pros are probably right and that scoring chances can diverge from Corsi for long periods, as they did with the Leafs in Games 2-7 and probably for the whole of last season.

ingy56

@wiski @ingy56 Just have Fraser to get done before arbitration next Tuesday. Then Kadri and Franson.

mORRganRielly

@NotInsane @mORRganRielly I'd do it in the other thread, but I don't want to derail it.  Thanks for your comment though!  Now I feel much better having typed all that with a response lol

LeafsForLife

Yes baseball i grew up checking for RBI's, HR's, etc etc but i wont get into talking or trying to understand all the stats stuff theyre doing now. Soon they will offer University courses for it.

Xxxxxnew

And that's because the ball is in play an average of about 7 minutes in any baseball game and you really need something to fill in the other 2 1/2 hours during a game.

NotInsane

@LeafsForLife 

and that's fine.  You don't need them to enjoy the game.  Baseball on the other hand has been swamped with stats.  It was always there but its become ... overwhelming. 

Xxxxxnew

Get to the playoffs and then it's a game at a time.

B_Leaf

@wendelsway1 @B_Leaf 

I think we are a team on the rise...not sure how much...but I see no reason why we can't compete with the best.

wendelsway1

@B_Leaf @wendelsway1 Yep....like the mix of players we now have....oh hell, I'm with MB....we're winning it all this season.....that's my story and I'm sticking with it until proven otherwise....GO LEAFS GO!!!   Not a fanbase out there that is more deserving  :)

Mind Bomb

@B_Leaf @Mind Bomb  I like our chances. My biggest concern, bro, is depth and defense. I would have liked to upgrade the D, but it looks like we wait for it to happen from within, and I believe in 2 to 3 years we will have one of the top D's in the league. 

 Now our D has no legit NHL depth, at least top 4; we have a plethora of 6D's.  Our second pairing has an RFA, and one that is still on an ELC. Everyone has already moved Liles; well, say Gunnar has a setback, or gets hurt anew? 

 Having said all that, the potential is there for sure to win it all. IF we stay healthy, and build on last year's performance, we can win it all.