Metrics for GSW vs. OKC Game 6 Second Half

Continuing with the live metrics employed yesterday, here is an analysis of the second half of the Warriors-Thunder Game 6. 

Here is a plot of the various time series of relevant statistical variables: 

One can see from this plot, for example, the exact point in time when OKC loses control of the game.

Further, here are the correlation coefficients of the variables above: 

One sees there is a very strong anti-correlation between OKC’s lead and GSW 3PT%, while there is a somewhat strong correlation between OKC’s lead and their own 2PT%. This perhaps means that for Game 7, OKC needs to greatly improve its 3PT defense while maintaining its 2PT%, which, as can be seen from the plot above, dropped off towards the end of the game.
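For readers who want to reproduce this kind of number, here is a minimal sketch of the Pearson correlation coefficient between two sampled series. It is written in Python rather than the R used for the original script, and the sample values are placeholders invented for illustration, not data from the game. The names `pearson`, `okc_lead` and `gsw_3pt_pct` are assumptions of this sketch.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Placeholder samples, NOT taken from the game: one value per polling instant.
okc_lead = [8, 7, 5, 4, 1, -2, -5]
gsw_3pt_pct = [0.30, 0.32, 0.35, 0.36, 0.40, 0.43, 0.45]

r = pearson(okc_lead, gsw_3pt_pct)
print(f"corr(OKC lead, GSW 3PT%) = {r:.2f}")
```

A value close to -1 means the lead shrinks as the opponent's 3PT% rises, which is the kind of anti-correlation described above.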


Live Metrics for NBA Games

Yesterday, for the first time, I used the playoff game between Cleveland and Toronto as an opportunity to test a script I wrote in R that keeps track of key statistics during a game in near real time (every 30 seconds). Based on previous work, championship-calibre teams tend to be the ones that have excellent 2PT FG% and the ability to draw fouls, so I tracked these during the game and came up with the following plot of several time series:
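The original script is not shown here, so the following is only a sketch of how such a tracker might turn periodic box-score snapshots into time series. It is in Python rather than R, the `Snapshot` fields are a hypothetical schema, and the sample snapshots are placeholders rather than data from the game. Fetching the live box score is left out entirely, since the data source is not named in the post.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Cumulative team totals at one polling instant (hypothetical schema)."""
    seconds_elapsed: int
    fg_made: int
    fg_attempted: int
    fg3_made: int
    fg3_attempted: int
    opponent_fouls: int


def two_point_pct(s):
    """2PT FG% = (FGM - 3PM) / (FGA - 3PA); None before the first 2PT attempt."""
    made = s.fg_made - s.fg3_made
    attempted = s.fg_attempted - s.fg3_attempted
    return made / attempted if attempted > 0 else None


def build_series(snapshots):
    """Turn snapshots into (time, 2PT FG%, fouls drawn) rows for plotting."""
    return [(s.seconds_elapsed, two_point_pct(s), s.opponent_fouls)
            for s in snapshots]


# Placeholder snapshots taken 30 seconds apart, NOT real game data.
snapshots = [
    Snapshot(30, 1, 2, 0, 1, 0),
    Snapshot(60, 2, 4, 0, 1, 1),
    Snapshot(90, 3, 6, 1, 2, 1),
]

for row in build_series(snapshots):
    print(row)
```

Working from cumulative totals keeps each polling step independent: if one poll is missed, the next snapshot still yields the correct percentage.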

One sees for example that while Toronto started off the game with a much higher 2PT FG%, towards the end Cleveland ended up winning that battle.

A video of this animation is as follows (set the YouTube player to 1080p + FullScreen for Max Quality!)

An interesting question to ask is: how are these series correlated? Well, let’s see:

In this correlation plot, “pd” indicates point difference, “PF” indicates personal fouls, and “2PFG.” indicates 2-Point field goal percentage.

One sees immediately from the correlation plot above that there is a very strong correlation between Cleveland’s point difference and Toronto’s personal fouls, with some strong correlations with Cleveland’s 2-Point FG% as well. The opposite is true for Toronto’s point difference. It seems that in a playoff game of this intensity, drawing fouls, combined with 2-Point field goal percentage, is a very important factor in determining which team leads and eventually wins.

The Three-Point Shot Myth Continued…

I’ve been ranting a lot about the so-called “value” of the three-point shot in “modern-day” basketball. I know! But, here is yet one more entry.

The common consensus is that teams are shooting more three point shots as discussed in the articles below:


There are several more where these came from. My issue is twofold. First, these analyses seem grossly oversimplified. Second, none of them has looked at a per-team trend. As far as I can tell from these articles, they just look at the total number of three-point shots taken and made each year over the past number of seasons.

Indeed, the standard approach is to look at the league averages from the past number of years and note that the average number of three-point makes and attempts has increased (well, almost) year-to-year, but this is not entirely useful.

What one should do is look at the probability that any team attempts or makes more than a given number of three-point shots per game in a given season. Below, we use a kernel density method to calculate these probabilities. A first approach is to calculate the mean and standard deviation of the number of three-point shots attempted and made per season for each of the previous sixteen seasons. These generate time-dependent functions \mu(t) and \sigma(t).

One can in principle then solve a Fokker-Planck equation to obtain a time-dependent probability distribution p(x,t) for the number of three point shots attempted and another p(x,t) for the number of three-point shots made:

p(x,t)_t = -\left[\mu(t) p(x,t)\right]_x + \left[\frac{\sigma^2(t)}{2} p(x,t)\right]_{xx}


(where subscripts indicate partial derivatives). However, as one will quickly discover, this PDE is not separable!

My alternative approach then was to perform a non-parametric analysis, using a kernel density method to fit a cumulative distribution function to each of the past sixteen seasons. The following set of plots was generated from this method:

One sees from the density analysis above that, in a given season, the probability that a given team makes more than 10 three-point shots per game never seems to exceed 10%. So while the probability of a team attempting more three-point shots may have increased, the probability of the same team making more than, say, 10 three-point shots per game has essentially stayed the same over the seasons analyzed.
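To make the exceedance probability concrete, here is a minimal sketch of how a Gaussian kernel density estimate yields P(X > c) for one season. It is an illustration under stated assumptions, not the code behind the plots: it is in Python rather than R, the kernel is assumed Gaussian, the bandwidth uses the normal-reference rule of thumb h = 1.06 s n^(-1/5), and the per-team values are placeholders, not real league data. With a Gaussian kernel the fitted CDF is the average of normal CDFs centred on each team's value, so P(X > c) = 1 - (1/n) Σ Φ((c - x_i)/h).

```python
import statistics
from statistics import NormalDist


def kde_exceedance(samples, threshold, bandwidth=None):
    """P(X > threshold) under a Gaussian kernel density estimate of `samples`."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption of this sketch).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    std_normal = NormalDist()
    # The KDE's CDF is the mean of normal CDFs centred at each sample.
    cdf = sum(std_normal.cdf((threshold - x) / bandwidth) for x in samples) / n
    return 1.0 - cdf


# Placeholder per-team three-pointers made per game for ONE season.
# These numbers are invented for illustration only.
made_per_game = [5.8, 6.1, 6.4, 6.9, 7.2, 7.5, 7.7, 8.0, 8.3, 8.9, 9.4, 10.6]

p = kde_exceedance(made_per_game, threshold=10)
print(f"P(more than 10 made per game) = {p:.3f}")
```

Repeating this for each season's sample gives one exceedance probability per season, which can then be compared across years.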

The question then remains: do only “good” or “efficient” teams attempt more three-point shots and, in particular, does this help them make the playoffs or eventually become a championship-calibre team? This question has been analyzed in detail in the following paper, which is now on the arXiv.

Basketball Paper Update

A few weeks ago, I published a paper that used data science / machine learning to detect commonalities between NBA playoff teams. I have now updated and extended it to detect commonalities between NBA championship teams using artificial neural networks, the class of models that underlies deep learning. The paper can be accessed by clicking on the image below.

New Paper on Machine Learning and Basketball

A new and formal paper of mine describing how one can use machine learning methodologies to help determine which NBA teams will make the playoffs is now online: 

  1. arXiv link
  2. SSRN link

Have a look!


How close were The Knicks to making the Playoffs?

It is another New York Knicks season where fans have to wait until next year to see if the Knicks will make the playoffs or not.

Yesterday, there was a lot of buzz around the idea that Phil Jackson may want to keep Kurt Rambis on as head coach, and as usual, there were numerous people who were very vocal in their criticism.

However, the Knicks were actually much closer to the playoffs than people realize. A previous post of mine used data science methodologies to describe in detail the criteria a team must meet to have a high probability of making the playoffs. 

Using the decision tree generated in that post, I evaluated the Knicks’ playoff chances this season under possible playoff criteria scenarios, and found the following:


One sees that a big problem was the Knicks’ margin of victory, which was too negative. However, even in this case, there were scenarios that would have allowed the Knicks to make the playoffs. For example, a slight increase in the Knicks’ opponents’ field goal attempts or a very slight decrease in the Knicks’ own field goal attempts per game would have greatly improved their playoff chances.

These metrics can easily be adjusted for the upcoming season which will likely require a more organized execution of the triangle offense and discipline on both ends of the floor. They really are almost there!

Stephen Curry and Mahmoud Abdul-Rauf?

As usual, Phil Jackson made another interesting tweet today:

And, as usual, he received much criticism from “Experts” who just looked at the raw numbers for each player and concluded that there is no way such a statement is justified. But it is not that simple!

When you compare two players (or two objects) whose data feature values are very different, it is not that they can’t be compared; rather, you must normalize the data in some way to make the sets comparable.
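The post does not say which normalization was used, so the following is only one common choice, shown as a sketch: z-scoring each feature within each player's own seasons, so that every series has mean 0 and standard deviation 1. It is in Python, and the per-season values and the names `z_scores`, `player_a` and `player_b` are placeholders invented for illustration, not either player's real statistics.

```python
import statistics


def z_scores(values):
    """Rescale a series to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot normalize a constant series")
    return [(v - mean) / sd for v in values]


# Placeholder season-by-season values of one feature for two players.
# The raw scales differ, but the season-to-season pattern is the same.
player_a = [10.0, 14.0, 18.0]
player_b = [15.0, 21.0, 27.0]

print(z_scores(player_a))
print(z_scores(player_b))
```

After normalization the two placeholder series coincide, which illustrates the point above: very different raw values can hide the same underlying pattern.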

In this case, I used the data from to compare Chris Jackson’s 6 seasons in Denver to Stephen Curry’s last 6 seasons (including this one) and took into account 45 different statistical measures, and came up with the following correlation matrix/similarity matrix plot:


Dark blue circles indicate a strong correlation, while dark red circles indicate a weak correlation between two sets of features. 

What would be of interest in an analysis like this is to examine the diagonal of this matrix, which offers a direct comparison between the two players: 

One can see that there are many features that have strong correlation coefficients. 

Therefore, by this measure, Stephen Curry and Chris Jackson (now Mahmoud Abdul-Rauf) do in fact share many strong similarities!