## How to Beat the Golden State Warriors

The Golden State Warriors have posed quite the conundrum for opposing teams. They are quick, have a spectacular ability to move the ball, and play suffocating defense. Their play in the playoffs thus far has only amplified these strengths, to the point where they seem unbeatable.

I wanted to take a somewhat simplified approach and see if opposing teams are missing something. That is, is there some weakness in their play that opposing teams can exploit, a “weakness in Helm’s Deep”?

From a data science point of view, the most obvious place to start seemed to be to look at every single shot the Warriors took as a team in each game this season and compile a grand ensemble shot chart. Using the data from Basketball-reference.com and some data scraping scripts I wrote in R, I obtained the following:

Certainly, on the surface, it seems that there is no discernible pattern between made shots and missed shots. This is where the machine learning comes in!

From here, I extracted the x and y coordinates of each shot and recorded whether it was “made” or “missed” in a table, so that the coordinates were the predictor variables and the shot classification (made/missed) was the response variable. Altogether, we had 7104 observations. Splitting this dataset into a 70% training set and a 30% test set, I tried the following algorithms, recording the percentage of correctly classified test observations:

| Algorithm | % of Correctly Predicted Observations |
|---|---|
| Logistic Regression | 56.43 |
| Gradient Boosted Decision Trees | 62.62 |
| Random Forests | 58.54 |
| Neural Networks with Entropy Fitting | 62.47 |
| Naive Bayes Classification with Kernel Density Estimation | 57.32 |
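As a rough illustration of the 70/30 split and the "% correctly classified" metric described above, here is a minimal Python sketch. It is not the R code used for this post: the shot table is synthetic, the court coordinates (in feet) are an assumption, and `train_test_split` and `percent_correct` are hypothetical helper names.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of `rows` and cut it into (train, test) parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def percent_correct(predict, rows):
    """Percentage of (x, y, made) rows where predict(x, y) equals `made`."""
    hits = sum(1 for x, y, made in rows if predict(x, y) == made)
    return 100.0 * hits / len(rows)


# Synthetic stand-in for the real shot table: (x, y, made), made is 0 or 1.
rng = random.Random(42)
shots = [(rng.uniform(-25.0, 25.0), rng.uniform(0.0, 47.0), rng.randint(0, 1))
         for _ in range(7104)]

train, test = train_test_split(shots)

# Trivial baseline that predicts "made" for every shot.
baseline = percent_correct(lambda x, y: 1, test)
print(len(train), len(test), round(baseline, 2))
```

Any fitted classifier can be passed in place of the baseline lambda, as long as it maps an (x, y) pair to 0 or 1. A baseline like this one is a useful yardstick for the accuracies in the table.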

One sees that gradient boosted decision trees had the best performance, correctly classifying 62.62% of the test observations. This is not a spectacular number, but, given how noisy the data is, it is not bad, and much better than expected. I should also mention that these numbers were obtained after tuning these models using cross-validation for optimal parameters.
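The cross-validation tuning mentioned above can be sketched in plain Python as follows. The number of folds, the parameter grid, and the helper names `k_fold_indices` and `pick_best_setting` are all assumptions for illustration, since the post does not say how the tuning was set up.

```python
def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, valid_idx) index lists for k contiguous folds."""
    base, extra = divmod(n_rows, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        valid_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_rows))
        yield train_idx, valid_idx
        start += size


def pick_best_setting(settings, n_rows, score, k=5):
    """Return the setting with the highest mean validation score.

    `score(setting, train_idx, valid_idx)` is supplied by the caller; it
    should fit on the training rows and return accuracy on the validation rows.
    """
    def mean_score(setting):
        folds = list(k_fold_indices(n_rows, k))
        return sum(score(setting, tr, va) for tr, va in folds) / len(folds)

    return max(settings, key=mean_score)


# Toy scoring function that simply prefers a tree depth of 3.
best_depth = pick_best_setting([1, 2, 3, 4, 5], 100,
                               lambda depth, tr, va: -abs(depth - 3))
print(best_depth)
```

The folds are contiguous, so the rows should be shuffled before their indices are passed in.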

Using the gradient boosted decision tree model, we made a set of predictions for a vast number of (x,y)-coordinates on the basketball court. We obtained the following contour plot:

Overlaying this on top of the basketball court diagram, we got:

The contour plot levels denote the probabilities that GSW will make a shot from a given (x,y) location on the court. As a sanity check, the lowest probabilities seem to be close to the half-court line and beyond the three-point line. The highest probabilities are, surprisingly, along very specific areas on the court: very close to the basket, the line from the basket to the left corner, extending up slightly, and a very narrow line extending from the basket to the right corner. Interestingly, the probabilities are low on the right side of the basket, specifically:

A map showing the probabilities more explicitly is as follows (although, upon uploading it, I realized it is a bit harder to read; I will re-upload a clearer version soon!):

In conclusion, it seems that, at least according to a first look at the data, the Warriors do indeed have several “weak spots” in their offense that opponents should certainly look to exploit by designing defensive schemes that force them to take shots in the aforementioned low-probability zones. As for future improvements, I think it would be interesting to add as predictor variables things like geographic location, crowd sizes, team opponent strengths, etc… I will look into making these improvements in the near future.
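The grid-prediction step behind the contour plot, and the search for low-probability zones, can be sketched roughly as below. This is only an illustration: `toy_make_probability` is a made-up stand-in for the fitted gradient boosted model, and both the basket position at (0, 0) and the one-foot grid spacing are assumptions.

```python
import math


def make_grid(x_min, x_max, y_min, y_max, step):
    """Regular grid of (x, y) points covering the rectangle, edges included."""
    nx = int(round((x_max - x_min) / step)) + 1
    ny = int(round((y_max - y_min) / step)) + 1
    return [(x_min + i * step, y_min + j * step)
            for j in range(ny) for i in range(nx)]


def toy_make_probability(x, y):
    """Stand-in for the fitted model: probability falls off with distance
    from a basket placed at (0, 0). Purely illustrative."""
    return 0.65 * math.exp(-math.hypot(x, y) / 40.0)


# Half court taken as 50 ft wide and 47 ft deep, sampled every foot.
grid = make_grid(-25.0, 25.0, 0.0, 47.0, 1.0)
surface = [(x, y, toy_make_probability(x, y)) for x, y in grid]

# The five grid cells where the model gives the lowest make probability.
coldest = sorted(surface, key=lambda cell: cell[2])[:5]
for x, y, p in coldest:
    print(f"x={x:6.1f}  y={y:5.1f}  P(make)={p:.3f}")
```

Swapping the toy function for a real model's predicted probability would give the surface that a contour plot is drawn from. The sorted list gives candidate zones for a defense to steer shots toward.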

## The “Interference” of Phil Jackson

By: Dr. Ikjyot Singh Kohli

So, I came across this article today by Matt Moore on CBSSports, who has basically once again taken to the web to bash the Triangle Offense. Of course, much of what he claims (like much of the Knicks media) is flat-out wrong and based on very primitive and simplistic analysis, and I will point it out below. Further, much of this article seems to be motivated by several comments Carmelo Anthony made recently expressing his dismay at Jeff Hornacek moving away from the “high-paced” offense that the Knicks were running before the All-Star break:

“I think everybody was trying to figure everything out, what was going to work, what wasn’t going to work,’’ Anthony said in the locker room at the former Delta Center. “Early in the season, we were winning games, went on a little winning streak we had. We were playing a certain way. We went away from that, started playing another way. Everybody was trying to figure out: Should we go back to the way we were playing, or try to do something different?’’

Anthony suggested he liked the Hornacek way.

“I thought earlier we were playing faster and more free-flow throughout the course of the game,’’ Anthony said. “We kind of slowed down, started settling it down. Not as fast. The pace slowed down for us — something we had to make an adjustment on the fly with limited practice time, in the course of a game. Once you get into the season, it’s hard to readjust a whole system.’’

First, it is well-known that the Knicks have been implementing more of the triangle offense since the All-Star break. All-Star Weekend was Feb 17-19, 2017. The Knicks’ record before All-Star weekend was amusingly 23-34, which is 11 games below .500, is nowhere mentioned in any of these articles, and is also not mentioned (realized?) by Carmelo.

Anyhow, the question is as follows. If Hornacek was allowed to continue his non-triangle ways of pushing the ball/higher pace (what Carmelo claims he liked), would the Knicks have made the playoffs? Probably not. I claim this to be the case based on a detailed machine-learning-based analysis of playoff-eligible teams that has been available for some time now. In fact, what is perhaps of most importance from this paper is the following classification tree that determines whether a team is playoff-eligible or not:

So, these are the relevant factors in determining whether or not a team in a given season makes the playoffs. (Please see the paper linked above for details on the justification of these results.)

Looking at these predictor variables for the Knicks up to the All-Star break, we have:

1. Opponent Assists/Game: 22.44
2. Steals/Game: 7.26
3. TOV/Game: 13.53
4. DRB/Game: 33.65
5. Opp.TOV/Game: 12.46

Since Opp.TOV/Game = 12.46 < 13.16, the Knicks would actually be predicted to miss the NBA Playoffs. In fact, if current trends were allowed to continue, the so-called “Hornacek trends”, one can compute the probability of the Knicks making the playoffs:

From this probability density function, we can calculate that the probability of the Knicks making the playoffs was 36.84%. The classification tree also predicted that the Knicks would miss the playoffs. So, what is being missed by Carmelo, Matt Moore, and the like is the complete lack of pressure defense and, hence, the insufficient number of opponent TOV/G. So, it is completely incorrect to claim that the Knicks were somehow “Destined for glory” under Hornacek’s way of doing things. This is exacerbated by the fact that the Knicks’ opponent AST/G pre-All-Star break was already pretty high at 22.44.

The question now is: how have the Knicks been doing since Phil Jackson’s supposed interference and since supposedly implementing the triangle in a more complete sense? (On a side note, I still don’t think you can partially implement the triangle; I think it needs a proper off-season implementation, as it is a complete system.)

Interestingly enough, the Knicks’ opponent assists per game (which, according to the machine learning analysis, is the most relevant factor in determining whether a team makes the playoffs) from All-Star weekend to the present day is an impressive 20.642/game. By the classification tree above, this actually puts the Knicks safely in playoff territory, in the sense of being classified as a playoff team, but it is too little, too late.

Since the Knicks started to implement the triangle more completely, the defense has actually improved significantly with respect to the key relevant statistic of opponent AST/G. (Note that, as will be shown in a future article, DRTG and ORTG are largely useless statistics in determining a team’s playoff eligibility, another point completely missed in Moore’s article.)

The problem is that it is obviously too little, too late at this point. I would argue, based on this analysis, that Phil Jackson should have actually interfered earlier in the season. In fact, if the Knicks keep their opponent assists/game below 20.75 next season (which is now very likely, if current trends continue), the Knicks would be predicted to make the playoffs by the above machine learning analysis.

Finally, I will just make this point. It is interesting to look at Phil Jackson teams that were not filled/packed with dominant players. As the saying goes, unfortunately, “Phil Jackson’s success had nothing to do with the triangle, but, because he had Shaq/Kobe, Jordan/Pippen, etc… ”

Well, let’s first look at the 1994-1995 Chicago Bulls, a team that did not have Michael Jordan, but ran the triangle offense completely. Per the relevant statistics above:

1. Opp. AST/G = 20.9
2. STL/G = 9.7
3. AST/G = 24.0
4. Opp. TOV/G = 18.1

These are remarkable defensive numbers, which support Phil’s idea that the triangle offense leads to good defense.

## Mathematics Behind The Triangle Offense

It was pointed out to me recently that a few of the articles I have written describing the detailed geometric structure behind the triangle offense are scattered in various places around my blog, so here is a list of the articles in one convenient place:

• The Mathematics of Filling the Triangle (First article)
• Group Theory and Dynamical Systems Theory Behind The Triangle Offense
• A Demonstration That The Triangle Offense is the most efficient/optimal way for 5 players to space the floor.

By: Dr. Ikjyot Singh Kohli (About the Author)

## So, What’s Wrong with the Knicks?

As I write this post, the Knicks are currently 12th in the Eastern Conference with a record of 22-32. A plethora of people are offering their opinions on what is wrong with the Knicks, and of course, with most of it coming from ESPN and the New York media, most of it is incorrect/useless. Here are some examples:

A while ago, I wrote this paper based on statistical learning that shows the common characteristics for NBA playoff teams. Basically, I obtained the following important result:

This classification tree shows, along with arguments in the paper, that while the most important factor in teams making the playoffs tends to be the opponent number of assists per game, there are paths to the playoffs where teams are not necessarily strong in this area. Specifically, for the Knicks, as of today, we see that:

opp. Assists/game: 22.4 > 20.75, STL/game: 7.2 < 8.0061, TOV/game: 14.1 < 14.1585, DRB/game: 33.8 > 29.9024, opp. TOV/game: 13.0 < 13.1585.

So, one sees that what is keeping the Knicks out of the playoffs is specifically pressure defense, in that they are not forcing enough turnovers per game. Ironically, they are very close to the threshold, but it is not enough.
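The path the Knicks follow through the tree can be sketched in Python as below. The tree itself is only available as an image, so this encodes just what the text states: the five thresholds above, the root split (the Phil Jackson post above says that opponent assists below 20.75 classifies a team as playoff-bound), and the final opponent-turnover split. It assumes the splits are visited in the order listed. Every other branch returns `None`, and `classify_playoffs` is a hypothetical name.

```python
from typing import Optional

# Thresholds quoted in the text.
OPP_AST_MAX = 20.75
STL_MIN = 8.0061
TOV_MAX = 14.1585
DRB_MIN = 29.9024
OPP_TOV_MIN = 13.1585


def classify_playoffs(opp_ast: float, stl: float, tov: float,
                      drb: float, opp_tov: float) -> Optional[bool]:
    """Return True (playoffs), False (no playoffs), or None when the
    team falls on a branch that the text does not document."""
    if opp_ast < OPP_AST_MAX:
        return True   # low opponent assists: classified as a playoff team
    if stl >= STL_MIN:
        return None   # branch not described in the text
    if tov >= TOV_MAX:
        return None   # branch not described in the text
    if drb <= DRB_MIN:
        return None   # branch not described in the text
    # Final split on the documented path: forcing enough turnovers.
    return opp_tov > OPP_TOV_MIN


# Knicks per-game numbers as of this post.
knicks = classify_playoffs(opp_ast=22.4, stl=7.2, tov=14.1,
                           drb=33.8, opp_tov=13.0)
print(knicks)
```

Returning `None` for the undocumented branches keeps the sketch from guessing at parts of the tree that appear only in the figure.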

A probability density approximation of the Knicks’ Opp. TOV/G is as follows:

This PDF has the approximate functional form:

P(oTOV) =

Therefore, by computing

$\int_{A}^{\infty} P(oTOV) \, d(oTOV)$,

one obtains a closed-form expression in terms of Erfc,

where Erfc is the complementary error function, and is given by:

$\operatorname{erfc}(z) = \frac{2}{\sqrt{\pi}} \int_{z}^{\infty} e^{-t^2} \, dt$

Given that the threshold for playoff-bound teams is more than 13.1585 opp. TOV/game, setting A = 13 above, we obtain: 0.435. This means that the Knicks have roughly a 43.5% chance of forcing more than 13 TOV in any single game. Similarly, setting A = 14, one obtains: 0.3177. This means that the Knicks have roughly a 31.77% chance of forcing more than 14 TOV in any single game, and so forth.
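To make the tail-probability calculation concrete, here is a small Python sketch that assumes the density is a single normal distribution, for which the integral above reduces to $\frac{1}{2}\operatorname{erfc}\left(\frac{A - \mu}{\sigma\sqrt{2}}\right)$. The fitted parameters are not given in the text, so the `MU` and `SIGMA` values below are illustrative assumptions, back-solved so that the function roughly reproduces the two probabilities quoted above. They are not the fitted values.

```python
import math


def tail_probability(a, mu, sigma):
    """P(X > a) for X ~ Normal(mu, sigma), via the complementary error function."""
    return 0.5 * math.erfc((a - mu) / (sigma * math.sqrt(2.0)))


# Illustrative parameters only (assumed, not taken from the post).
MU = 12.47
SIGMA = 3.22

for threshold in (13.0, 14.0):
    p = tail_probability(threshold, MU, SIGMA)
    print(f"P(opp. TOV > {threshold}) is about {p:.3f}")
```

Under the same assumption, evaluating the function at A = 13.1585 gives the single-game chance of clearing the tree's actual threshold.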

Therefore, one concludes that while the Knicks’ problems are defensive, they are specifically related to pressure defense and forcing turnovers.

By: Dr. Ikjyot Singh Kohli, About the Author