The “Interference” of Phil Jackson

By: Dr. Ikjyot Singh Kohli

So, I came across this article today by Matt Moore on CBSSports, who has once again taken to the web to bash the Triangle Offense. Of course, much of what he claims (like much of the Knicks media) is flat-out wrong and based on very primitive and simplistic analysis, and I will point it out below. Further, much of this article seems to be motivated by several comments Carmelo Anthony made recently expressing his dismay at Jeff Hornacek moving away from the “high-paced” offense that the Knicks were running before the All-Star break:

“I think everybody was trying to figure everything out, what was going to work, what wasn’t going to work,’’ Anthony said in the locker room at the former Delta Center. “Early in the season, we were winning games, went on a little winning streak we had. We were playing a certain way. We went away from that, started playing another way. Everybody was trying to figure out: Should we go back to the way we were playing, or try to do something different?’’

Anthony suggested he liked the Hornacek way.

“I thought earlier we were playing faster and more free-flow throughout the course of the game,’’ Anthony said. “We kind of slowed down, started settling it down. Not as fast. The pace slowed down for us — something we had to make an adjustment on the fly with limited practice time, in the course of a game. Once you get into the season, it’s hard to readjust a whole system.’’

First, it is well known that the Knicks have been implementing more of the triangle offense since the All-Star break. All-Star Weekend was Feb 17-19, 2017. The Knicks’ record before All-Star weekend was, amusingly, 23-34, which is 11 games below .500. This is mentioned nowhere in any of these articles, and is also not mentioned (realized?) by Carmelo.

Anyhow, the question is as follows. If Hornacek had been allowed to continue his non-triangle ways of pushing the ball at a higher pace (what Carmelo claims he liked), would the Knicks have made the playoffs? Probably not. I claim this to be the case based on a detailed machine-learning-based analysis of playoff-eligible teams that has been available for some time now. In fact, what is perhaps of most importance from this paper is the following classification tree that determines whether a team is playoff-eligible or not:

img_4304

So, these are the relevant factors in determining whether or not a team in a given season makes the playoffs. (Please see the paper linked above for details on the justification of these results.)

Here are these predictor variables for the Knicks up to the All-Star break:

  1. Opponent Assists/Game: 22.44
  2. Steals/Game: 7.26
  3. TOV/Game: 13.53
  4. DRB/Game: 33.65
  5. Opp.TOV/Game: 12.46

Since Opp.TOV/Game = 12.46 < 13.16, the Knicks would actually be predicted to miss the NBA Playoffs. In fact, if current trends were allowed to continue, the so-called “Hornacek trends”, one can compute the probability of the Knicks making the playoffs:

knickspdfoTOV1

From this probability density function, we can calculate that the probability of the Knicks making the playoffs was 36.84%. The classification tree also predicted that the Knicks would miss the playoffs. So, what is being missed by Carmelo, Matt Moore, and the like is the complete lack of pressure defense and, hence, the insufficient number of opponent TOV/G. So, it is completely incorrect to claim that the Knicks were somehow “destined for glory” under Hornacek’s way of doing things. This is exacerbated by the fact that the Knicks’ opponent AST/G pre-All-Star break was already pretty high at 22.44.

The question now is how have the Knicks been doing since Phil Jackson’s supposed interference and since supposedly implementing the triangle in a more complete sense? (On a side note, I still don’t think you can partially implement the triangle, I think it needs a proper off-season implementation as it is a complete system).

Interestingly enough, the Knicks’ opponent assists per game (which, according to the machine learning analysis, is the most relevant factor in determining whether a team makes the playoffs) from All-Star weekend to the present day is an impressive 20.642/Game. By the classification tree above, this actually puts the Knicks safely in playoff territory, in the sense of being classified as a playoff team, but it is too little, too late.

Since the Knicks started to implement the triangle more completely, the defense has actually improved significantly with respect to the key relevant statistic of opponent AST/G. (Note that, as will be shown in a future article, DRTG and ORTG are largely useless statistics in determining a team’s playoff eligibility, another point completely missed in Moore’s article.)

The problem is that it is obviously too little, too late at this point. I would argue, based on this analysis, that Phil Jackson should have actually interfered earlier in the season. In fact, if the Knicks keep their opponent assists below 20.75/game next season (which is now very likely, if current trends continue), the Knicks would be predicted to make the playoffs by the above machine learning analysis.

Finally, I will just make this point. It is interesting to look at Phil Jackson teams that were not packed with dominant players. As the saying unfortunately goes, “Phil Jackson’s success had nothing to do with the triangle, but was because he had Shaq/Kobe, Jordan/Pippen, etc… ”

Well, let’s first look at the 1994-1995 Chicago Bulls, a team that did not have Michael Jordan, but ran the triangle offense completely. Per the relevant statistics above:

  1. Opp. AST/G = 20.9
  2. STL/G = 9.7
  3. AST/G = 24.0
  4. Opp. TOV/G = 18.1

These are remarkable defensive numbers, which support Phil’s idea that the triangle offense leads to good defense.

 

 

NCAA March Madness 2017 Predictions

By: Dr. Ikjyot Singh Kohli

Update: March 18, 2017: In a stunning upset, Wisconsin just beat Villanova. It is easy to see why this happened based on the factor relevance diagram below. To win games, Villanova has relied heavily on moving the ball, while Wisconsin has relied heavily on opposing assists! Wisconsin had a minor 5 assists in the whole game today, great defense by them.

wisconsinvillanovafactors.png

 

 

Original Article: March 16, 2017

So, I’m a bit late this year with these, but, it’s only the first day of the tournament as I write this (teaching 2 courses in 1 semester tends to take up A LOT of one’s time!). Anyways, I tried to use Machine Learning methodologies such as neural networks to make predictions on who is going to win the NCAA tournament this year.

To do this, I trained a neural network model on the last 17 seasons of NCAA regular-season team data.

The first thing I found was which predictor variables are most relevant to a team’s NCAA championship success:

  1. Free Throws Made : 99.99% relevance
  2. Opponent Assists : 55.86% relevance
  3. Opponent Field Goal Attempts : 31.44% relevance
  4. Free Throws Attempted : -83.13% relevance
  5. Opponent Field Goals Made: -69.2% relevance

It is interesting that the most important factor in deciding whether or not a team wins the NCAA tournament is actually free throw percentage. In other words, schools that shoot a high free throw percentage seem to have the highest probability of winning the NCAA tournament. (Points 1 and 4 in the list above together translate to a high free throw percentage: many free throws made on relatively few attempts.) Obviously, with a neural network the relationship between these predictors and the output is not necessarily linear, so other factors could play a strong role as well.

The neural network structure used looked like this:

Now, for the results:

School Name    Probability of Winning Tournament

Villanova 0.9294916774
Gonzaga 0.8076801
Baylor 0.716319
Arizona 0.5516670309
Duke 0.005617711
Saint Mary’s 0.0048923492
Wichita St. 0.001208123
Purdue 0.001180955
SMU 0.0008327729
North Carolina 0.0006080225
UCLA 0.0003794108
S. Dakota St. 0.0003186754
Oregon 0.0002288606
Princeton 0.0002107522
Wisconsin 0.000206285
Northwestern 0.0001878604
Cincinnati 0.0001875887
Marquette 0.0001828106
Virginia 0.0001532999
Kent St. 0.0001353252
Miami 0.0001338989
Fla. Gulf Coast 0.0001308963
Vermont 0.0001288239
Notre Dame 0.0001278009
Minnesota 0.0001277032
New Mexico State 0.0001276369
USC 0.0001274456
Middle Tenn. 0.0001268802
Florida 0.0001265646
Texas Southern 0.0001265547
Xavier 0.0001264269
Vanderbilt 0.0001262982
Michigan 0.0001261976
East Tenn. St. 0.0001261878
Nevada 0.0001261331
Butler 0.0001260504
Louisville 0.0001260042
Troy 0.0001259668
Dayton 0.0001259567
Arkansas 0.0001259387
Michigan St. 0.0001259298
Oklahoma St. 0.0001259287
Winthrop 0.0001259213
Iona 0.0001259197
Jacksonville St. 0.0001259174
Creighton 0.0001259092
West Virginia 0.0001259032
North Carolina-Wilmington 0.0001259012
Northern Ky. 0.0001259000
Kansas 0.0001258950
Iowa St 0.0001258950
Bucknell 0.0001258945
Florida St 0.0001258939
Kentucky 0.0001258939
Virginia Tech 0.0001258938
Seton Hall 0.0001258937
Maryland 0.0001258936
North Dakota 0.0001258936
South Carolina 0.0001258935
Rhode Island 0.0001258934
Kansas St. 0.0001258933
Mount St. Mary’s 0.0001258932
VCU 0.0001258931
UC Davis 0.0001258929

This neural network model predicts that the team with the highest probability of winning the NCAA tournament this year is Villanova with a 92.95% chance of winning, followed by Gonzaga with an 80.77% chance of winning, Baylor with a 71.63% chance of winning, and Arizona with a 55.17% chance of winning.

So, What’s Wrong with the Knicks?

By: Dr. Ikjyot Singh Kohli

As I write this post, the Knicks are currently 12th in the Eastern Conference with a record of 22-32. A plethora of people are offering their opinions on what is wrong with the Knicks, and of course, most of it coming from ESPN and the New York media, most of it is incorrect/useless. Here are some examples:

  1. The Bulls are following the Knicks’ blueprint for failure and …
  2. Spike Lee ‘still believes’ in Melo, says time for Phil Jackson to go
  3. 25 reasons being a New York Knicks fan is the most depressing …
  4. Carmelo Anthony needs to escape the Knicks
  5. Another Awful Week for Knicks

A while ago, I wrote this paper based on statistical learning that shows the common characteristics for NBA playoff teams. Basically, I obtained the following important result:

img_4304

This classification tree shows, along with arguments in the paper, that while the most important factor in teams making the playoffs tends to be the opponent number of assists per game, there are paths to the playoffs where teams are not necessarily strong in this area. Specifically, for the Knicks, as of today, we see that:

opp. Assists/game: 22.4 > 20.75, STL/game: 7.2 < 8.0061, TOV/game: 14.1 < 14.1585, DRB/game: 33.8 > 29.9024, opp. TOV/game: 13.0 < 13.1585.

So, one sees that what is keeping the Knicks out of the playoffs is specifically pressure defense, in that, they are not forcing enough turnovers per game. Ironically, they are very close to the threshold, but, it is not enough.
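The check above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the thresholds are the ones quoted in this post, but the tree image is not reproduced here, so the function name and the layout below are my own reconstruction of the single path walked through above. Any branch the text does not describe is reported as "not covered" rather than guessed.

```python
# Thresholds quoted in the text. The layout of the checks is an assumption
# reconstructed from the one path the post describes, not the full tree.
THRESHOLDS = {
    "opp_ast": 20.75,
    "stl": 8.0061,
    "tov": 14.1585,
    "drb": 29.9024,
    "opp_tov": 13.1585,
}


def classify_on_described_path(opp_ast, stl, tov, drb, opp_tov):
    """Return 'playoffs', 'miss', or 'not covered' for per-game averages."""
    # Holding opponents under the assist threshold is classified as playoffs.
    if opp_ast < THRESHOLDS["opp_ast"]:
        return "playoffs"
    # The only other path described: low steals, low turnovers, enough
    # defensive rebounds, and then the opponent-turnover check decides.
    on_path = (
        stl < THRESHOLDS["stl"]
        and tov < THRESHOLDS["tov"]
        and drb > THRESHOLDS["drb"]
    )
    if not on_path:
        return "not covered"
    return "playoffs" if opp_tov > THRESHOLDS["opp_tov"] else "miss"


# Knicks per-game numbers as quoted above.
knicks = dict(opp_ast=22.4, stl=7.2, tov=14.1, drb=33.8, opp_tov=13.0)
print(classify_on_described_path(**knicks))  # prints "miss"
```

Raising only the opponent-turnover number above 13.1585 flips the result to "playoffs", which is the point made above about pressure defense.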

A probability density approximation of the Knicks’ Opp. TOV/G is as follows:

tovpgameplot1

 

This PDF has the approximate functional form:

P(oTOV) =

knicksotovg

Therefore, by computing:

\int_{A}^{\infty} P(oTOV) d(oTOV),

=

knicksotoverfc,

where Erfc is the complementary error function, and is given by:

erfc(z) = \frac{2}{\sqrt{\pi}} \int_{z}^{\infty} e^{-t^2} dt

 

Given that the threshold for playoff-bound teams is more than 13.1585 opp. TOV/game, setting A = 13 above, we obtain: 0.435. This means that the Knicks have roughly a 43.5% chance of forcing more than 13 TOV in any single game. Similarly, setting A = 14, one obtains: 0.3177. This means that the Knicks have roughly a 31.77% chance of forcing more than 14 TOV in any single game, and so forth.
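For readers who want to reproduce this kind of tail calculation, here is a minimal sketch. It assumes the density is Gaussian, which is consistent with the complementary error function appearing in the result but is an assumption on my part, since the fitted functional form is not reproduced in the text. The function name and the mean and standard deviation below are placeholders, not the fitted values.

```python
import math


def tail_probability(threshold, mean, std):
    """P(X > threshold) for X ~ Normal(mean, std), written with erfc."""
    z = (threshold - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


# Placeholder parameters for illustration, NOT the fitted values from the post.
mean_otov, std_otov = 12.5, 3.0
for a in (13, 14):
    print(a, round(tail_probability(a, mean_otov, std_otov), 4))
```

With the actual fitted parameters substituted in, the same function would give the probabilities for any threshold A.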

Therefore, one concludes that while the Knicks’ problems are defensive in nature, they are specifically related to pressure defense and forcing turnovers.

 

By: Dr. Ikjyot Singh Kohli

Optimal Positions for NBA Players

I was thinking about how one can use the NBA’s new SportVU system to figure out optimal positions for players on the court. One of the interesting things about the SportVU system is that it tracks player (x,y) coordinates on the court. Presumably, it also keeps track of whether or not a player located at (x,y) makes a shot or misses it. Let us denote a player making a shot by 1, and a player missing a shot by 0. Then, one essentially will have data in the form (x,y, \text{1/0}).

One can then use a logistic regression to determine the probability that a player at position (x,y) will make a shot:

p(x,y) = \frac{\exp\left(\beta_0 + \beta_1 x + \beta_2 y\right)}{1 +\exp\left(\beta_0 + \beta_1 x + \beta_2 y\right)}

The main idea is that the parameters \beta_0, \beta_1, \beta_2 uniquely characterize a given player’s probability of making a shot.
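As a small illustration, the model above can be evaluated directly. The function name and the coefficient values in the example call are hypothetical and chosen only to show the mechanics.

```python
import math


def shot_probability(x, y, beta0, beta1, beta2):
    """Logistic model p(x, y) = exp(z) / (1 + exp(z)), z = beta0 + beta1*x + beta2*y."""
    z = beta0 + beta1 * x + beta2 * y
    # Branch on the sign of z so that exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical coefficients, for illustration only.
print(shot_probability(10.0, 5.0, beta0=0.2, beta1=-0.03, beta2=-0.01))
```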

As a coaching staff, from an offensive perspective, let us say we wish to position players so that they have a very high probability of making a shot, say, for demonstration purposes, 99%. This means we must solve the optimization problem:

\frac{\exp\left(\beta_0 + \beta_1 x + \beta_2 y\right)}{1 +\exp\left(\beta_0 + \beta_1 x + \beta_2 y\right)} = 0.99

\text{s.t. } 0 \leq x \leq 28, \quad 0 \leq y \leq 47

(The constraints are determined here by the x-y dimensions of a standard NBA court).
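Because the logistic function is invertible, the equality can be rewritten as a linear condition on (x,y). A short derivation, writing z for the linear predictor and assuming \beta_1 \neq 0:

```latex
\frac{e^{z}}{1+e^{z}} = 0.99
\;\Longleftrightarrow\;
z = \ln\frac{0.99}{1-0.99} = \ln 99 \approx 4.59512,
\qquad z = \beta_0 + \beta_1 x + \beta_2 y,
\\[4pt]
\beta_0 + \beta_1 x + \beta_2 y = 4.59512
\;\Longrightarrow\;
x = \frac{4.59512 - \beta_0 - \beta_2 y}{\beta_1}.
```

So the 99% locations lie on a straight line, and the court constraints only decide which segment of that line (if any) is admissible.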

This has the following solutions:

x = \frac{4.59512 - \beta_0 - \beta_2 y}{\beta_1}, \quad \frac{4.59512 - \beta_0 - 28 \beta_1}{\beta_2} \leq y

with the following conditions:

constraints1

One can also have:

x = \frac{4.59512 - \beta_0 - \beta_2 y}{\beta_1}, \quad y \leq 47

with the following conditions:

constraints2

Another solution is:

x = \frac{4.59512 - \beta_0 - \beta_2 y}{\beta_1}

with the following conditions:

constraints3

The fourth possible solution is:

x = \frac{4.59512 - \beta_0 - \beta_2 y}{\beta_1}

with the following conditions:

constraints4

In practice, it should be noted that one is unlikely to find a player who has a 99% probability of making a shot.

To put this example in more practical terms, I generated some random data (1000 points) for a player in terms of (x,y) coordinates and whether he made a shot from that location or not. The following scatter plot shows the result of this simulation:

bballoptim5

In this plot, the red dots indicate a player has made a shot (a response of 1.0) from the (x,y) coordinates given, while a purple dot indicates a player has missed a shot from the (x,y) coordinates given (a response of 0.0).

Performing a logistic regression on this data, we obtain that \beta_0 = 0, \beta_1 = 0.00066876, \beta_2 = -0.00210949.
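A fit of this kind can be sketched with the standard library alone. This is not the code used for the numbers above. The data generation (uniform positions, coin-flip outcomes), the seed, the function names, and the plain gradient-ascent fit are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(points, labels, steps=500, lr=2.0):
    """Fit beta0, beta1, beta2 by gradient ascent on the mean log-likelihood.

    Coordinates are rescaled to [0, 1] during fitting (x / 28, y / 47) so a
    single learning rate works for both; the slopes are converted back at the end.
    """
    n = len(points)
    scaled = [(x / 28.0, y / 47.0) for x, y in points]
    b = [0.0, 0.0, 0.0]
    for _ in range(steps):
        g = [0.0, 0.0, 0.0]
        for (sx, sy), made in zip(scaled, labels):
            err = made - sigmoid(b[0] + b[1] * sx + b[2] * sy)
            g[0] += err
            g[1] += err * sx
            g[2] += err * sy
        b = [bi + lr * gi / n for bi, gi in zip(b, g)]
    return b[0], b[1] / 28.0, b[2] / 47.0


# Assumed synthetic data: uniform court positions, outcomes independent of position.
random.seed(0)
points = [(random.uniform(0, 28), random.uniform(0, 47)) for _ in range(1000)]
labels = [random.randint(0, 1) for _ in points]
beta0, beta1, beta2 = fit_logistic(points, labels)
print(beta0, beta1, beta2)
```

Because the outcomes in this sketch are drawn independently of position, the fitted slopes should land near zero. Real tracking data would of course carry a positional signal.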

Using the equations above, we see that this player has a maximum probability of 58.7149 \% of making a shot from a location of (x,y) = (0,23), and a minimum probability of 38.45 \% of making a shot from a location of (x,y) = (28,0).

What Are the Factors Behind Golden State’s and Cleveland’s Wins in the NBA Finals?

As I write this, Cleveland just won the series 4-3. What was behind each team’s wins and losses in this series?

First, Golden State: A correlation plot of their per game predictor variables versus the binary win/loss outcome is as follows: 


The key information is in the last column of this matrix: 


Evidently, the most important factors in GSW’s winning games were Assists, number of Field Goals made, Field Goal percentage, and steals. The most important factors in GSW losing games this series were number of three point attempts per game (Imagine that!), and number of personal fouls per game. 
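For anyone wanting to reproduce this kind of analysis, the correlation between a per-game statistic and a binary win/loss outcome is just a Pearson correlation with a 0/1 variable. A minimal sketch follows. The function name is my own, and the seven-game numbers are made up for illustration and are not the actual Finals box scores.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; with a 0/1 outcome this is the point-biserial correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up seven-game series, NOT the actual Finals numbers.
assists = [29, 26, 19, 27, 18, 17, 22]
won = [1, 1, 0, 1, 0, 0, 0]
print(round(pearson(assists, won), 3))
```

Repeating this for every predictor against the win/loss column gives the last column of a correlation matrix like the one described above.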

Now, Cleveland: A correlation plot of their per game predictor variables versus the binary win/loss outcome is as follows: 


The key information is in the last column of this matrix: 


Evidently, the most important factor in CLE’s wins was their number of defensive rebounds. Following behind this were number of three point shots made, and field goal percentage. There were some weak correlations between Cleveland’s losses and their number of offensive rebounds and turnovers. 

Note that these results are essentially a summary analysis of previous blog postings which tracked individual games. For example, here, here, and a first attempt here.

Basketball Paper Update

A few weeks ago, I published a paper that used data science / machine learning to detect commonalities between NBA playoff teams. I have now updated and extended it to detect commonalities between NBA championship teams using artificial neural networks, the models that underlie deep learning. The paper can be accessed by clicking on the image below.

Article on Three-Point Shooting in the Modern-Day NBA

 

Continuing the debate on the value of three-point shooting in today’s NBA, my article analyzing this issue from a mathematical perspective has now been published on the arXiv. Check it out!