A Problem With Offensive Rating

Abstract: It is shown that the standard/common definition of team offensive rating/offensive efficiency implies that a team’s offensive rating increases as its opponent’s offensive rebounds increase, which, in principle, should not be the case.

Over the past several years, the advanced metric known as Offensive Rating has become the standard way of measuring a basketball team’s offensive efficiency. Broadly speaking, it is defined as points scored per 100 possessions. Specifically, for teams, it is defined as follows (see https://www.basketball-reference.com/about/ratings.html and https://www.nbastuffer.com/analytics101/possession/ and https://fansided.com/2015/12/21/nylon-calculus-101-possessions/):

[Equation image: definition of team offensive rating (ORtg)]

There is a significant issue with this definition, as I now demonstrate. Computing the partial derivative of this expression with respect to OppORB, we obtain:

[Equation image: partial derivative of ORtg with respect to OppORB]

As the denominator is always positive, the sign is determined by the numerator. Physical constraints (one cannot have negative points or rebounds) together with OppFG < OppFGA, which makes intuitive sense, make the numerator negative. It is positive only if OppFG > OppFGA, which logically cannot happen. Therefore, this numerator is always negative (except in the rare case when OppFG = OppFGA), which means that the entire partial derivative is positive.

This means that a team’s offensive rating / offensive efficiency increases as its opponent’s offensive rebounds increase. Intuitively, this shouldn’t be the case. If your opponent has a high number of offensive rebounds, this should give you fewer possessions and put pressure on you to score, thus resulting in fewer points overall. The underlying issue is that the more general definition of offensive efficiency is 100*(Points Scored)/(Possessions), which, for a fixed number of points, is maximized when possessions are minimized. The more detailed definition of possessions then implies that this minimization of possessions occurs at the cost of maximizing opponent offensive rebounds, which intuitively should not be the case.
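The effect can be checked numerically. The sketch below assumes the possession estimate published on the Basketball-Reference page cited above (the 0.4 and 1.07 coefficients come from that page, not from this post), and the two box-score lines are made-up numbers used only for illustration. Only OppORB is varied, with every other statistic held fixed, which mirrors what a partial derivative measures.

```python
def possessions(tm, opp):
    """Possession estimate in the Basketball-Reference form (assumed here)."""
    def side(a, b):
        # a: team whose possessions are being counted, b: its opponent
        orb_share = a["ORB"] / (a["ORB"] + b["DRB"])
        return (a["FGA"] + 0.4 * a["FTA"]
                - 1.07 * orb_share * (a["FGA"] - a["FG"])
                + a["TOV"])
    return 0.5 * (side(tm, opp) + side(opp, tm))


def offensive_rating(tm, opp):
    """Points scored per 100 estimated possessions."""
    return 100.0 * tm["PTS"] / possessions(tm, opp)


# Hypothetical box-score lines, invented for illustration only.
team = {"PTS": 105, "FG": 39, "FGA": 85, "FTA": 22,
        "ORB": 10, "DRB": 33, "TOV": 14}
opponent = {"PTS": 101, "FG": 38, "FGA": 88, "FTA": 20,
            "ORB": 9, "DRB": 32, "TOV": 13}

# Vary only the opponent's offensive rebounds and recompute the rating.
ratings = []
for opp_orb in (5, 10, 15, 20):
    rating = offensive_rating(team, dict(opponent, ORB=opp_orb))
    ratings.append(rating)
    print(f"OppORB = {opp_orb:2d} -> ORtg = {rating:.2f}")
```

With these inputs the printed rating rises at each step, because the opponent's rebound term is subtracted from the possession count whenever OppFGA exceeds OppFG.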


The Probability of An Illegal Immigrant Committing a Crime In The United States

Trump has once again put the U.S. on the world stage, this time at the expense of innocent children whose families are seeking asylum. The Trump administration’s justification is that:

“They want to have illegal immigrants pouring into our country, bringing with them crime, tremendous amounts of crime.”

I decided to try to analyze this statement quantitatively. Indeed, one can calculate the probability that an illegal immigrant will commit a crime within The United States as follows. Let us denote crime (or criminal) by C, while denoting illegal immigrant by ii. Then, by Bayes’ theorem, we have:

\boxed{P(C | ii) = \frac{P(ii | C) P(C)}{P(ii)}}

It is quite easy to find data associated with the various factors in this formula. For example, one finds that

  1. P(ii | C) = 0.21
  2. P(C) = 0.02
  3. P(ii) = 0.037

Putting all of this together, we find that:

P(C|ii) = 0.1135 = 11.35 \%

That is, the probability that an illegal immigrant will commit a crime (of any type) while in The United States is a very low 11.35%.
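The arithmetic behind this figure is a single application of Bayes’ theorem. A minimal sketch, using only the three factors listed above (the function and variable names are illustrative):

```python
def posterior(p_ii_given_c, p_c, p_ii):
    """Bayes' theorem: P(C | ii) = P(ii | C) * P(C) / P(ii)."""
    return p_ii_given_c * p_c / p_ii


# Inputs are the three factors quoted in the post.
p_c_given_ii = posterior(p_ii_given_c=0.21, p_c=0.02, p_ii=0.037)
print(f"P(C | ii) = {p_c_given_ii:.4f}")  # prints P(C | ii) = 0.1135
```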

 

Therefore, Trump’s claim of “tremendous amounts of crime” being brought to The United States by illegal immigrants is incorrect.

Note that the numerical factors used above were obtained from:

  1. https://www.justice.gov/opa/pr/departments-justice-and-homeland-security-release-data-incarcerated-aliens-94-percent-all
  2. https://www.washingtontimes.com/news/2017/aug/1/immigrants-22-percent-federal-prison-population/
  3. https://en.wikipedia.org/wiki/Incarceration_in_the_United_States

The Risk of The 3-Point Shot

As more and more teams increase the number of threes they attempt, on the mistaken belief that this somehow leads to an efficient offense, we show below that it is in fact in a team’s opponent’s interest for that team to attempt as many three-point shots as possible.

Looking at this season’s data, let us examine two things. The first is the number of points a team’s opponent is expected to score for every three-point shot the other team attempts. Remarkably, we found that this number of points obeys a lognormal distribution:

\boxed{P(X) = \frac{2.86089 e^{-25.713 (\log (X)-1.3119)^2}}{X}}

This means that for every three point shot your team attempts, the opposing team is expected to score

\boxed{\int X P(X) dX = 1.87475\, -1.87475 \text{erf}(6.75099\, -5.0708 \log (X))}

which, evaluated over the full range 0 < X < \infty, comes out to about 3.7495 points. So, for every 3PA by a team, the opponent is expected to score more than 3 points based on the most recent NBA data. Keeping that in mind, we also see by integrating P(X) above that there is a 99.99% probability that the opponent will score more than 2 points for every 3PA by a team, and a 93.693% probability that the opponent will score more than 3 points for every single 3PA by the other team.
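These figures follow from the standard closed forms for a lognormal distribution. The sketch below reads the log-scale parameters off the fitted density (mu = 1.3119, and sigma from the exponent coefficient 25.713 = 1/(2 sigma^2)); the function names are illustrative and nothing here re-fits the data.

```python
import math

A = 25.713                           # coefficient of (log X - MU)^2 in the exponent
MU = 1.3119                          # location parameter on the log scale
SIGMA = math.sqrt(1.0 / (2.0 * A))   # since A = 1 / (2 * sigma^2)


def lognormal_mean(mu, sigma):
    """Mean of a lognormal distribution: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + sigma ** 2 / 2.0)


def prob_greater(x, mu, sigma):
    """Upper-tail probability P(X > x) of a lognormal distribution."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Consistency check: this should reproduce the 2.86089 prefactor of the density.
prefactor = 1.0 / (SIGMA * math.sqrt(2.0 * math.pi))

expected_points = lognormal_mean(MU, SIGMA)
p_more_than_2 = prob_greater(2.0, MU, SIGMA)
p_more_than_3 = prob_greater(3.0, MU, SIGMA)

print(f"prefactor  = {prefactor:.4f}")
print(f"E[X]       = {expected_points:.4f}")
print(f"P(X > 2)   = {p_more_than_2:.6f}")
print(f"P(X > 3)   = {p_more_than_3:.5f}")
```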

This would suggest a significant breakdown of defensive emphasis in the “modern-day” NBA where evidently teams are just interested in playing shot-for-shot basketball, but in a very risky way that is not optimal.

The work so far covered just three-point attempts, but what are the effects of missing a three-point shot? Remarkably, the number of opponent points per three-point miss also obeys a lognormal distribution:

\boxed{P(X) = \frac{2.81227 e^{-24.8464 (\log (X)-1.7605)^2}}{X}}

Therefore, for every three-point shot your team misses, the opposing team is expected to score:

\boxed{\int X P(X) dX = 2.93707\, -2.93707 \text{erf}(8.87571\, -4.98461 \log (X))}

which, evaluated over the full range 0 < X < \infty, comes out to about 5.874 points. This identifies a remarkable risk to a team missing a three-point shot. This computation shows that one three-point miss corresponds to about 6 points for the opposing team! Looking at probabilities by integrating the density function above, one can show that there is a 99.9999% probability that the opposing team would score more than two points for every three-point miss, a 99.9998% probability that the opposing team would score more than three points for every three-point miss, a 99.583% probability that the opposing team would score more than four points for every three-point miss, and so on.
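As an independent check that does not rely on the error-function antiderivative, the density for three-point misses can be integrated numerically. This is a rough sketch using a composite midpoint rule. The integration limits (1 to 20) and the step count are arbitrary choices, justified only by the density being negligible outside that range.

```python
import math


def density(x):
    """Fitted density of opponent points per three-point miss (from the text)."""
    return 2.81227 * math.exp(-24.8464 * (math.log(x) - 1.7605) ** 2) / x


def integrate(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi] with n equal sub-intervals."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


LO, HI = 1.0, 20.0  # assumed cut-offs; the density is negligible beyond them

total_mass = integrate(density, LO, HI)                   # should be close to 1
mean_points = integrate(lambda x: x * density(x), LO, HI)
p_more_than_4 = integrate(density, 4.0, HI)

print(f"total mass = {total_mass:.4f}")
print(f"E[X]       = {mean_points:.3f}")
print(f"P(X > 4)   = {p_more_than_4:.5f}")
```

The mean comes out at roughly 5.87 points and the tail probability above four points at roughly 99.58%, in line with the values quoted above.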

What these calculations demonstrate is that gearing a team’s offense to focus on attempting three-point shots is remarkably risky, especially if a team misses a three-point shot. Given that the average number of three-point attempts has been increasing over the last several years, while the average number of makes has stayed relatively flat (see this older article: https://relativitydigest.com/2016/05/26/the-three-point-shot-myth-continued/), teams are exposing themselves to greater and greater risk of losing games by adopting this style of play.

An Equation to Predict NBA Playoff Probabilities

Based on a previous paper I wrote that used machine learning to determine the most relevant factors for teams making the NBA playoffs, I did some further analysis in an attempt to come up with an equation that outputs the probability of an NBA team making the playoffs in a given season.

From the aforementioned paper, one concludes that the two most important factors in determining whether a team makes the playoffs are its opponent assists per game (oAST) and opponent two-point shots made per game (o2P). Based on that, I came up with the following equation:

\boxed{P(playoffs) = 0.49 \left[ \frac{1}{1 + \exp\left(-7.6683 +0.2489 o2P   \right)   }   \right] + 0.51 \left[ \frac{1}{1 + \exp\left(-9.1835 +0.4211 oAST   \right)   }   \right]}

A plot of this equation is as follows:

[Figure: plot of P(playoffs) as a function of o2P and oAST]

A contour plot is perhaps more illuminating:

[Figure: contour plot of P(playoffs) as a function of o2P and oAST]

One can see from this contour plot that teams have the highest probabilities of making the playoffs when their opponent 2-point shots and opponent assists are both around 20. In general, we also see that while a team can allow more opponent 2-point shots, having a low number of opponent assists per game is evidently the most important factor.

Using this equation, I was able to classify 71% of playoff teams correctly from the last 16 years of NBA data. Even though the playoff classifier developed in the paper mentioned above is more accurate in general, those methods are non-parametric, so it is difficult to obtain an equation from them. Having an equation, as we do here, can be extremely useful for modelling purposes and for understanding the nature of the probabilities involved in deciding whether a certain team will make the playoffs in a given season. (Note also that we use 0.50 as the threshold probability, so a team with a probability output greater than 0.5 is classified as making the playoffs.)
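The boxed equation and the 0.5 threshold translate directly into a few lines of code. The sketch below copies the coefficients from the equation above. The function names and the two sample inputs are illustrative, not taken from the underlying data set.

```python
import math


def logistic_term(intercept, slope, x):
    """One bracketed term of the equation: 1 / (1 + exp(intercept + slope * x))."""
    return 1.0 / (1.0 + math.exp(intercept + slope * x))


def playoff_probability(o2p, oast):
    """P(playoffs) from opponent 2-pointers made and opponent assists per game."""
    return (0.49 * logistic_term(-7.6683, 0.2489, o2p)
            + 0.51 * logistic_term(-9.1835, 0.4211, oast))


def makes_playoffs(o2p, oast, threshold=0.5):
    """Classify a team as a playoff team if its probability exceeds the threshold."""
    return playoff_probability(o2p, oast) > threshold


# Two hypothetical teams: one stingy on defense, one permissive.
for o2p, oast in [(20.0, 20.0), (30.0, 28.0)]:
    p = playoff_probability(o2p, oast)
    print(f"o2P = {o2p}, oAST = {oast}: P = {p:.4f}, playoffs = {makes_playoffs(o2p, oast)}")
```

Both slopes enter the exponentials with a positive sign, so the probability falls as either opponent statistic rises.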

When Is It Optimal to Shoot a 3-Point Shot?

A very interesting result: by computing players’ payoffs, one obtains the following diagram, which shows when it is optimal for a player to shoot a 2-point or a 3-point shot. One sees that it is hardly ever optimal for a player to shoot a 3-point shot, since the region corresponding to 3-point optimality is quite narrow. This can be interpreted as saying that for a 3-point attempt to be optimal, a player’s 2PT% must be roughly equal to his/her 3PT%, which is certainly not the case for the vast majority of even designated 3-point shooters in the NBA!

[Figure: regions where shooting a 2-point or a 3-point shot is optimal]
The grey region is where shooting a 3-point shot is optimal, the blue region is where shooting a 2-point shot is optimal, and the red line that separates these regions is where the payoff is equivalent in both approaches.

What if Michael Jordan Played in Today’s NBA?

By: Dr. Ikjyot Singh Kohli

It seems that one cannot turn on ESPN or any YouTube channel nowadays without running into the ongoing debate over whether Michael Jordan is better than LeBron, what would happen if Michael Jordan played in today’s NBA, and so on. However, I have not seen a single scientific approach to this question. Admittedly, it is close to an impossible question to answer, but using data science I will try.

From a data science perspective, it only makes sense to look at Michael Jordan’s performance in a single season, and try to predict based on that season how he would perform in the most recent NBA season. That being said, let’s look at Michael Jordan’s game-to-game performance in the 1995-1996 NBA season when the Bulls went 72-10.

Using neural networks and Garson’s algorithm to regress against Michael Jordan’s per-game point total, we note the following:

[Figure: variable importance plot for Michael Jordan’s per-game points]
In this plot, the “o” stands for opponent.

One can see from this variable importance plot that Michael’s points in a given game were most positively associated with teams that committed a high number of turnovers, followed by teams that made a lot of 3-point shots. Interestingly, there was not a strong negative factor on Michael’s points in a given game.
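For readers unfamiliar with it, Garson’s algorithm apportions a trained network’s connection weights among its inputs. The sketch below implements one commonly quoted form for a network with a single hidden layer and a single output, ignoring bias terms. Published variants differ in how they normalize, and the post does not say which implementation was used. The weights are made-up numbers, not those of the network described here.

```python
def garson_importance(w_in_hidden, w_hidden_out):
    """Relative importance of each input (one common form of Garson's algorithm).

    w_in_hidden[i][j]: weight from input i to hidden unit j.
    w_hidden_out[j]:   weight from hidden unit j to the single output.
    """
    n_inputs = len(w_in_hidden)
    n_hidden = len(w_hidden_out)
    scores = [0.0] * n_inputs
    for j in range(n_hidden):
        # Total absolute weight flowing into hidden unit j.
        col_total = sum(abs(w_in_hidden[i][j]) for i in range(n_inputs))
        for i in range(n_inputs):
            # Input i's share of hidden unit j, scaled by that unit's output weight.
            share = abs(w_in_hidden[i][j]) / col_total
            scores[i] += share * abs(w_hidden_out[j])
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical weights for a 2-input, 2-hidden-unit network (illustration only).
weights_in = [[0.5, -1.0],
              [1.5, 1.0]]
weights_out = [2.0, -1.0]

importance = garson_importance(weights_in, weights_out)
print(importance)
```

Because this form works with absolute values, it returns unsigned shares that sum to one.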

Given this information, and the per-game league averages of the 2017 season, we used this neural network to predict how many points Michael would average in the 2017 season:

Michael Jordan: 2017 NBA Season Prediction: 32.91 Points / Game (+/- 6.9)

It is interesting to note that Michael averaged 30.4 Points/Game in the 1995-1996 NBA Season. We therefore conclude that the 1995-1996 Michael would average more points per game if he played in today’s NBA.

As an aside, a plot of the neural network used to generate these variable importance plots and predictions is as follows:

[Figure: diagram of the neural network]

What about the reverse question? What if the 2016-2017 LeBron James played in the 1995-1996 NBA? What would happen to his per-game point average? Using the same methodology as above, we used neural networks in combination with Garson’s algorithm to obtain a variable importance plot for LeBron James’ per-game point totals:

[Figure: variable importance plot for LeBron James’ per-game points]

One sees from this plot that LeBron’s points in a given game were most positively associated with teams that committed a lot of personal fouls, followed by teams that got a lot of offensive rebounds. There were no strong negative factors affecting LeBron’s ability to score.

Using this neural network model, we then predicted how many points per game LeBron would score if he played in the 1995-1996 NBA Season:

LeBron James: 1995-1996 NBA Season Prediction: 18.81 Points / Game (+/- 4.796)

This neural network model predicts that LeBron James would average 18.81 Points/Game if he played in the 1995-1996 NBA season, which is a drop from the 26.4 Points/Game he averaged this most recent NBA season.

Therefore, at least from this neural network model, one concludes that LeBron’s per-game points would decrease if he played in the 1995-1996 Season, while Michael’s number would increase slightly if he played in the 2016-2017 Season.

So, What’s Wrong with the Knicks?

By: Dr. Ikjyot Singh Kohli

As I write this post, the Knicks are 12th in the Eastern Conference with a record of 22-32. A plethora of people are offering their opinions on what is wrong with the Knicks and, since most of it comes from ESPN and the New York media, most of it is incorrect or useless. Here are some examples:

  1. The Bulls are following the Knicks’ blueprint for failure and …
  2. Spike Lee ‘still believes’ in Melo, says time for Phil Jackson to go
  3. 25 reasons being a New York Knicks fan is the most depressing …
  4. Carmelo Anthony needs to escape the Knicks
  5. Another Awful Week for Knicks

A while ago, I wrote this paper based on statistical learning that shows the common characteristics for NBA playoff teams. Basically, I obtained the following important result:

[Figure: classification tree for NBA playoff teams]

This classification tree, along with the arguments in the paper, shows that while the most important factor in teams making the playoffs tends to be the opponent number of assists per game, there are paths to the playoffs where teams are not necessarily strong in this area. Specifically, for the Knicks, as of today, we see that:

opp. Assists / game: 22.4 > 20.75, STL / game: 7.2 < 8.0061, TOV / game: 14.1 < 14.1585, DRB / game: 33.8 > 29.9024, opp. TOV / game: 13.0 < 13.1585.

So, one sees that what is keeping the Knicks out of the playoffs is specifically pressure defense: they are not forcing enough turnovers per game. Ironically, they are very close to the threshold, but it is not enough.
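To see how close each statistic sits to its split, the comparisons quoted above can be tabulated. This sketch only restates the five value/threshold pairs from the text and does not reproduce the tree’s actual branching structure, which is shown in the figure.

```python
# (statistic, Knicks value, tree threshold), as quoted in the text above.
comparisons = [
    ("opp. AST/game", 22.4, 20.75),
    ("STL/game", 7.2, 8.0061),
    ("TOV/game", 14.1, 14.1585),
    ("DRB/game", 33.8, 29.9024),
    ("opp. TOV/game", 13.0, 13.1585),
]

# Signed distance of the Knicks' value from each threshold.
gaps = {name: round(value - threshold, 4) for name, value, threshold in comparisons}

for name, gap in gaps.items():
    side = "above" if gap > 0 else "below"
    print(f"{name:14s} {abs(gap):7.4f} {side} threshold")
```

The opponent-turnover gap is the second smallest of the five in absolute terms, which is what "very close to the threshold" refers to.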

A probability density approximation of the Knicks’ Opp. TOV/G is as follows:

[Figure: approximate probability density of the Knicks’ opp. TOV/game]

This PDF has the approximate functional form:

P(oTOV) = [Equation image: fitted probability density of the Knicks’ opp. TOV/game]

Therefore, by computing:

\int_{A}^{\infty} P(oTOV) d(oTOV) = [Equation image: closed-form result in terms of Erfc],

where Erfc is the complementary error function, and is given by:

erfc(z) = \frac{2}{\sqrt{\pi}} \int_{z}^{\infty} e^{-t^2} dt

 

Given that the threshold for playoff-bound teams is more than 13.1585 opp. TOV/game, setting A = 13 above, we obtain 0.435. This means that the Knicks have roughly a 43.5% chance of forcing more than 13 TOV in any single game. Similarly, setting A = 14, one obtains 0.3177. This means that the Knicks have roughly a 31.77% chance of forcing more than 14 TOV in any single game, and so forth.

Therefore, one concludes that while the Knicks’ problems are defensive in nature, they are specifically related to pressure defense and forcing turnovers.