And, as usual, I received many criticisms from “experts” who just looked at the raw numbers for each player and concluded that there is no way such a statement is justified. But it is not that simple!
When you compare two players (or two objects) whose feature values are very different, it is not that they can’t be compared; you must first normalize the data so that the two sets become comparable.
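One simple way to put two differently scaled sets of numbers on a common footing is the z-score (subtract the mean, divide by the standard deviation). The sketch below is only an illustration of that idea; the function name and the sample values are made up for the example and are not the players' actual statistics, nor necessarily the normalization used for the plot in this post.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Rescale a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical per-season values for one statistic (NOT real data).
player_a = [10, 12, 14]
player_b = [20, 24, 28]

# The raw numbers differ by a factor of two, but the normalized
# season-to-season shapes can now be compared directly.
print(z_scores(player_a))
print(z_scores(player_b))
```

After normalization, both series describe the same season-to-season pattern, which is the sense in which two players with very different raw numbers can still be compared.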
In this case, I used the data from Basketball-Reference.com to compare Chris Jackson’s 6 seasons in Denver to Stephen Curry’s last 6 seasons (including this one) and took into account 45 different statistical measures, and came up with the following correlation matrix/similarity matrix plot:
Dark blue circles indicate a strong correlation, while dark red circles indicate a weak correlation between two sets of features.
What would be of interest in an analysis like this is to examine the diagonal of this matrix, which offers a direct comparison between the two players:
One can see that there are many features that have strong correlation coefficients.
Therefore, it is true that Stephen Curry and Chris Jackson do in fact share many strong similarities!
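The diagonal comparison described above can be sketched as a feature-by-feature Pearson correlation between the two players' season series. This is a minimal illustration under assumptions: the dictionary layout, the function names, and the sample values are hypothetical and are not taken from the actual data set.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined for a constant series
    return cov / (sx * sy)


def diagonal_correlations(player_a, player_b):
    """Correlate each shared feature of player A with the same feature of player B.

    Both arguments map a feature name to a list of per-season values.
    """
    return {
        feature: pearson(player_a[feature], player_b[feature])
        for feature in player_a
        if feature in player_b
    }


# Hypothetical six-season series for two features (NOT real data).
a = {"3P%": [0.35, 0.36, 0.39, 0.38, 0.40, 0.39], "AST": [3.1, 2.4, 4.2, 4.5, 3.6, 6.8]}
b = {"3P%": [0.44, 0.45, 0.45, 0.42, 0.44, 0.45], "AST": [5.8, 5.3, 6.9, 8.5, 7.7, 6.7]}
print(diagonal_correlations(a, b))
```

Because the Pearson coefficient is unchanged by shifting or rescaling either series, this comparison looks at how each statistic moves from season to season rather than at its raw level.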
I’ve been interested for some time in finding an analytical way to determine what characterizes an NBA team as a playoff team. Looking at the previous six seasons, I pulled together almost 65 different statistics that characterize how a team plays, and then performed a classification tree analysis. I found the following result:
For the above tree, the misclassification error rate was 2.73%. Here, MOV stands for margin of victory, o3PA is the number of opponent three-point attempts per game, DRtg is defensive rating (the number of points a team allows per 100 possessions), and so on. The data itself was taken from Basketball-Reference.com.
We see that the following patterns emerge among NBA playoff teams over the past number of seasons.
The vast majority of NBA analysts claim today that the NBA has changed: it has become more fast-paced, and there is a significantly greater emphasis on teams attempting more three-point shots. The evidence usually offered is the oft-repeated fact that, over the last number of years, the average three-point attempt rate has increased. An example of such an article can be found here.
It is my hypothesis that this is all based on a very shallow analysis of what is actually going on. In particular, there are more than 60 variables on Basketball-Reference.com that classify each team’s play. It seems strange that analysts have picked out one statistic, noticed a trend, and have made conclusions ushering in the “modern-day” NBA. As I will demonstrate below, using concepts from statistical and machine learning, many things have been missed in their analyses. What is even more strange is that there have been an increasing number of articles claiming that, for example, if teams do not shoot more three point shots, they will probably not make the playoffs or win a championship. Examples of such articles can be found here, here, and here.
I will now demonstrate why all of these analyses are incomplete, and why their conclusions are wholly incorrect.
Using the great service provided by Basketball-Reference.com, I looked at the last 15 seasons of every NBA team, looking at more than 60 predictor variables that classified each team’s performance in the season. Some of these included: MP FG FGA FG% 3P 3PA 3P% 2P 2PA 2P% FT FTA FT% ORB DRB TRB AST STL BLK TOV PF PTS PTS/G oG oMP oFG oFGA oFG% o3P o3PA o3P% o2P o2PA o2P% oFT oFTA oFT% oORB oDRB oTRB oAST oSTL oBLK oTOV oPF oPTS oPTS/G MOV SOS SRS ORtg DRtg Pace FTr 3PAr TOV% ORB% FT/FGA TOV% DRB% FT/FGA, where a small “o” indicates a team’s opponent’s statistics.
What classifies a playoff team?
Building a classification tree, I wanted to analyze what factors specifically lead to a team making the playoffs in a given season. I found the following:
(For this classification tree, the misclassification error rate was 2.73% indicating a good fit to the data.)
At the top of the tree, we see that the distinguishing factor is the average MOV (“Margin of Victory”), measured per game. Teams that on average beat their opponents by more than 2.695 points are predicted to make the playoffs, while teams that on average lose by more than 1.825 points are predicted to not make the playoffs. Further, the only factor relating to three-point shooting in this entire classification tree is o3PA, the number of opponent three-point attempts per game. For example, suppose a team has an average MOV of less than -0.54 but greater than -1.825. If that team’s opponents attempt more than 16.0732 three-point shots per game, the team is expected to make the playoffs. In this particular case, getting your opponent to take a lot of three-point shots is indeed desirable, and leads to the expectation of a team making the playoffs.
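The branches described in the paragraph above can be written out as explicit rules. The sketch below encodes only the splits mentioned in the text; the function names are made up for illustration, and any region of the tree not described here returns None rather than a guess. A small helper shows how a misclassification rate such as the 2.73% quoted above is computed.

```python
def predict_playoffs(mov, opp_3pa):
    """Apply only the tree branches described in the text.

    mov      -- average margin of victory per game
    opp_3pa  -- opponent three-point attempts per game (o3PA)
    Returns True (playoffs), False (no playoffs), or None when the
    text does not describe the branch that applies.
    """
    if mov > 2.695:
        return True
    if mov < -1.825:
        return False
    if -1.825 < mov < -0.54 and opp_3pa > 16.0732:
        return True
    return None  # remaining branches are only visible in the tree figure


def misclassification_rate(predicted, actual):
    """Fraction of cases where the prediction differs from the outcome."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


print(predict_playoffs(5.0, 14.0))   # comfortable positive margin
print(predict_playoffs(-1.0, 18.0))  # slightly negative margin, many opponent threes
```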
What classifies a championship team?
The next question to analyze is what characteristics/features classify a championship team. Looking at the last 20 years of playoff data, we see that the following classification tree describes the championship criteria for a given NBA playoff team.
(The learning error rate was 1.172% indicating an excellent fit to the data). One sees that at the very top is a team opponent’s field goal percentage (OFG.). If the average per game OFG% is greater than 44.95%, that team is predicted to not win a championship. Further, there are apparently three predicted paths to a championship:
OFG% < 44.95 –> ORtg (Opponent Team Points Scored per 100 possessions) < 108.55 –> FT% < 73.5% –> Opponent Offensive Rebounds per game (OORB) < 30.2405 –> Personal Fouls per game (PF) < 24.1467
This shows once again that the three point shot is not at all relevant in winning a championship amongst playoff teams, in that, shooting a lot of threes, or playing as a “modern” team, does not uniquely determine a team’s success. What is tremendously important is defense, and offensive efficiency, and there are multiple ways to achieve this. One does not need to be a prolific three-point shooting team to achieve these metrics.
The increasing trend of teams shooting more threes and playing at a higher pace still does not uniquely determine whether a team will make the playoffs or win a championship, which is why I have called it a “delusion”. Indeed, the common statement that “nowadays, teams that make the playoffs also have the highest number of three-point shot attempts” is a very shallow statement, and is not actually why teams make the playoffs as this analysis very clearly shows. Further, attempting more three-point shots is not at all uniquely indicative of a team’s success in winning a championship.
The first thing to note is that just by looking at Basketball-Reference.com there are 62 factors that uniquely classify a team: MP FG FGA FG% 3P 3PA 3P% 2P 2PA 2P% FT FTA FT% ORB DRB TRB AST STL BLK TOV PF PTS OMP OFG OFGA OFG% O3P O3PA O3P% O2P O2PA O2P% OFT OFTA OFT% OORB ODRB OTRB OAST OSTL OBLK OTOV OPF OPTS PW PL MOV SOS SRS ORtg DRtg Pace FTr 3PAr eFG% TOV% ORB% FT/FGA eFG% TOV% DRB% FT/FGA, where OFGA indicates a given team’s opponent’s FGA per game average for a specific season.
The reason it is not meaningful to look at a specific statistic or a pair of statistics such as “three-point attempt rate” is that,
possible comparisons can be made.
Because of this, what is required is a detailed statistical learning approach. I looked at the full season statistics for the last twenty NBA champions from the 1995-1996 Chicago Bulls to the 2014-2015 Golden State Warriors.
I employed principal component analysis (PCA) to reduce the number of dimensions and to see which variables contribute most to the variance of the data set. I found that the first 7 of 20 principal components explained 88.52% of the variance. Therefore, we can effectively reduce the dimension of the data set from 63 to 7. This can be seen in the scree plot below:
A visualization of the 63-variable data set is as follows:
Principal component analysis reduced this high-dimensional data set to a more manageable (but perhaps still complicated) 7-dimensional data set, visualized as follows:
Next, I performed hierarchical clustering on these seven principal components, using the Euclidean distance as the metric. I obtained the following result:
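As a rough sketch of how the number of retained components can be read off a scree plot, the function below takes the variances (eigenvalues) of the principal components and returns the smallest number of components whose cumulative share of the variance reaches a chosen threshold. The function name, the threshold, and the eigenvalues are hypothetical; they are not the values from this analysis.

```python
def components_needed(eigenvalues, target=0.85):
    """Smallest k such that the top-k components explain >= target of the variance.

    Returns (k, explained_fraction).
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k, running / total
    return len(ordered), 1.0


# Hypothetical eigenvalues (NOT from the championship data set).
k, explained = components_needed([4.0, 3.0, 2.0, 1.0], target=0.85)
print(k, explained)
```

The same cumulative-variance reasoning is what supports keeping 7 components when they account for 88.52% of the variance.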
We notice immediately that:
The 2015 Golden State Warriors were very similar to the 2014 San Antonio Spurs.
Not surprisingly, Phil Jackson’s 2000 and 2002 Lakers teams were very similar to each other but not to any other championship team, and similarly for his 2009 and 2010 Lakers teams.
Interestingly, the two teams that stand out which are truly dissimilar to any other championship team are the 2008 Boston Celtics and the 1998 Chicago Bulls.
This analysis also eliminates the notion that a team has to play a specific style, for example “modern-day play” to win a championship. In principle, there are many possible ways and styles that lead to a championship and an analysis such as this deeply probing the data shows this to be the case.
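The clustering step described above can be sketched as a bottom-up (agglomerative) procedure on Euclidean distances: start with every team as its own cluster and repeatedly merge the two closest clusters. The linkage rule is not stated in this post, so average linkage is used here purely as an assumption; the function name, team labels, and coordinates are also hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def agglomerate(points, labels):
    """Average-linkage hierarchical clustering.

    points -- list of coordinate tuples (e.g. principal-component scores)
    labels -- one name per point
    Returns the merge history as (members_a, members_b, distance) tuples.
    """
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average pairwise distance between the two clusters' members.
            d = sum(
                dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            ) / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append(
            (sorted(labels[i] for i in clusters[a]),
             sorted(labels[i] for i in clusters[b]),
             d)
        )
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


# Hypothetical 2-D scores for four teams (NOT real data).
teams = ["team_a", "team_b", "team_c", "team_d"]
scores = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 12.0)]
for left, right, distance in agglomerate(scores, teams):
    print(left, right, round(distance, 3))
```

Teams that merge early (at a small distance) are the ones that sit next to each other in a dendrogram, while a team that only joins at the very end is dissimilar to all the others.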
The purpose of this post is to determine whether an offensive strategy that predominantly involves shooting three-point shots is stable and optimal. We employ a game-theoretical approach, using techniques from dynamical systems theory, to show that relying predominantly on three-point shots is not necessarily optimal. Whether it is depends on a combination of payoff constraints: using the global stability of equilibrium points together with Nash equilibria, one can establish conditions under which a predominantly two-point offensive strategy is optimal as well. We perform a detailed fixed-points analysis to establish the local stability of a given offensive strategy, and finally prove the existence of Nash equilibria using global stability techniques based on the monotonicity principle. We believe this work demonstrates that the claim that teams should attempt more three-point shots because a three-point shot is worth more than a two-point shot is highly ambiguous.
We are currently living in the age of analytics in professional sports, with a strong trend of their use developing in professional basketball. Indeed, perhaps one of the most discussed results to come out of the analytics era thus far is the claim that teams should shoot as many three-point shots as possible, largely because three-point shots are worth more than two-point shots, and this is somehow indicative of a very efficient offense. These ideas were mentioned, for example, by Alex Rucker, who said “When you ask coaches what’s better between a 28 percent three-point shot and a 42 percent midrange shot, they’ll say the 42 percent shot. And that’s objectively false. It’s wrong. If LeBron James just jacked a three on every single possession, that’d be an exceptionally good offense. That’s a conversation we’ve had with our coaching staff, and let’s just say they don’t support that approach.” It was also claimed in the same article that “The analytics team is unanimous, and rather emphatic, that every team should shoot more 3s including the Raptors and even the Rockets, who are on pace to break the NBA record for most 3-point attempts in a season.” These assertions were repeated here. In an article by John Schuhmann, it was claimed that “It’s simple math. A made three is worth 1.5 times a made two. So you don’t have to be a great 3-point shooter to make those shots worth a lot more than a jumper from inside the arc. In fact, if you’re not shooting a layup, you might as well be beyond the 3-point line. Last season, the league made 39.4 percent of shots between the restricted area and the arc, for a value of 0.79 points per shot. It made 36.0 percent of threes, for a value of 1.08 points per shot.” The purpose of this paper is to determine whether an offensive strategy that predominantly involves shooting three-point shots is stable and optimal.
We will employ a game-theoretical approach, using techniques from dynamical systems theory, to show that relying predominantly on three-point shots is not necessarily optimal. Whether it is depends on a combination of payoff constraints: using the global stability of equilibrium points together with Nash equilibria, one can establish conditions under which a predominantly two-point offensive strategy is optimal as well. (Article research and other statistics provided by: Hargun Singh Kohli)
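The points-per-shot figures in the quotes above are simple expected values: the probability of making the shot multiplied by the points it is worth. The short sketch below reproduces that arithmetic using only the percentages quoted in the introduction; the function name is made up for illustration.

```python
def points_per_shot(make_probability, points):
    """Expected points from one attempt: P(make) * value of the shot."""
    return make_probability * points


# League-wide figures quoted from the Schuhmann article.
midrange = points_per_shot(0.394, 2)   # shots between the restricted area and the arc
threes = points_per_shot(0.360, 3)

# The two shots in the Rucker quote.
three_at_28 = points_per_shot(0.28, 3)
midrange_at_42 = points_per_shot(0.42, 2)

print(round(midrange, 3), round(threes, 3))
print(round(three_at_28, 3), round(midrange_at_42, 3))
```

The first pair comes out at roughly 0.79 and 1.08, matching the quoted values. On expected points alone, the 28 percent three and the 42 percent midrange shot in the second pair work out to the same value, 0.84 points per attempt, which already suggests that the comparison depends on the payoffs involved rather than on the shot type by itself.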
2. The Dynamical Equations
For our model, we consider two types of NBA teams. The first type are teams that employ two point shots as the predominant part of their offensive strategy, while the other type consists of teams that employ three-point shots as the predominant part of their offensive strategy. There are therefore two predominant strategies, which we will denote as , such that we define
We then let represent the number of teams using , such that the total number of teams in the league is given by
which implies that the proportion of teams using strategy is given by
The state of the population of teams is then represented by . It can be shown that the proportions of individuals using a certain strategy change in time according to the following dynamical system
where we have defined the average payoff function as
Now, let represent the proportion of teams that predominantly shoot two-point shots, and let represent the proportion of teams that predominantly shoot three-point shots. Further, denoting the game action set to be , where represents a predominant two-point shot strategy, and represents a predominant three-point shot strategy. As such, we assign the following payoffs:
We therefore have that
From (6), we further have that
From Eq. (4) the dynamical system is then given by
subject to the constraint
Indeed, because of the constraint (10), the dynamical system is actually one-dimensional, which we write in terms of as
From Eq. (11), we immediately notice some things of importance. First, we are able to deduce just from the form of the equation what the invariant sets are. We note that for a dynamical system with flow , if we define a function such that , where , then, the subsets of defined by , and are invariant sets of the flow . Applying this notion to Eq. (11), one immediately sees that , , and are invariant sets of the corresponding flow. Further, there also exists a symmetry such that , which implies that without loss of generality, we can restrict our attention to .
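To make the one-dimensional dynamics concrete, here is a minimal numerical sketch of the standard two-strategy replicator equation. The notation is an assumption for illustration and is not necessarily the notation used in this paper: x is the proportion of teams playing a predominantly two-point strategy, and a, b, c, d are hypothetical payoffs (a: two-point vs two-point, b: two-point vs three-point, c: three-point vs two-point, d: three-point vs three-point). With these, the replicator equation reduces to dx/dt = x(1 - x)[(a - c)x + (b - d)(1 - x)], which is integrated below with a simple Euler step.

```python
def replicator_rhs(x, a, b, c, d):
    """dx/dt for the share x of teams using the two-point strategy.

    a, b, c, d are hypothetical payoffs (see the lead-in); they are
    assumptions for illustration, not values from the paper.
    """
    payoff_two = a * x + b * (1.0 - x)     # expected payoff of a two-point team
    payoff_three = c * x + d * (1.0 - x)   # expected payoff of a three-point team
    return x * (1.0 - x) * (payoff_two - payoff_three)


def simulate(x0, a, b, c, d, dt=0.01, steps=20000):
    """Forward-Euler integration of the replicator equation from x0."""
    x = x0
    for _ in range(steps):
        x += dt * replicator_rhs(x, a, b, c, d)
        x = min(1.0, max(0.0, x))  # keep the proportion inside [0, 1]
    return x


# Two-point play always pays more: the league drifts to all two-point teams.
print(simulate(0.5, a=3, b=2, c=1, d=1))
# Three-point play always pays more: the league drifts to all three-point teams.
print(simulate(0.5, a=1, b=1, c=3, d=2))
# Each strategy does better against the other one: a mixed league persists.
print(simulate(0.1, a=1, b=2, c=2, d=1))
```

Note that x = 0 and x = 1 make the right-hand side vanish for any payoffs, so a league that starts with all teams on one strategy stays there; which of these states (or an interior mix) attracts nearby states depends entirely on the payoff values.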
3. Fixed-Points Analysis
With the dynamical system in hand, we are now in a position to perform a fixed-points analysis. There are precisely three fixed points, which are invariant manifolds and are given by:
Note that, actually contains and as special cases. Namely, when , , and when , . We will therefore just analyze, the stability of . represents a state of the population where all teams predominantly shoot three-point shots. Similarly, represents a state of the population where all teams predominantly shoot two-point shots, We additionally restrict
which implies the following conditions on the payoffs:
With respect to a stability analysis of , we note the following. The point is a: • Local sink if: , • Source if: , • Saddle: if: , or .
What this last calculation shows is that the condition which always corresponds to the point , which corresponds to a dominant 3-point strategy always exists as a saddle point! That is, there will NEVER be a league that dominantly adopts a three-point strategy, at best, some teams will go towards a 3-point strategy, and others will not irrespective of what the analytics people say. This also shows that a team's basketball strategy really should depend on its respective payoffs, and not current "trends". This behaviour is displayed in the following plot.
Further, the system exhibits some bifurcations as well. In the neighbourhood of , the linearized system takes the form
Therefore, destabilizes the system at . Similarly, destabilizes the system at . Therefore, bifurcations of the system occur on the lines and in the four-dimensional parameter space.
4. Global Stability and The Existence of Nash Equilibria
With the preceding fixed-points analysis completed, we are now interested in determining global stability conditions. The main motivation is to determine the existence of any Nash equilibria that occur for this game via the following theorem: If is an asymptotically stable fixed point, then the symmetric strategy pair , with is a Nash equilibrium. We will primarily make use of the monotonicity principle, which says let be a flow on with an invariant set. Let be a function whose range is the interval , where , and . If is decreasing on orbits in , then for all ,
Consider the function
Then, we have that
For the invariant set , we have that . One can then immediately see that in ,
Therefore, by the monotonicity principle,
Note that the conditions and correspond to above. In particular, for , , which implies that is globally stable. Therefore, under these conditions, the symmetric strategy is a Nash equilibrium. Now, consider the function
We can therefore see that
Clearly, in if for example and . Then, by the monotonicity principle, we obtain that
Note that the conditions and correspond to above. In particular, for , , which implies that is globally stable. Therefore, under these conditions, the symmetric strategy is a Nash equilibrium. In summary, we have just shown that for the specific case where and , the strategy is a Nash equilibrium. On the other hand, for the specific case where and , the strategy is a Nash equilibrium.
5. Discussion
In the previous section which describes global results, we first concluded that for the case where and , the strategy is a Nash equilibrium. The relevance of this is as follows. The condition on the payoffs thus requires that
That is, given the strategy adopted by the other team, neither team could increase their payoff by adopting another strategy if and only if the condition in (23) is satisfied. Given these conditions, if one team has a predominant two-point strategy, it would be the other team’s best response to also use a predominant two-point strategy. We also concluded that for the case where and , the strategy is a Nash equilibrium. The relevance of this is as follows. The condition on the payoffs thus requires that
That is, given the strategy adopted by the other team, neither team could increase their payoff by adopting another strategy if and only if the condition in (24) is satisfied. Given these conditions, if one team has a predominant three-point strategy, it would be the other team’s best response to also use a predominant three-point strategy. Further, we also showed that is globally stable under the conditions in (23). That is, if these conditions hold, every team in the NBA will eventually adopt an offensive strategy predominantly consisting of two-point shots. The conditions in (24) were shown to imply that the point is globally stable. This means that if these conditions hold instead, every team in the NBA will eventually adopt an offensive strategy predominantly consisting of three-point shots. Through a careful stability analysis of the fixed points, we also provided criteria for the local stability of strategies. For example, we showed that a predominant three-point strategy is locally stable if , while it is unstable if . In addition, a predominant two-point strategy was found to be locally stable when , and unstable when . There is also the key point of which one of these strategies has the highest probability of being executed. We know that
That is, the payoff to a team using strategy in a league with profile is proportional to the probability of this team using strategy . We therefore see that a team’s optimal strategy would be that for which they could maximize their payoff, that is, for which is a maximum, while keeping in mind the strategy of the other team, hence, the existence of Nash equilibria. Hopefully, this work also shows that the concept that teams should attempt more three-point shots because a three-point shot is worth more than a two-point shot is a highly ambiguous statement. In actuality, one needs to analyze what offensive strategy is optimal which is constrained by a particular set of payoffs.
It is without question that the greatest team in NBA history was the 1995-1996 Chicago Bulls. They went 72-10 that year and went on to win the NBA Championship against a top-notch Seattle SuperSonics team.
Phil Jackson’s system and first-class coaching were the major reasons why the Bulls were so good, but I wanted to analyze their reason for winning using data science methodologies.
The results that I found were very interesting. First, I mined through each individual game’s data to obtain patterns in the Bulls’ wins and losses, and this is what I found:
One sees that the Bulls were a defensive nightmare, and if you look at these results in detail, it makes sense that the Sonics were really the only team that ever posed a threat to them. This shows that to beat the Bulls, the opposing team would have to simultaneously:
Ensure Ron Harper had a FG% less than 44.95% in a game,
Ensure Dennis Rodman would have less than 17 total rebounds in a game,
Ensure Luc Longley had less than 2 blocks in a game,
Ensure Michael Jordan had a FG% less than 46.55% in a game.
If any one of these conditions were not met, the Bulls would win!
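The rule above is a conjunction: a loss is predicted only when all four conditions hold in the same game. A minimal sketch of that logic follows, using the thresholds listed above; the function name and the sample box-score values are hypothetical.

```python
def bulls_predicted_to_lose(harper_fg_pct, rodman_rebounds, longley_blocks, jordan_fg_pct):
    """True only if every one of the four conditions from the list is met."""
    return (
        harper_fg_pct < 44.95
        and rodman_rebounds < 17
        and longley_blocks < 2
        and jordan_fg_pct < 46.55
    )


# Hypothetical single-game lines (NOT real box scores).
print(bulls_predicted_to_lose(40.0, 12, 1, 42.0))  # all four conditions met
print(bulls_predicted_to_lose(40.0, 18, 1, 42.0))  # Rodman reaches 17+ rebounds
```

Breaking any single condition is enough to flip the prediction back to a Bulls win, which is why an opponent had to contain all four players at once.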
This analysis, on some level, also dispels the notion espoused by several sports analysts, like Skip Bayless of ESPN, who continually claim that the Bulls’ sole reason for success was Michael Jordan. Ron Harper’s contributions, although of paramount importance, are rarely mentioned nowadays.
This analysis also shows that the key to the success of the Bulls was not necessarily the number of points that Jordan scored, but the incredible efficiency with which he scored them.
A boosting algorithm also allows us to deduce the most important characteristics in the Bulls’ quality of play and whether they would win or lose a game. The results are as follows:
We see that a key feature of the Bulls’ quality of play is how efficient Ron Harper was in terms of his FG%.
It is quite interesting that this analysis shows that winning a championship is not about one player. Sure, every team needs great players, but the Bulls were a great team, consisting of many great components working together.