Live Metrics for NBA Games

Yesterday, for the first time, I used the playoff game between Cleveland and Toronto to test a script I wrote in R that tracks key statistics during a game in real time (well, every 30 seconds). Based on previous work, championship-calibre teams are the ones that have excellent 2PT-FG% and the ability to draw fouls, so I tracked these during the game and produced the following plot of several time series:


One sees for example that while Toronto started off the game with a much higher 2PT FG%, towards the end Cleveland ended up winning that battle.

A video of this animation is as follows (set the YouTube player to 1080p + FullScreen for Max Quality!)

An interesting question to ask is how these series are correlated. Well, let’s see:

corrplot
In this correlation plot, “pd” indicates point difference, “PF” indicates personal fouls, “2PFG.” indicates 2-Point field goal percentage.
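As a rough sketch of what sits behind such a correlation plot, the snippet below computes pairwise Pearson correlations between a few series. It is written in Python rather than the R script used for this post, the series names follow the labels above, and the sample values are made up purely for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(series):
    """Pairwise correlations for a dict mapping series name -> list of samples."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}

# Hypothetical 30-second samples (not the actual game data).
samples = {
    "CLE.pd": [-4, -2, 1, 3, 6, 9],
    "TOR.PF": [1, 2, 4, 5, 7, 9],
    "CLE.2PFG.": [0.40, 0.42, 0.45, 0.47, 0.50, 0.52],
}
corr = correlation_matrix(samples)
for name, row in corr.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

Each entry lies between -1 and 1, and the matrix is symmetric with ones on the diagonal.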

One sees immediately from the correlation plot above that there is a very strong correlation between Cleveland’s point difference and Toronto’s personal fouls, with some strong correlations attributed to Cleveland’s 2-Point FG% as well. The equal and opposite is true for Toronto’s point difference. It seems that during a playoff game of this intensity, drawing fouls, combined with 2-Point field goal percentage, is a very important factor in determining which team leads and eventually wins the game.

 

 

 

What Do NBA Playoff Teams Have in Common?

I’ve been interested for some time in finding an analytical way to determine what characterizes an NBA team as a playoff team. Looking at the previous six seasons, I pulled together almost 65 different statistics that characterize how a team plays, and then performed a classification tree analysis. I found the following result:

  
For the above tree, the misclassification error rate was 2.73%. Also, MOV stands for margin of victory, o3PA is the number of opponent three-point attempts per game, DRtg is defensive rating (the number of points a team allows per 100 possessions), and so on. The data itself was taken from Basketball-Reference.com.

We see that the following patterns emerge among NBA playoff teams over the past number of seasons.

  1. MOV > 2.695
  2. MOV < -0.54, MOV > -1.825, Opponent 3PA > 16.0732, Defensive Rating < 106.05
  3. MOV < -0.54, MOV > -1.825, Opponent 3PA > 16.0732, Defensive Rating > 106.05, FGA < 80.2195
  4. MOV < 2.695, Opponent FGA < 82.0671, MOV < 0.295, Opponent FT > 16.7866
  5. MOV < 2.695, Opponent FGA < 82.0671, MOV > 0.295
  6. MOV < 2.695, Opponent FGA > 82.0671,  Opponent DRB > 29.7683, FGA < 83.128
  7. MOV < 2.695, Opponent FGA > 82.0671,  Opponent DRB > 29.7683, FGA < 83.128, MOV < 2.17
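The patterns above can be read as simple threshold checks. Here is a minimal sketch in Python that encodes the seven rules exactly as listed; the function name and the keys `FGA`, `oFGA`, `oFT` and `oDRB` are my own labels for illustration (only MOV, o3PA and DRtg are named in the tree legend), and the example team is hypothetical.

```python
def matched_rules(team):
    """Return the numbers of the listed playoff patterns that a team satisfies."""
    mov, fga = team["MOV"], team["FGA"]
    o3pa, drtg = team["o3PA"], team["DRtg"]
    ofga, oft, odrb = team["oFGA"], team["oFT"], team["oDRB"]
    rules = {
        1: mov > 2.695,
        2: -1.825 < mov < -0.54 and o3pa > 16.0732 and drtg < 106.05,
        3: (-1.825 < mov < -0.54 and o3pa > 16.0732
            and drtg > 106.05 and fga < 80.2195),
        # MOV < 0.295 already implies MOV < 2.695.
        4: mov < 0.295 and ofga < 82.0671 and oft > 16.7866,
        5: 0.295 < mov < 2.695 and ofga < 82.0671,
        6: mov < 2.695 and ofga > 82.0671 and odrb > 29.7683 and fga < 83.128,
        # MOV < 2.17 already implies MOV < 2.695.
        7: mov < 2.17 and ofga > 82.0671 and odrb > 29.7683 and fga < 83.128,
    }
    return [number for number, holds in rules.items() if holds]

# Hypothetical team averages, not taken from the data set.
team = {"MOV": 1.2, "FGA": 84.0, "o3PA": 20.0, "DRtg": 104.0,
        "oFGA": 80.5, "oFT": 15.0, "oDRB": 31.0}
print(matched_rules(team))
```

A team matching at least one pattern would be labelled a playoff team under this reading of the tree.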

 

Ranking NBA Players

The 2015-2016 NBA season is almost upon us, and as usual, ESPN has been running its #NBArank, in which players are ranked based on the following non-rigorous methodology:

We asked, “Which player will be better in 2015-16?” To decide, voters had to consider both the quality and quantity of each player’s contributions to his team’s ability to win games. More than 100 voters weighed in on nearly 30,000 pairs of players.

Of course, while I suspect this type of thing has to be just for fun, it has generated a great deal of controversy, with many arguments ensuing between fans. For example, Kobe Bryant being ranked 93rd overall in the NBA this year drew a fair deal of criticism from Stephen A. Smith on ESPN First Take.

In general, at least to me, it does not make any sense to rank players from different positions that bring different strengths to a team sport such as basketball. That is, what does it really mean for Tim Duncan to be better than Russell Westbrook (or vice-versa), or Kevin Love to be better than Mike Conley (or vice-versa), etc…

From a mathematical/data science perspective, the only sensible thing to do is to take all the players in the league, and apply a clustering algorithm such as K-means clustering to group players of similar talents and contributions into groups. This is not a trivial thing to do, but it is the sort of thing that data scientists do all the time! For this analysis, I went to Basketball-Reference.com, and pulled out last season’s (2014-2015) per game averages of every player in the league, looking at 25 statistical factors from FGA, FG% to STL, BLK, and TOV. One can see that this is a 25-dimensional problem. 

Our goal then is the following: denoting by C_{1}, ..., C_{K} the sets containing the observations in each cluster, we want to solve the optimization problem

\mbox{minimize}_{C_{1},...,C_{K}} \left\{\sum_{k=1}^{K} W(C_{k})\right\},

where W(C_{k}) is a measure of the variation within cluster C_{k}. We use the squared Euclidean distance to define the within-cluster variation, and then solve:

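\mbox{minimize}_{C_{1},...,C_{K}} \left\{\sum_{k=1}^{K} \frac{1}{|C_{k}|} \sum_{i, i' \in C_{k}} \sum_{j=1}^{p} \left(x_{ij} - x_{i'j}\right)^{2}\right\},

where |C_{k}| is the number of observations in cluster k, x_{ij} is the value of statistic j for player i, and p = 25 is the number of statistical factors.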

The first thing to do is to decide how many clusters we want to use in our solution. This is done by looking at the within sum of squares (WSS) plot:

wssplotball
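To make the WSS idea concrete, here is a small, self-contained Python sketch of K-means (Lloyd's iterations) that reports the within sum of squares for several values of K. It is not the code used for this post: the deterministic farthest-point initialization is a stand-in chosen for reproducibility, and the two-dimensional toy data replace the 25-dimensional player statistics.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic start: each new centroid is the point farthest from those chosen."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean updates."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

def wss(points, labels, centroids):
    """Within sum of squares: squared distance of each point to its centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

# Toy data: three tight groups of nine points each (hypothetical).
points = [(cx + dx, cy + dy)
          for cx, cy in [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
          for dx in (-0.5, 0.0, 0.5)
          for dy in (-0.5, 0.0, 0.5)]

wss_by_k = {k: wss(points, *kmeans(points, k)) for k in range(1, 6)}
for k, value in wss_by_k.items():
    print(k, round(value, 3))
```

On data like these, the WSS drops sharply up to the true number of groups and flattens afterwards, which is the "elbow" one looks for in the plot.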

First, we will use 3 clusters in our K-means solution. We use three clusters to begin with because, based on visual inspection, the data clusters very nicely into 3 clusters. In this case, the between sum of squares versus total sum of squares ratio was 77.0%, indicating a good “fit”. The plots obtained were as follows:
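For readers wondering how the between sum of squares versus total sum of squares ratio is defined, the following Python sketch computes it from a set of points and their cluster labels. The function name and the four sample points are hypothetical; the ratio is (total SS minus within SS) divided by total SS.

```python
def between_total_ratio(points, labels):
    """Share of the total sum of squares explained by the clustering."""
    dim = len(points[0])
    n = len(points)
    grand_mean = [sum(p[j] for p in points) / n for j in range(dim)]
    total_ss = sum((p[j] - grand_mean[j]) ** 2 for p in points for j in range(dim))

    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    within_ss = 0.0
    for members in clusters.values():
        centre = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        within_ss += sum((p[j] - centre[j]) ** 2 for p in members for j in range(dim))

    return (total_ss - within_ss) / total_ss

# Hypothetical example: two well-separated pairs of points.
pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
ratio = between_total_ratio(pts, [0, 0, 1, 1])
print(round(100 * ratio, 1), "%")
```

The ratio always rises as more clusters are used (it reaches 100% when every point is its own cluster), so a high value on its own does not say which K is best.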

3cluster3 3cluster2 3cluster1

The three clusters of players can be found in the following PDF File. Note that the blue circles represent Cluster 1, the red circles represent Cluster 2, and the green circles represent Cluster 3.

Next, we dramatically increase the number of clusters to 20 in our K-means solution.

Performing the K-means clustering, we obtain the following sets of scatter plots. (Note that it is a bit difficult to display a 25×25 plot here, so I have split it into a series of plots. Note also that the between sum of squares versus total sum of squares ratio was 94.8%, indicating a good “fit”.)

clusterplot1

clusterplot4 clusterplot3 clusterplot2

The cluster behaviour can be seen more clearly in three dimensions. We now display some examples:

cluster3d1cluster3d2

 The 20 groups of players we obtained can be seen in the PDF file linked below:

nbastatsnewclusters

The legend for the clusters obtained was:

cluster_legend

Two sample clusters from our analysis are displayed in the table below. It is interesting that the algorithm placed Carmelo Anthony and Kobe Bryant in one cluster, while LaMarcus Aldridge, LeBron James, and Dwyane Wade belong to another.

Group 16 Group 19
Arron.Afflalo.1 Steven.Adams
Carmelo.Anthony LaMarcus.Aldridge
Patrick.Beverley Bradley.Beal
Chris.Bosh Andrew.Bogut
Kobe.Bryant Jimmy.Butler
Jose.Calderon DeMarre.Carroll
Michael.Carter.Williams.1 Michael.Carter.Williams
Darren.Collison Mike.Conley
Goran.Dragic.1 DeMarcus.Cousins
Langston.Galloway Anthony.Davis
Kevin.Garnett DeMar.DeRozan
Kevin.Garnett.1 Mike.Dunleavy
Jeff.Green.2 Rudy.Gay
George.Hill Eric.Gordon
Jrue.Holiday Blake.Griffin
Dwight.Howard Tobias.Harris
Brandon.Jennings Nene.Hilario
Enes.Kanter.1 Jordan.Hill
Michael.Kidd.Gilchrist Serge.Ibaka
Brandon.Knight.1 LeBron.James
Kevin.Martin Al.Jefferson
Timofey.Mozgov.2 Wesley.Johnson
Rajon.Rondo.2 Brandon.Knight
Derrick.Rose Kawhi.Leonard
J.R..Smith.2 Robin.Lopez
Jared.Sullinger Kyle.Lowry
Thaddeus.Young.1 Wesley.Matthews
Luc.Mbah.a.Moute
Khris.Middleton
Greg.Monroe
Donatas.Motiejunas
Joakim.Noah
Victor.Oladipo
Tony.Parker
Chandler.Parsons
Zach.Randolph
Andre.Roberson
Rajon.Rondo
P.J..Tucker
Dwyane.Wade
Kemba.Walker
David.West
Russell.Westbrook
Deron.Williams

If we use more clusters, players will obviously be placed into smaller groups. The following clustering results can be seen in the linked PDF files.

  1. 50 Clusters – (between_SS / total_SS =  97.4 %) – PDF File
  2. 70 Clusters – (between_SS / total_SS =  97.8 %) – PDF File
  3. 100 Clusters – (between_SS / total_SS =  98.3 %) – PDF File
  4. 200 Clusters (extreme case) – (between_SS / total_SS =  99.1 %) – PDF File

I did not include the visualizations for these computations because they are quite difficult to visualize.

Looking at the 100 Clusters file, we see two interesting results:

  • In Cluster 16, we have: Carmelo Anthony, Chris Bosh, Kobe Bryant and Kevin Martin
  • In Cluster 74, we have: LaMarcus Aldridge, Anthony Davis, Rudy Gay, Blake Griffin, LeBron James and Russell Westbrook

CONCLUSIONS:

We therefore see that it does not make much mathematical/statistical sense to compare any two players. In my opinion, the only logical thing to do when ranking players is to decide on rankings within clusters. So, based on the above analysis, it makes sense to ask, for example, whether Carmelo is a better player than Kobe or whether LeBron is a better player than Westbrook, etc… But, based on last season’s statistics, it doesn’t make much sense to ask whether Kobe is a better player than Westbrook, because they have been clustered differently. I think ESPN could benefit tremendously from using a rigorous approach to these sorts of things, which spark many conversations because many people take them seriously.

The “Evolution” of the 3-Point Shot in The NBA

The purpose of this post is to determine whether an offensive strategy that involves predominantly shooting three-point shots is stable and optimal. We employ a game-theoretical approach using techniques from dynamical systems theory to show that taking more three-point shots, to the point where an offensive strategy depends predominantly on shooting threes, is not necessarily optimal. Whether it is optimal depends on a combination of payoff constraints, and one can establish conditions, via the global stability of equilibrium points in addition to Nash equilibria, under which a predominant two-point offensive strategy would be optimal as well. We perform a detailed fixed-points analysis to establish the local stability of a given offensive strategy. Finally, we prove the existence of Nash equilibria via global stability techniques based on the monotonicity principle. We believe that this work demonstrates that the claim that teams should attempt more three-point shots because a three-point shot is worth more than a two-point shot is a highly ambiguous statement.

1. Introduction

We are currently living in the age of analytics in professional sports, with a strong trend of their use developing in professional basketball. Indeed, perhaps one of the most discussed results to come out of the analytics era thus far is the claim that teams should shoot as many three-point shots as possible, largely because three-point shots are worth more than two-point shots, and this somehow is indicative of a very efficient offense. These ideas were mentioned, for example, by Alex Rucker, who said “When you ask coaches what’s better between a 28 percent three-point shot and a 42 percent midrange shot, they’ll say the 42 percent shot. And that’s objectively false. It’s wrong. If LeBron James just jacked a three on every single possession, that’d be an exceptionally good offense. That’s a conversation we’ve had with our coaching staff, and let’s just say they don’t support that approach.” It was also claimed in the same article that “The analytics team is unanimous, and rather emphatic, that every team should shoot more 3s including the Raptors and even the Rockets, who are on pace to break the NBA record for most 3-point attempts in a season.” These assertions were repeated here. In an article by John Schuhmann, it was claimed that “It’s simple math. A made three is worth 1.5 times a made two. So you don’t have to be a great 3-point shooter to make those shots worth a lot more than a jumper from inside the arc. In fact, if you’re not shooting a layup, you might as well be beyond the 3-point line. Last season, the league made 39.4 percent of shots between the restricted area and the arc, for a value of 0.79 points per shot. It made 36.0 percent of threes, for a value of 1.08 points per shot.” The purpose of this paper is to determine whether an offensive strategy that involves predominantly shooting three-point shots is stable and optimal.
We will employ a game-theoretical approach using techniques from dynamical systems theory to show that taking more three point shots to a point where an offensive strategy is dependent on predominantly shooting threes is not necessarily optimal, and depends on a combination of payoff constraints, where one can establish conditions via the global stability of equilibrium points in addition to Nash equilibria where a predominant two-point offensive strategy would be optimal as well. (Article research and other statistics provided by: Hargun Singh Kohli)

2. The Dynamical Equations

For our model, we consider two types of NBA teams. The first type are teams that employ two point shots as the predominant part of their offensive strategy, while the other type consists of teams that employ three-point shots as the predominant part of their offensive strategy. There are therefore two predominant strategies, which we will denote as {s_{1}, s_{2}}, such that we define

\displaystyle \mathbf{S} = \left\{s_{1}, s_{2}\right\}. \ \ \ \ \ (1)

We then let {n_{i}} represent the number of teams using {s_{i}}, such that the total number of teams in the league is given by

\displaystyle N = \sum_{i =1}^{k} n_{i}, \ \ \ \ \ (2)

which implies that the proportion of teams using strategy {s_{i}} is given by

\displaystyle x_i = \frac{n_{i}}{N}. \ \ \ \ \ (3)

The state of the population of teams is then represented by {\mathbf{x} = (x_{1}, \ldots, x_{k})}. It can be shown that the proportions of individuals using a certain strategy change in time according to the following dynamical system

\displaystyle \dot{x}_{i} = x_{i}\left[\pi(s_{i}, \mathbf{x}) - \bar{\pi}(\mathbf{x})\right], \ \ \ \ \ (4)

subject to

\displaystyle \sum_{i =1}^{k} x_{i} = 1, \ \ \ \ \ (5)

where we have defined the average payoff function as

\displaystyle \bar{\pi}(\mathbf{x}) = \sum_{i=1}^{k} x_{i} \pi(s_{i}, \mathbf{x}). \ \ \ \ \ (6)

Now, let {x_{1}} represent the proportion of teams that predominantly shoot two-point shots, and let {x_{2}} represent the proportion of teams that predominantly shoot three-point shots. Further, we denote the game action set by {A = \left\{T, Th\right\}}, where {T} represents a predominant two-point shot strategy, and {Th} represents a predominant three-point shot strategy. As such, we assign the following payoffs:

\displaystyle \pi(T,T) = \alpha, \quad \pi(T,Th) = \beta, \quad \pi(Th, T) = \gamma, \quad \pi(Th,Th) = \delta. \ \ \ \ \ (7)

We therefore have that

\displaystyle \pi(T,\mathbf{x}) = \alpha x_{1} + \beta x_{2}, \quad \pi(Th, \mathbf{x}) = \gamma x_{1} + \delta x_{2}. \ \ \ \ \ (8)

From (6), we further have that

\displaystyle \bar{\pi}(\mathbf{x}) = x_{1} \left( \alpha x_{1} + \beta x_{2}\right) + x_{2} \left(\gamma x_{1} + \delta x_{2}\right). \ \ \ \ \ (9)

From Eq. (4) the dynamical system is then given by

\boxed{\dot{x}_{1} = x_{1} \left\{ \left(\alpha x_{1} + \beta x_{2} \right) - x_{1} \left( \alpha x_{1} + \beta x_{2}\right) - x_{2} \left(\gamma x_{1} + \delta x_{2}\right) \right\}},

\boxed{\dot{x}_{2} = x_{2} \left\{ \left( \gamma x_{1} + \delta x_{2}\right) -x_{1} \left( \alpha x_{1} + \beta x_{2}\right) - x_{2} \left(\gamma x_{1} + \delta x_{2}\right) \right\}},

subject to the constraint

\displaystyle x_{1} + x_{2} = 1. \ \ \ \ \ (10)

Indeed, because of the constraint (10), the dynamical system is actually one-dimensional, which we write in terms of {x_{1}} as

\displaystyle \boxed{\dot{x}_{1} = x_{1} \left(-1 + x_{1}\right) \left[\delta + \beta \left(-1 + x_{1}\right) - \delta x_{1} + \left(\gamma-\alpha\right)x_{1}\right]}. \ \ \ \ \ (11)
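As a quick numerical sanity check of the reduction to Eq. (11), the Python sketch below compares the right-hand side of the full system (with {x_{2} = 1 - x_{1}}) against Eq. (11) on a grid, and then integrates Eq. (11) with a simple forward Euler scheme. The payoff values, step size, and function names are assumptions made only for illustration.

```python
def full_rhs(x1, alpha, beta, gamma, delta):
    """x1-equation of the replicator system (4), using x2 = 1 - x1."""
    x2 = 1.0 - x1
    pi_T = alpha * x1 + beta * x2       # payoff to T, Eq. (8)
    pi_Th = gamma * x1 + delta * x2     # payoff to Th, Eq. (8)
    average = x1 * pi_T + x2 * pi_Th    # average payoff, Eq. (9)
    return x1 * (pi_T - average)

def reduced_rhs(x1, alpha, beta, gamma, delta):
    """Right-hand side of the one-dimensional system, Eq. (11)."""
    return x1 * (-1 + x1) * (
        delta + beta * (-1 + x1) - delta * x1 + (gamma - alpha) * x1)

def simulate(x1, params, dt=0.01, steps=5000):
    """Forward Euler integration of Eq. (11) from the initial proportion x1."""
    for _ in range(steps):
        x1 += dt * reduced_rhs(x1, *params)
    return x1

# Hypothetical payoffs (alpha, beta, gamma, delta) with delta < beta, gamma > alpha.
params = (1.0, 3.0, 2.0, 1.0)

max_gap = max(abs(full_rhs(i / 100, *params) - reduced_rhs(i / 100, *params))
              for i in range(101))
print("largest difference between the two forms:", max_gap)
print("long-run x1 from 0.1:", simulate(0.1, params))
print("long-run x1 from 0.9:", simulate(0.9, params))
```

For these particular payoffs, both initial conditions settle at the same interior proportion of two-point teams, so the league stays mixed.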

From Eq. (11), we immediately notice some things of importance. First, we are able to deduce just from the form of the equation what the invariant sets are. We note that for a dynamical system {\mathbf{x}' = \mathbf{f(x)} \in \mathbf{R^{n}}} with flow {\phi_{t}}, if we define a {C^{1}} function {Z: \mathbf{R}^{n} \rightarrow \mathbf{R}} such that {Z' = \alpha Z}, where {\alpha: \mathbf{R}^{n} \rightarrow \mathbf{R}}, then, the subsets of {\mathbf{R}^{n}} defined by {Z > 0, Z = 0}, and {Z < 0} are invariant sets of the flow {\phi_{t}}. Applying this notion to Eq. (11), one immediately sees that {x_1 > 0}, {x_1 = 0}, and {x_1 < 0} are invariant sets of the corresponding flow. Further, there also exists a symmetry such that {x_{1} \rightarrow -x_{1}}, which implies that without loss of generality, we can restrict our attention to {x_{1} \geq 0}.

3. Fixed-Points Analysis

With the dynamical system in hand, we are now in a position to perform a fixed-points analysis. There are precisely three fixed points, which are invariant manifolds and are given by:

\displaystyle P_{1}: x_{1}^{*} = 0, \quad P_{2}: x_{1}^{*} = 1, \quad P_{3}: x_{1}^{*} = \frac{\beta - \delta}{-\alpha + \beta - \delta + \gamma}. \ \ \ \ \ (12)

Note that {P_{3}} actually contains {P_{1}} and {P_{2}} as special cases. Namely, when {\beta = \delta}, {P_{3} = 0 = P_{1}}, and when {\alpha = \gamma}, {P_{3} = 1 = P_{2}}. We will therefore just analyze the stability of {P_{3}}. {P_{3} = 0} represents a state of the population where all teams predominantly shoot three-point shots. Similarly, {P_{3} = 1} represents a state of the population where all teams predominantly shoot two-point shots. We additionally restrict

\displaystyle 0 \leq P_{3} \leq 1 \Rightarrow 0 \leq \frac{\beta - \delta}{-\alpha + \beta - \delta + \gamma} \leq 1, \ \ \ \ \ (13)

which implies the following conditions on the payoffs:

\displaystyle \left[\delta < \beta \cap \gamma \leq \alpha \right] \cup \left[\delta = \beta \cap \left(\gamma < \alpha \cup \gamma > \alpha \right) \right] \cup \left[\delta > \beta \cap \gamma \leq \alpha \right]. \ \ \ \ \ (14)

With respect to a stability analysis of {P_{3}}, we note the following. The point {P_{3}} is a:

  • Local sink if: {\{\delta < \beta\} \cap \{\gamma > \alpha\}},
  • Source if: {\{\delta > \beta\} \cap \{\gamma < \alpha\}},
  • Saddle if: {\{\delta = \beta \} \cap (\gamma < \alpha -\beta + \delta \cup \gamma > \alpha - \beta + \delta)}, or {(\{\delta < \beta\} \cup \{\delta > \beta\}) \cap \gamma = \frac{\alpha \delta - \alpha \beta}{\delta - \beta}}.
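For a given set of payoffs, the fixed points in Eq. (12) and the sign of the slope of the right-hand side of Eq. (11) at each of them can be checked numerically. The Python sketch below uses a plain one-dimensional linearization (negative slope means nearby states are attracted, positive slope means they are repelled); the payoff values, tolerance, and function names are assumptions for illustration and do not reproduce the full case analysis above.

```python
def rhs(x1, alpha, beta, gamma, delta):
    """Right-hand side of Eq. (11)."""
    return x1 * (-1 + x1) * (
        delta + beta * (-1 + x1) - delta * x1 + (gamma - alpha) * x1)

def interior_fixed_point(alpha, beta, gamma, delta):
    """P3 from Eq. (12); returns None when the denominator vanishes."""
    denominator = -alpha + beta - delta + gamma
    if denominator == 0:
        return None
    return (beta - delta) / denominator

def slope(x, params, h=1e-6):
    """Central-difference estimate of d(rhs)/dx1 at x."""
    return (rhs(x + h, *params) - rhs(x - h, *params)) / (2 * h)

def classify(x, params, tol=1e-6):
    """Label a fixed point by the sign of the linearized slope."""
    s = slope(x, params)
    if s < -tol:
        return "sink"
    if s > tol:
        return "source"
    return "degenerate"

# Hypothetical payoffs (alpha, beta, gamma, delta) with delta < beta, gamma > alpha.
params = (1.0, 3.0, 2.0, 1.0)
p3 = interior_fixed_point(*params)
for name, x in [("P1", 0.0), ("P2", 1.0), ("P3", p3)]:
    print(name, x, classify(x, params))
```

With these payoffs the interior point is attracting while both endpoints repel, which matches the local sink condition listed above.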

This last calculation shows that the condition {\delta = \beta}, which always corresponds to the point {x_{1}^{*} = 0} (a dominant 3-point strategy), always exists as a saddle point! That is, there will NEVER be a league that dominantly adopts a three-point strategy. At best, some teams will go towards a 3-point strategy and others will not, irrespective of what the analytics people say. This also shows that a team's basketball strategy really should depend on its respective payoffs, and not on current "trends". This behaviour is displayed in the following plot.

Note the saddle point (x1,x2) = (0,1). This clearly shows that all NBA teams will never adopt a dominant 3-point strategy, as it is always more optimal to play to maximize payoffs.

Further, the system exhibits some bifurcations as well. In the neighbourhood of {P_{3} = 0}, the linearized system takes the form

\displaystyle x_{1}' = \left(\beta - \delta\right) x_{1}. \ \ \ \ \ (15)

Therefore, {P_{3} = 0} destabilizes the system at {\beta = \delta}. Similarly, {P_{3} = 1} destabilizes the system at {\gamma = \alpha}. Therefore, bifurcations of the system occur on the lines {\gamma = \alpha} and {\beta = \delta} in the four-dimensional parameter space.

4. Global Stability and The Existence of Nash Equilibria

With the preceding fixed-points analysis completed, we are now interested in determining global stability conditions. The main motivation is to determine the existence of any Nash equilibria that occur for this game via the following theorem: If {\mathbf{x}^{*}} is an asymptotically stable fixed point, then the symmetric strategy pair {[\sigma^{*}, \sigma^{*}]}, with {\sigma^{*} = \mathbf{x}^*}, is a Nash equilibrium. We will primarily make use of the monotonicity principle, which states the following. Let {\phi_{t}} be a flow on {\mathbb{R}^{n}} with {S} an invariant set. Let {Z: S \rightarrow \mathbb{R}} be a {C^{1}} function whose range is the interval {(a,b)}, where {a \in \mathbb{R} \cup \{-\infty\}, b \in \mathbb{R} \cup \{\infty\}}, and {a < b}. If {Z} is decreasing on orbits in {S}, then for all {\mathbf{x} \in S},

\boxed{\omega(\mathbf{x}) \subseteq \left\{\mathbf{s} \in \partial S | \lim_{\mathbf{y} \rightarrow \mathbf{s}} Z(\mathbf{y}) \neq \mathbf{b}\right\}},

\boxed{ \alpha(\mathbf{x}) \subseteq \left\{\mathbf{s} \in \partial S | \lim_{\mathbf{y} \rightarrow \mathbf{s}} Z(\mathbf{y}) \neq \mathbf{a}\right\}}.

Consider the function

\displaystyle Z_{1} = \log \left(1 - x_{1}\right). \ \ \ \ \ (16)

Then, we have that

\displaystyle \dot{Z}_{1}= x_{1} \left[\delta + \beta \left(-1 + x_{1}\right) - \delta x_{1} + x_{1} \left(\gamma - \alpha\right)\right]. \ \ \ \ \ (17)

For the invariant set {S_1 = \{0 < x_{1} < 1\}}, we have that {\partial S_{1} = \{x_{1} = 0\} \cup \{x_{1} = 1\}}. One can then immediately see that in {S_{1}},

\displaystyle \dot{Z}_{1} < 0 \Leftrightarrow \left\{\beta > \delta\right\} \cap \left\{\alpha \geq \gamma\right\}. \ \ \ \ \ (18)

Therefore, by the monotonicity principle,

\displaystyle \omega(\mathbf{x}) \subseteq \left\{\mathbf{x}: x_{1} = 1 \right\}. \ \ \ \ \ (19)

Note that the conditions {\beta > \delta} and {\alpha \geq \gamma} correspond to {P_{3}} above. In particular, for {\alpha = \gamma}, {P_{3} = 1}, which implies that {x_{1}^{*} = 1} is globally stable. Therefore, under these conditions, the symmetric strategy {[1,1]} is a Nash equilibrium. Now, consider the function

\displaystyle Z_{2} = \log \left(x_{1}\right). \ \ \ \ \ (20)

We can therefore see that

\displaystyle \dot{Z}_{2} = \left[-1 + x_{1}\right] \left[\delta + \beta\left(-1+x_{1}\right) - \delta x_{1} + \left(-\alpha + \gamma\right) x_{1}\right]. \ \ \ \ \ (21)

Clearly, {\dot{Z}_{2} < 0} in {S_{1}} if for example {\beta = \delta} and {\alpha < \gamma}. Then, by the monotonicity principle, we obtain that

\displaystyle \omega(\mathbf{x}) \subseteq \left\{\mathbf{x}: x_{1} = 0 \right\}. \ \ \ \ \ (22)

Note that the conditions {\beta = \delta} and {\alpha < \gamma} correspond to {P_{3}} above. In particular, for {\beta = \delta}, {P_{3} = 0}, which implies that {x_{1}^{*} = 0} is globally stable. Therefore, under these conditions, the symmetric strategy {[0,0]} is a Nash equilibrium. In summary, we have just shown that for the specific case where {\beta > \delta} and {\alpha = \gamma}, the strategy {[1,1]} is a Nash equilibrium. On the other hand, for the specific case where {\beta = \delta} and {\alpha < \gamma}, the strategy {[0,0]} is a Nash equilibrium.

5. Discussion

In the previous section, which describes global results, we first concluded that for the case where {\beta > \delta} and {\alpha = \gamma}, the strategy {[1,1]} is a Nash equilibrium. The relevance of this is as follows. The condition on the payoffs thus requires that

\displaystyle \pi(T,T) = \pi(Th,T), \quad \pi(T,Th) > \pi(Th,Th). \ \ \ \ \ (23)

That is, given the strategy adopted by the other team, neither team could increase their payoff by adopting another strategy if and only if the condition in (23) is satisfied. Given these conditions, if one team has a predominant two-point strategy, it would be the other team’s best response to also use a predominant two-point strategy. We also concluded that for the case where {\beta = \delta} and {\alpha < \gamma}, the strategy {[0,0]} is a Nash equilibrium. The relevance of this is as follows. The condition on the payoffs thus requires that

\displaystyle \pi(T,Th) = \pi(Th,Th), \quad \pi(T,T) < \pi(Th,T). \ \ \ \ \ (24)

That is, given the strategy adopted by the other team, neither team could increase their payoff by adopting another strategy if and only if the condition in (24) is satisfied. Given these conditions, if one team has a predominant three-point strategy, it would be the other team’s best response to also use a predominant three-point strategy.

Further, we also showed that {x_{1} = 1} is globally stable under the conditions in (23). That is, if these conditions hold, every team in the NBA will eventually adopt an offensive strategy predominantly consisting of two-point shots. The conditions in (24) were shown to imply that the point {x_{1} = 0} is globally stable. This means that if these conditions now hold, every team in the NBA will eventually adopt an offensive strategy predominantly consisting of three-point shots.

We also provided, through a careful stability analysis of the fixed points, criteria for the local stability of strategies. For example, we showed that a predominant three-point strategy is locally stable if {\pi(T,Th) - \pi(Th,Th) < 0}, while it is unstable if {\pi(T,Th) - \pi(Th,Th) \geq 0}. In addition, a predominant two-point strategy was found to be locally stable when {\pi(Th,T) - \pi(T,T) < 0}, and unstable when {\pi(Th,T) - \pi(T,T) \geq 0}. There is also the key point of which one of these strategies has the highest probability of being executed. We know that

\displaystyle \pi(\sigma,\mathbf{x}) = \sum_{s \in \mathbf{S}} \sum_{s' \in \mathbf{S}} p(s) x(s') \pi(s,s'). \ \ \ \ \ (25)

That is, the payoff to a team using strategy {\sigma} in a league with profile {\mathbf{x}} is proportional to the probability of this team using strategy {s \in \mathbf{S}}. We therefore see that a team’s optimal strategy would be that for which they could maximize their payoff, that is, for which {p(s)} is a maximum, while keeping in mind the strategy of the other team, hence, the existence of Nash equilibria. Hopefully, this work also shows that the concept that teams should attempt more three-point shots because a three-point shot is worth more than a two-point shot is a highly ambiguous statement. In actuality, one needs to analyze what offensive strategy is optimal which is constrained by a particular set of payoffs.
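To see how Eq. (25) works in practice, the Python sketch below evaluates the payoff of a mixed strategy against a given league profile. The payoff numbers, the league profile, and the function name are hypothetical and only serve to illustrate the double sum.

```python
def mixed_payoff(p, x, payoff):
    """Eq. (25): sum over s and s' of p(s) * x(s') * pi(s, s')."""
    return sum(p[s] * x[t] * payoff[(s, t)] for s in p for t in x)

# Hypothetical payoffs: alpha, beta, gamma, delta as in Eq. (7).
payoff = {("T", "T"): 1.0, ("T", "Th"): 3.0,
          ("Th", "T"): 2.0, ("Th", "Th"): 1.0}

# Hypothetical league profile: 25% two-point teams, 75% three-point teams.
league = {"T": 0.25, "Th": 0.75}

pure_two = mixed_payoff({"T": 1.0, "Th": 0.0}, league, payoff)    # matches Eq. (8) for T
pure_three = mixed_payoff({"T": 0.0, "Th": 1.0}, league, payoff)  # matches Eq. (8) for Th
half_half = mixed_payoff({"T": 0.5, "Th": 0.5}, league, payoff)
league_average = mixed_payoff(league, league, payoff)             # matches Eq. (9)

print(pure_two, pure_three, half_half, league_average)
```

In this made-up league a two-point strategy earns more than a three-point one, which illustrates that the better choice depends on the payoffs and on what the rest of the league is doing.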

An Analysis of The 2015 NBA Finals Matchup

The NBA finals are exactly five days away, and I wanted to present an analysis breaking down the matchup between The Golden State Warriors and Cleveland Cavaliers.

I used machine and statistical learning techniques to generate the most probable scenarios for the outcome of each game, and this is what I found.

GSWCLEscenarios

Note that the probabilities listed above are not the probabilities for a team to win a specific game, they are the probabilities of a specific scenario occurring. Also, multiple scenarios can occur in a single game, so the probability of multiple scenarios occurring would be the sum of the individual ones. 

The Model Results So Far (Updated: June 11, 2015)

Game 1: Scenario Outcomes: 1 and 2 – GSW win

Game 2: Scenario Outcome: 9 – CLE win

Game 3: Scenario Outcomes: 5, 8 – CLE win

Thoughts so far: Despite GSW being down right now 2-1, I still believe that Cleveland’s wins were statistical anomalies. According to our model, Cleveland’s Game 2 and Game 3 wins only had 1.07%, 9.34%, and 1.765% chances of occurring in this series, whereas the GSW Game 1 win had a 44% chance of occurring in this series.

Game 4: Scenario Outcome: 2 – GSW win

Updated: June 14, 2015

Game 5: Scenario Outcomes: 1,2 – GSW win

Thoughts: All of GSW’s wins have been the dominant scenarios in this series, i.e., Outcomes 1 and 2. All of CLE’s wins in this series have been statistical anomalies/outliers. This pattern continued in Game 5.

Updated: June 17, 2015

Game 6: Scenario Outcomes: 1,2 – GSW win

Another GSW win through the dominant scenarios in the series, as expected. 

Three-Point Shooting Teams and The 2014-2015 NBA Playoffs

Major Update: June 22, 2015.  I have now published a formal article on the arXiv proving many of the assertions made earlier in this blog post. It can be found here: http://arxiv.org/abs/1506.06687

Some controversy was stirred up today when Knicks President and Basketball coaching legend Phil Jackson made the following tweets regarding three-point shooting teams not doing so well in the second round of the playoffs:

NBA Scores Predictions – April 11, 2015

I am testing out a new algorithm that I have been developing over the past few months that attempts to predict the outcome of sports games, in particular, NBA games. I am taking it out for a “Test Run” today. Here is what I predict:

aprl112015predictions

Probabilities in principle are not too difficult to predict assuming you have the correct algorithm! What is more challenging is trying to predict the scores. Here is my prediction for the individual game outcomes:

Team 1   Team 2   Point Difference   p1      p2      Result
MIA      TOR      2.565              0.420   0.580   TOR
CHI      PHI      6.461              0.701   0.299   CHI
NYK      ORL      0.850              0.467   0.533   ORL
MIN      GSW      8.652              0.225   0.775   GSW
MEM      LAC      2.008              0.432   0.568   LAC
UTA      POR      4.115              0.342   0.658   POR

Note: p1 and p2 denote probabilities of each team winning.