# Category: Data Science

## Let’s not go overboard with this Trump stuff!

It has certainly become the talk of the town with *some* of the latest polls showing that Donald Trump is leading Hillary Clinton in a hypothetical 2016 matchup.

I decided to run my polling algorithm to simulate 100,000 election matchups between Clinton and Trump. I calibrated my model using a variety of data sources.
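For readers who want to see what a simulation of this kind looks like in practice, here is a minimal Monte Carlo sketch. It is not my actual polling algorithm: it assumes the polling margin between two candidates is normally distributed, and the function name `simulate_matchups`, the 2-point lead and the 3-point uncertainty are all hypothetical values chosen for illustration.

```python
import random


def simulate_matchups(mean_margin, margin_sd, n_sims=100_000, seed=0):
    """Estimate the share of simulated elections won by candidate A.

    mean_margin and margin_sd describe an ASSUMED normal distribution for
    A's margin over B (in vote-share units); they are illustrative inputs,
    not calibrated values.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # Draw one simulated election margin; A wins if it is positive.
        if rng.gauss(mean_margin, margin_sd) > 0:
            wins += 1
    return wins / n_sims


# Hypothetical inputs: A leads by 2 points with a 3-point standard deviation.
win_share = simulate_matchups(mean_margin=0.02, margin_sd=0.03)
print(f"Candidate A wins {win_share:.1%} of simulated matchups")
```

The point of running many simulated matchups rather than reading one poll is that a small lead with a comparable uncertainty still leaves the trailing candidate winning a sizeable share of the simulations.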

These were the results:

Based on these simulations, I conclude that:

I think in the era of the 24-hour news cycle, too much is made of one poll.

## Hillary Clinton Still Has the Best Chance of Being The Democratic Party Nominee in 2016

A great deal of noise has been made in recent weeks about the surge in the polls of Donald Trump and Bernie Sanders. This has led some people to question whether Hillary Clinton will actually end up being the Democratic party nominee in 2016. The question has been fueled further by the fact that Sanders is now leading Clinton in the latest New Hampshire polls.

However, after running an analysis on current polling data, I believe that, even though it is very early, Hillary Clinton still has the best chance of being the Democratic party nominee. In fact, running some algorithms against the current data, I found that:

**Hillary Clinton: chance of winning Democratic nomination.**

**Bernie Sanders: chance of winning Democratic nomination.**

These numbers were deduced from an algorithm that used non-parametric methods to obtain the following probability density functions.
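As an illustration of what a non-parametric density estimate involves, here is a minimal sketch of a Gaussian kernel density estimator. It is one common non-parametric method, not necessarily the one used for the numbers above. The function name `gaussian_kde`, the normal-reference bandwidth rule and the poll shares are all assumptions made for the example.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a density function estimated from samples with Gaussian kernels."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumed default, not a tuned value.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump centred on each observed sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical poll shares for one candidate (illustrative only).
poll_shares = [0.52, 0.55, 0.49, 0.58, 0.51, 0.54]
density = gaussian_kde(poll_shares)
print(density(0.53))
```

No distributional form is imposed here beyond the choice of kernel and bandwidth; the shape of the density comes from the data.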

Thanks to Hargun Singh Kohli for data compilation and research.

## The “Evolution” of the 3-Point Shot in The NBA

The purpose of this post is to determine whether an offensive strategy that relies predominantly on three-point shots is stable and optimal. We employ a game-theoretical approach using techniques from dynamical systems theory to show that taking more three-point shots, to the point where an offensive strategy depends predominantly on shooting threes, is not necessarily optimal. Whether it is optimal depends on a combination of payoff constraints: using the global stability of equilibrium points together with Nash equilibria, one can establish conditions under which a predominant two-point offensive strategy would be optimal as well. We perform a detailed fixed-points analysis to establish the local stability of a given offensive strategy. Finally, we prove the existence of Nash equilibria via global stability techniques based on the monotonicity principle. We believe this work demonstrates that the claim that teams should attempt more three-point shots because a three-point shot is worth more than a two-point shot is, therefore, highly ambiguous.

### 1. Introduction

We are currently living in the age of analytics in professional sports, with a strong trend towards their use developing in professional basketball. Indeed, perhaps one of the most discussed results to come out of the analytics era thus far is the claim that teams should shoot as many three-point shots as possible, largely because three-point shots are worth more than two-point shots, and this is somehow indicative of a very efficient offense.

These ideas were mentioned, for example, by Alex Rucker, who said “When you ask coaches what’s better between a 28 percent three-point shot and a 42 percent midrange shot, they’ll say the 42 percent shot. And that’s objectively false. It’s wrong. If LeBron James just jacked a three on every single possession, that’d be an exceptionally good offense. That’s a conversation we’ve had with our coaching staff, and let’s just say they don’t support that approach.” It was also claimed in the same article that “The analytics team is unanimous, and rather emphatic, that every team should shoot more 3s including the Raptors and even the Rockets, who are on pace to break the NBA record for most 3-point attempts in a season.” These assertions have been repeated elsewhere. In an article by John Schuhmann, it was claimed that “It’s simple math. A made three is worth 1.5 times a made two. So you don’t have to be a great 3-point shooter to make those shots worth a lot more than a jumper from inside the arc. In fact, if you’re not shooting a layup, you might as well be beyond the 3-point line. Last season, the league made 39.4 percent of shots between the restricted area and the arc, for a value of 0.79 points per shot. It made 36.0 percent of threes, for a value of 1.08 points per shot.”

The purpose of this paper is to determine whether an offensive strategy that relies predominantly on three-point shots is stable and optimal. We will employ a game-theoretical approach using techniques from dynamical systems theory to show that taking more three-point shots, to the point where an offensive strategy depends predominantly on shooting threes, is not necessarily optimal. Whether it is optimal depends on a combination of payoff constraints: using the global stability of equilibrium points together with Nash equilibria, one can establish conditions under which a predominant two-point offensive strategy would be optimal as well. *(Article research and other statistics provided by: Hargun Singh Kohli)*
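The “simple math” behind the quoted figures is just expected points per shot, that is, the make rate multiplied by the value of the shot. The short sketch below reproduces the numbers quoted above. The helper name `points_per_shot` is ours, and the calculation ignores fouls, rebounds and turnovers.

```python
def points_per_shot(make_rate, shot_value):
    """Expected points from one attempt: probability of a make times its value."""
    return make_rate * shot_value


# League-wide figures quoted by Schuhmann.
midrange = points_per_shot(0.394, 2)
threes = points_per_shot(0.360, 3)
print(round(midrange, 2), round(threes, 2))

# The two shots in Rucker's example.
rucker_three = points_per_shot(0.28, 3)
rucker_midrange = points_per_shot(0.42, 2)
print(round(rucker_three, 2), round(rucker_midrange, 2))
```

This is a per-shot calculation only. It says nothing about how the payoff of a strategy changes once other teams respond to it, which is the question the rest of this paper addresses.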

### 2. The Dynamical Equations

For our model, we consider two types of NBA teams. The first type are teams that employ two point shots as the predominant part of their offensive strategy, while the other type consists of teams that employ three-point shots as the predominant part of their offensive strategy. There are therefore two predominant strategies, which we will denote as , such that we define

We then let represent the number of teams using , such that the total number of teams in the league is given by

which implies that the proportion of teams using strategy is given by

The state of the population of teams is then represented by . It can be shown that the proportions of individuals using a certain strategy change in time according to the following dynamical system

subject to

where we have defined the average payoff function as

Now, let represent the proportion of teams that predominantly shoot two-point shots, and let represent the proportion of teams that predominantly shoot three-point shots. Further, we denote the game action set by , where represents a predominant two-point shot strategy, and represents a predominant three-point shot strategy. We then assign the following payoffs:

We therefore have that

From (6), we further have that

From Eq. (4) the dynamical system is then given by

,

,

subject to the constraint

Indeed, because of the constraint (10), the dynamical system is actually one-dimensional, which we write in terms of as

From Eq. (11), we immediately notice some things of importance. First, we are able to deduce just from the form of the equation what the invariant sets are. We note that for a dynamical system with flow , if we define a function such that , where , then, the subsets of defined by , and are invariant sets of the flow . Applying this notion to Eq. (11), one immediately sees that , , and are invariant sets of the corresponding flow. Further, there also exists a symmetry such that , which implies that without loss of generality, we can restrict our attention to .
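To make the one-dimensional dynamics concrete, here is a minimal numerical sketch of two-strategy replicator dynamics. It rests on assumptions that are ours for illustration: `x` is the proportion of teams using the two-point strategy, and the payoff labels `a`, `b`, `c`, `d` stand for the payoff to a two-point team against a two-point team, a two-point team against a three-point team, a three-point team against a two-point team, and a three-point team against a three-point team, respectively. The numerical payoffs are hypothetical, and the integrator is a plain forward-Euler scheme.

```python
def replicator_rhs(x, a, b, c, d):
    """dx/dt for the share x of two-point teams (assumed payoff labels a..d)."""
    payoff_two = a * x + b * (1.0 - x)    # payoff to a two-point team
    payoff_three = c * x + d * (1.0 - x)  # payoff to a three-point team
    average = x * payoff_two + (1.0 - x) * payoff_three
    # A strategy grows when it earns more than the league average.
    return x * (payoff_two - average)


def evolve(x0, payoffs, dt=0.01, steps=20_000):
    """Forward-Euler integration, keeping x inside the invariant interval [0, 1]."""
    x = x0
    for _ in range(steps):
        x += dt * replicator_rhs(x, *payoffs)
        x = min(1.0, max(0.0, x))
    return x


# Hypothetical payoff sets (a, b, c, d), chosen only to show both outcomes.
two_point_favoured = (3.0, 2.0, 1.0, 1.0)
three_point_favoured = (1.0, 1.0, 3.0, 2.0)

print(evolve(0.1, two_point_favoured))    # share of two-point teams grows
print(evolve(0.9, three_point_favoured))  # share of two-point teams shrinks
```

In this sketch the long-run composition of the league is decided entirely by the payoffs, not by the starting proportions, and the endpoints `x = 0` and `x = 1` are never left once reached, which is the invariance property noted above.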

### 3. Fixed-Points Analysis

With the dynamical system in hand, we are now in a position to perform a fixed-points analysis. There are precisely three fixed points, which are invariant manifolds and are given by:

Note that actually contains and as special cases. Namely, when , , and when , . We will therefore just analyze the stability of . represents a state of the population where all teams predominantly shoot three-point shots. Similarly, represents a state of the population where all teams predominantly shoot two-point shots. We additionally restrict

which implies the following conditions on the payoffs:

With respect to a stability analysis of , we note the following. The point is a: • Local sink if: , • Source if: , • Saddle: if: , or .

What this last calculation shows is that the condition which always corresponds to the point , which corresponds to a dominant 3-point strategy always exists as a saddle point! That is, there will NEVER be a league that dominantly adopts a three-point strategy. At best, some teams will move towards a 3-point strategy and others will not, irrespective of what the analytics people say. This also shows that a team's basketball strategy really should depend on its respective payoffs, and not on current "trends". This behaviour is displayed in the following plot.

Further, the system exhibits some bifurcations as well. In the neighbourhood of , the linearized system takes the form

Therefore, destabilizes the system at . Similarly, destabilizes the system at . Therefore, bifurcations of the system occur on the lines and in the four-dimensional parameter space.
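The role of linearization can be illustrated with a short sketch. It assumes a one-dimensional reduction of two-strategy replicator dynamics in which `x` is the share of two-point teams and `a`, `b`, `c`, `d` are hypothetical payoff labels (two-point vs. two-point, two-point vs. three-point, three-point vs. two-point, three-point vs. three-point). These labels are ours for illustration. In one dimension only the sign of a single slope matters, so a fixed point is classified here as a sink or a source; saddles arise only in the higher-dimensional form of the system.

```python
def rhs(x, a, b, c, d):
    """Assumed 1-D replicator right-hand side for the share x of two-point teams."""
    return x * (1.0 - x) * ((a - c) * x + (b - d) * (1.0 - x))


def classify(slope):
    """Classify a fixed point of a 1-D flow from the slope of rhs at that point."""
    if slope < 0:
        return "sink"
    if slope > 0:
        return "source"
    return "non-hyperbolic (candidate bifurcation)"


def endpoint_stability(a, b, c, d):
    """Linearize at the two endpoints of the assumed 1-D system."""
    return {
        # x = 1: every team two-point; slope of rhs there is c - a.
        "all_two_point": classify(c - a),
        # x = 0: every team three-point; slope of rhs there is b - d.
        "all_three_point": classify(b - d),
    }


print(endpoint_stability(3.0, 2.0, 1.0, 1.0))  # hypothetical payoffs
print(endpoint_stability(1.0, 2.0, 1.0, 2.0))  # slopes vanish: a = c and b = d
```

In this reduced sketch the slopes change sign exactly when `a = c` or `b = d`, which is how lines in a four-dimensional payoff space can separate qualitatively different behaviour.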

### 4. Global Stability and The Existence of Nash Equilibria

With the preceding fixed-points analysis completed, we are now interested in determining global stability conditions. The main motivation is to determine the existence of any Nash equilibria that occur for this game via the following theorem: If is an asymptotically stable fixed point, then the symmetric strategy pair , with is a Nash equilibrium. We will primarily make use of the monotonicity principle, which states the following. Let be a flow on with an invariant set. Let be a function whose range is the interval , where , and . If is decreasing on orbits in , then for all ,

,

.

Consider the function

Then, we have that

For the invariant set , we have that . One can then immediately see that in ,

Therefore, by the monotonicity principle,

Note that the conditions and correspond to above. In particular, for , , which implies that is globally stable. Therefore, under these conditions, the symmetric strategy is a Nash equilibrium. Now, consider the function

We can therefore see that

Clearly, in if for example and . Then, by the monotonicity principle, we obtain that

Note that the conditions and correspond to above. In particular, for , , which implies that is globally stable. Therefore, under these conditions, the symmetric strategy is a Nash equilibrium.

In summary, we have just shown that for the specific case where and , the strategy is a Nash equilibrium. On the other hand, for the specific case where and , the strategy is a Nash equilibrium.

### 5. Discussion

In the previous section, which describes global results, we first concluded that for the case where and , the strategy is a Nash equilibrium. The relevance of this is as follows. The condition on the payoffs thus requires that

That is, given the strategy adopted by the other team, neither team could increase their payoff by adopting another strategy if and only if the condition in (23) is satisfied. Given these conditions, if one team has a predominant two-point strategy, it would be the other team’s best response to also use a predominant two-point strategy. We also concluded that for the case where and , the strategy is a Nash equilibrium. The relevance of this is as follows. The condition on the payoffs thus requires that

That is, given the strategy adopted by the other team, neither team could increase their payoff by adopting another strategy if and only if the condition in (24) is satisfied. Given these conditions, if one team has a predominant three-point strategy, it would be the other team’s best response to also use a predominant three-point strategy.

Further, we also showed that is globally stable under the conditions in (23). That is, if these conditions hold, every team in the NBA will eventually adopt an offensive strategy predominantly consisting of two-point shots. The conditions in (24) were shown to imply that the point is globally stable. This means that if these conditions hold instead, every team in the NBA will eventually adopt an offensive strategy predominantly consisting of three-point shots.

We also provided, through a careful stability analysis of the fixed points, criteria for the local stability of strategies. For example, we showed that a predominant three-point strategy is locally stable if , while it is unstable if . In addition, a predominant two-point strategy was found to be locally stable when , and unstable when . There is also the key question of which of these strategies has the highest probability of being executed. We know that

That is, the payoff to a team using strategy in a league with profile is proportional to the probability of this team using strategy . We therefore see that a team’s optimal strategy would be that for which they could maximize their payoff, that is, for which is a maximum, while keeping in mind the strategy of the other team, hence, the existence of Nash equilibria. **Hopefully, this work also shows that the concept that teams should attempt more three-point shots because a three-point shot is worth more than a two-point shot is a highly ambiguous statement. In actuality, one needs to analyze what offensive strategy is optimal which is constrained by a particular set of payoffs.**
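The best-response reasoning in this discussion can be checked mechanically for any symmetric two-strategy game. The sketch below does so under assumed payoff labels that are ours, not taken from the equations above: `a` is the payoff to a two-point team facing a two-point team, `b` to a two-point team facing a three-point team, `c` to a three-point team facing a two-point team, and `d` to a three-point team facing a three-point team. The numerical payoffs are hypothetical.

```python
def symmetric_nash_equilibria(a, b, c, d):
    """List the symmetric pure-strategy Nash equilibria of the assumed game."""
    equilibria = []
    # Against a two-point opponent, staying two-point pays a; deviating pays c.
    if a >= c:
        equilibria.append(("two-point", "two-point"))
    # Against a three-point opponent, staying three-point pays d; deviating pays b.
    if d >= b:
        equilibria.append(("three-point", "three-point"))
    return equilibria


# Hypothetical payoff sets (a, b, c, d).
print(symmetric_nash_equilibria(3.0, 2.0, 1.0, 1.0))
print(symmetric_nash_equilibria(1.0, 1.0, 3.0, 2.0))
print(symmetric_nash_equilibria(3.0, 1.0, 1.0, 3.0))
```

The third case is the one worth pausing on: with those payoffs both symmetric profiles are equilibria, so which strategy is "optimal" depends on what the rest of the league is already doing.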

## Article on Three-Point Shooting in the Modern-Day NBA

Continuing the debate over the value of three-point shooting in today’s NBA, my article analyzing this issue from a mathematical perspective has now been published on the arXiv. Check it out!

## Some Thoughts On Howard Beck’s Bleacher Report Article

Howard Beck had an interesting article today on Bleacher Report, basically suggesting that the NBA finals, in particular, the current style of play embodied by The Golden State Warriors is somehow a vindication of D’Antoni’s basketball philosophies: “Shoot a lot of threes”, “Shoot in 7 seconds or less”, “Play small lineups”, etc…

While the Warriors have certainly embodied some of these philosophies, my personal opinion is that D’Antoni’s style of play can only be vindicated if there is a clear trend in *championship* teams that reflect these philosophies. As I show below, this is simply not the case.

I looked at the last 15 NBA Champions (from 2000-2014), and tried to see if there were any clear patterns in common between the teams. This is essentially what I found:

Two things that are immediately clear are:

1. There is very little that championship teams have in common!

2. The overwhelming thing that they do have in common is that 14 of the last 15 NBA champions have all been ranked in the Top 10 for Defensive Rating, something that Mike D’Antoni’s coaching philosophy has never really included throughout his years in Phoenix, New York, and Los Angeles.

**This, I believe, is the grand point that no one seems to be interested in making, perhaps because defensive-oriented basketball is, according to the “mainstream”, by definition “less flashy”. Yet it is still the overwhelming common characteristic amongst championship-winning teams.**

Perhaps, the Warriors will win this year, but as I said above, I do not believe that one year is anywhere near enough to establish a trend and a vindication of D’Antoni’s basketball philosophies.

Further, there were some other things in Beck’s article that I found to be a bit concerning:

He claimed *“Today, coaches speak enthusiastically about “positionless” basketball—whereas 10 years ago, D’Antoni had to sell Marion and Stoudemire on the concept.”*

This is not actually true. The triangle offense is the de facto example of “positionless” basketball, and has been around since the 1940s when Sam Barry introduced it at USC. Phil Jackson and Tex Winter’s Bulls and Lakers teams embodied the concept of positionless basketball. In fact, as can be seen from the diagram below (taken from http://khamel83.tripod.com/intro.htm), players don’t have set positions in the triangle offense. Rather, there are regions based on optimality and spacing:

Many examples can be found from teams playing in the triangle offense system of guards posting up, big men coming out to shoot threes, etc…

## An Analysis of The 2015 NBA Finals Matchup

The NBA finals are exactly five days away, and I wanted to present an analysis breaking down the matchup between The Golden State Warriors and Cleveland Cavaliers.

I used machine and statistical learning techniques to generate the most probable scenarios for the outcome of each game, and this is what I found.

Note that the probabilities listed above are *not* the probabilities for a team to win a specific game; they are the probabilities of a specific scenario occurring. Also, multiple scenarios can occur in a single game, so the probability of multiple scenarios occurring would be the sum of the individual ones.

The Model Results So Far (Updated: June 11, 2015)

**Game 1: Scenario Outcomes: 1 and 2 – GSW win**

**Game 2: Scenario Outcome: 9 – CLE win**

**Game 3: Scenario Outcomes: 5, 8 – CLE win**

Thoughts so far: Despite GSW being down 2-1 right now, I still believe that Cleveland’s wins were statistical anomalies. According to our model, Cleveland’s Game 2 and Game 3 wins only had 1.07%, 9.34%, and 1.765% chances of occurring in this series, whereas the GSW Game 1 win had a 44% chance of occurring in this series.
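As a small worked example of the summation rule stated earlier, the sketch below adds up the scenario probabilities for a game in which more than one scenario occurred. It assumes that the percentages quoted above are listed in the same order as the scenarios (9 for Game 2, then 5 and 8 for Game 3). That mapping is an assumption made for illustration, as is the helper name `combined_probability`.

```python
def combined_probability(scenario_probs):
    """Sum the probabilities (in percent) of the scenarios observed in one game."""
    return sum(scenario_probs.values())


# ASSUMED mapping of scenario number -> probability in percent.
cle_game2 = {9: 1.07}
cle_game3 = {5: 9.34, 8: 1.765}

print(f"Game 2: {combined_probability(cle_game2):.3f}%")
print(f"Game 3: {combined_probability(cle_game3):.3f}%")
```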

**Game 4: Scenario Outcome: 2 – GSW win**

Updated: June 14, 2015

**Game 5: Scenario Outcomes: 1,2 – GSW win**

Thoughts: All of GSW’s wins have been the dominant scenarios in this series, i.e., Outcomes 1 and 2. All of CLE’s wins in this series have been statistical anomalies/outliers. This pattern continued in Game 5.

Updated: June 17, 2015

**Game 6: Scenario Outcomes: 1,2 – GSW win**

Another GSW win through the dominant scenarios in the series, as expected.