As I write this, Cleveland just won the series 4-3. What was behind each team’s wins and losses in this series?

First, Golden State: A correlation plot of their per game predictor variables versus the binary win/loss outcome is as follows:

The key information is in the last column of this matrix:

Evidently, the factors most strongly associated with GSW’s wins were assists, field goals made, field goal percentage, and steals. The factors most strongly associated with GSW’s losses this series were three-point attempts per game (imagine that!) and personal fouls per game.
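For readers who want to try this kind of analysis themselves, here is a minimal Python sketch of correlating per-game predictors with a binary win/loss outcome (a Pearson correlation against a 0/1 variable, also known as the point-biserial correlation). The variable names and all of the numbers are placeholders I made up for illustration; they are not the box scores from this series, and this is not necessarily how the plots above were produced.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Placeholder per-game values for a seven-game series (NOT real box scores).
games = {
    "assists": [29, 26, 19, 25, 18, 20, 22],
    "three_point_attempts": [27, 33, 33, 36, 42, 39, 41],
    "personal_fouls": [18, 20, 23, 21, 24, 25, 23],
}
# Placeholder outcome per game: 1 = win, 0 = loss.
won = [1, 1, 0, 1, 0, 0, 0]

# Correlate every predictor with the binary outcome.
correlations = {name: pearson(values, won) for name, values in games.items()}

# Print predictors from most positively to most negatively correlated.
for name, r in sorted(correlations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>22s}: {r:+.2f}")
```

A positive value means the predictor tended to be higher in wins, a negative value that it tended to be higher in losses. With only seven games, any such correlation should be read as suggestive rather than conclusive.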

Now, Cleveland: A correlation plot of their per game predictor variables versus the binary win/loss outcome is as follows:

The key information is in the last column of this matrix:

Evidently, the most important factor in CLE’s wins was their number of defensive rebounds, followed by three-point shots made and field goal percentage. There were also some weak correlations between Cleveland’s losses and their offensive rebounds and turnovers.

Note that these results are essentially a summary analysis of previous blog postings which tracked individual games. For example, see here, here, and a first attempt here.

I’ve been fascinated by the triangle offense for a long time. I think it is a beautiful way to play basketball, and the right way to play in the half-court: a “system-based” way to play. For those of you who are interested, I highly recommend Tex Winter’s classic book on the topic.

There is also this brief video in which Tex Winter explains how the triangle offense and basketball are grounded in geometric principles:

I don’t think people recognize, though, how deep a geometry problem this actually is. Looking at when the triangle is filled, as in the video above, we have the following situation:

The problem I wanted to study was this: given five players’ random positions on the court, can a system of equations be solved to yield the (x,y) coordinates where the players should “go” to fill the triangle?

Using simple geometry, from the diagram above, we see that each player’s position in the triangle offense is governed by the following system of nonlinear equations:

Further, the angles obviously must satisfy the following constraints:

Finally, we require that each player be about 15-20 feet apart in the triangle offense (because the offense is predicated on spacing), and thus have some additional constraints:

Solving this highly nonlinear system of equations with constraints is not a trivial problem! In fact, because of the high degree of nonlinearity and the dimension of the problem, it is safe to assume that no closed-form solution exists, so the system must be solved numerically.

For this task, I used MATLAB and experimented with the lsqnonlin() and fsolve() commands. The main difficulty is that, as is typical for iterative solvers of this kind, convergence depends heavily on the choice of initial conditions. It is very difficult to choose this many initial conditions a priori, so I wrote a script that randomizes them. I then ran several numerical experiments and obtained the following results:

In the plot above, I have labeled the panels that converged to the triangle formation with the title “this one”. The five black circles denote the initial positions of the players on the court before they fill the triangles in the offense. The diagram alone shows how difficult this problem is to solve mathematically, even with a numerical approach. Running more trials would perhaps yield better results, but it works! I am truly fascinated by this. In the coming days, I will work on optimizing the numerical algorithm and post my updates as they come.
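To give a feel for the randomized-initial-condition strategy, here is a small Python sketch on a deliberately simplified version of the problem. It is not the MATLAB script or the full system of equations described above. It assumes only three players forming a single triangle, a target spacing of 17.5 feet (the midpoint of the 15-20 foot range), and a 50 by 47 foot half-court, and it swaps lsqnonlin()/fsolve() for plain gradient descent on the sum of squared spacing errors. All names in the code are my own.

```python
import math
import random

TARGET_SPACING = 17.5  # feet; assumed midpoint of the 15-20 ft spacing range
PAIRS = [(0, 1), (0, 2), (1, 2)]  # the three sides of one triangle

def loss_and_gradient(points):
    """Sum of squared spacing errors over all pairs, and its gradient."""
    loss = 0.0
    grad = [[0.0, 0.0] for _ in points]
    for i, j in PAIRS:
        dx = points[i][0] - points[j][0]
        dy = points[i][1] - points[j][1]
        dist = math.hypot(dx, dy) or 1e-9  # guard against coincident points
        residual = dist - TARGET_SPACING
        loss += residual ** 2
        gx = 2.0 * residual * dx / dist
        gy = 2.0 * residual * dy / dist
        grad[i][0] += gx
        grad[i][1] += gy
        grad[j][0] -= gx
        grad[j][1] -= gy
    return loss, grad

def descend(start, step=0.1, iterations=2000):
    """Run fixed-step gradient descent from one initial configuration."""
    points = [list(p) for p in start]
    for _ in range(iterations):
        _, grad = loss_and_gradient(points)
        for p, g in zip(points, grad):
            p[0] -= step * g[0]
            p[1] -= step * g[1]
    final_loss, _ = loss_and_gradient(points)
    return points, final_loss

def solve_with_random_restarts(n_restarts=20, seed=0):
    """Try several random starting positions and keep the best result."""
    rng = random.Random(seed)
    best_points, best_loss = None, math.inf
    for _ in range(n_restarts):
        # Random (x, y) positions on an assumed 50 ft x 47 ft half-court.
        start = [(rng.uniform(0, 50), rng.uniform(0, 47)) for _ in range(3)]
        points, final_loss = descend(start)
        if final_loss < best_loss:
            best_points, best_loss = points, final_loss
    return best_points, best_loss

best_points, best_loss = solve_with_random_restarts()
print("best loss:", best_loss)
for x, y in best_points:
    print(f"player at ({x:.2f}, {y:.2f})")
```

This toy version is far better behaved than the full problem, since nearly every random start relaxes to an equilateral triangle. The point is only the structure: draw random starting positions, run a local solver from each, and keep the run with the smallest residual.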

Here is an animation of one of the scenarios above when the algorithm converges correctly:

In the animation above, the black dots represent the positions of the players on the court. They begin at initial (random) positions and attempt to fill the triangles as described above.

As usual, here is the post-game breakdown of Game 2 of the NBA Finals between Cleveland and Golden State. Using my live-tracking app to track the relevant factors (as explained in previous posts), here are the live-captured time series:

Computing the correlations between each time series above and the Golden State Warriors point difference, we obtain:

One sees once again that the factors most relevant to GSW’s point difference in the game were CLE’s personal fouls, GSW’s personal fouls, and, not far behind, GSW’s 3-point percentage. What is interesting is that one can see the importance of these variables play out in real time by matching the two graphs above.

In fact, looking at the personal fouls vs. GSW point difference in real time (essentially taking a subset of the time series graph above), we obtain:

Using my live-tracking app combined with the relevant factors based on this previous work, here is my breakdown of what contributed to the Warriors’ win in Game 1 of the NBA Finals.

First, here is the time series graph of several predictor variables:

Breaking this down a bit further, we have:

Computing the correlations, we obtain:

For the graphically inclined:

One sees that the predictor variable correlated most positively with the Warriors’ lead was the number of fouls Cleveland committed. Evidently, then, the most important factor in GSW winning Game 1 was the rate and number of fouls committed by Cleveland during the game.