The Trump Rally, Really?

Today, The Dow Jones Industrial Average (DJIA) surpassed the 20,000 mark for the first time in history. At the time of the writing of this posting (12:31 PM on January 25), it is actually 20,058.29, so, I am not sure if it will close above 20,000 points, but, nevertheless, a lot of people are crediting this to Trump’s presidency, but I’m not so sure you can do that. First, the point must be made, that it is really the Obama economic policies that set the stage for this. On January 20, 2009, when Obama was sworn in, the Dow closed at 7949.089844 points. On November 8, 2016, when Trump won the election, the Dow closed at 18332.74023. So, during the Obama administration, the Dow increased by approximately 130.63%. I just wanted to make that point.

Now, the question that I wanted to investigate was would the Dow have closed past 20,000 points had Trump not been elected president. That is, assuming that the Obama administration policies and subsequent effects on the Dow were allowed to continue, would the Dow have surpassed 20,000 points.

For this, I looked at the DJIA data from January 20, 2009 (Obama’s first inauguration) to November 08, 2016 (Trump’s election). I specifically calculated the daily returns and discovered that they are approximately normally distributed using a kernel density method:

obamadowpdf

Importantly, one can calculate that the mean daily returns, \mu = 0.00045497596503813, while the volatility in daily returns, \sigma = 0.0100872666938282. Indeed, the volatility in daily returns for the DJIA was found to be relatively high during this period. Finally, the DJIA closed at 18332.74023 points on election night, November 08, 2016, which was 53 business days ago.

The daily dynamics of the DJIA can be modelled by the following stochastic differential equation:

S_{t} = S_{t-1} + \mu S_{t-1} dt + \sigma S_{t-1} dW,

where dW denotes a Wiener/Brownian motion process. Simulating this on computer, I ran 2,000,000 Monte Carlo simulations to simulate the DJIA closing price 53 business days from November 08, 2016, that is, January 25, 2017. The results of some of these simulations are shown below:

djiaclosingvaluesims

We concluded the following from our simulation. At the end of January 25, 2017, the DJIA was predicted to close at:

18778.51676 \pm 1380.42445

That is, the DJIA would be expected to close anywhere between 17398.0923062336 and 20158.94121. This range, albeit wide, is due to the high volatility of the daily returns in the DJIA, but, as you can see, it is perfectly feasible that the DJIA would have surpassed 20,000 points if Trump would not have been elected president.

Further, perhaps what is of more importance is the probability that the DJIA would surpass 20,000 points at any time during this 54-day period. We found the following:

probofexceeding

One sees that there is an almost 20% (more precisely, 18.53%) probability that the DJIA would close above 20,000 points on January 25, 2017 had Trump not been elected president. Since, by all accounts, the DJIA exceeding 20,000 points is considered to be an extremely rare/historic event, the fact that the probability is found to be almost 20% is actually quite significant, and shows, that it is quite likely that a Trump administration actually has little to do with the DJIA exceeding 20,000 points.

Although, this simulation was just for 53 working days from Nov 08, 2016, one can see that the probability of the DJIA exceeding 20,000 at closing day is monotonically increasing with every passing day. It is therefore quite feasible to conclude that Trump being president actually has little to do with the DJIA exceeding 20,000 points, rather, one can really attribute it to the day-to-day volatility of the DJIA!

Advertisements

The Relationship Between The Electoral College and Popular Vote

An interesting machine learning problem: Can one figure out the relationship between the popular vote margin, voter turnout, and the percentage of electoral college votes a candidate wins? Going back to the election of John Quincy Adams, the raw data looks like this:

Electoral College Party Popular vote  Margin (%)

Turnout

Percentage of EC

John Quincy Adams D.-R. -0.1044 0.27 0.3218
Andrew Jackson Dem. 0.1225 0.58 0.68
Andrew Jackson Dem. 0.1781 0.55 0.7657
Martin Van Buren Dem. 0.14 0.58 0.5782
William Henry Harrison Whig 0.0605 0.80 0.7959
James Polk Dem. 0.0145 0.79 0.6182
Zachary Taylor Whig 0.0479 0.73 0.5621
Franklin Pierce Dem. 0.0695 0.70 0.8581
James Buchanan Dem. 0.12 0.79 0.5878
Abraham Lincoln Rep. 0.1013 0.81 0.5941
Abraham Lincoln Rep. 0.1008 0.74 0.9099
Ulysses Grant Rep. 0.0532 0.78 0.7279
Ulysses Grant Rep. 0.12 0.71 0.8195
Rutherford Hayes Rep. -0.03 0.82 0.5014
James Garfield Rep. 0.0009 0.79 0.5799
Grover Cleveland Dem. 0.0057 0.78 0.5461
Benjamin Harrison Rep. -0.0083 0.79 0.58
Grover Cleveland Dem. 0.0301 0.75 0.6239
William McKinley Rep. 0.0431 0.79 0.6063
William McKinley Rep. 0.0612 0.73 0.6532
Theodore Roosevelt Rep. 0.1883 0.65 0.7059
William Taft Rep. 0.0853 0.65 0.6646
Woodrow Wilson Dem. 0.1444 0.59 0.8192
Woodrow Wilson Dem. 0.0312 0.62 0.5217
Warren Harding Rep. 0.2617 0.49 0.7608
Calvin Coolidge Rep. 0.2522 0.49 0.7194
Herbert Hoover Rep. 0.1741 0.57 0.8362
Franklin Roosevelt Dem. 0.1776 0.57 0.8889
Franklin Roosevelt Dem. 0.2426 0.61 0.9849
Franklin Roosevelt Dem. 0.0996 0.63 0.8456
Franklin Roosevelt Dem. 0.08 0.56 0.8136
Harry Truman Dem. 0.0448 0.53 0.5706
Dwight Eisenhower Rep. 0.1085 0.63 0.8324
Dwight Eisenhower Rep. 0.15 0.61 0.8606
John Kennedy Dem. 0.0017 0.6277 0.5642
Lyndon Johnson Dem. 0.2258 0.6192 0.9033
Richard Nixon Rep. 0.01 0.6084 0.5595
Richard Nixon Rep. 0.2315 0.5521 0.9665
Jimmy Carter Dem. 0.0206 0.5355 0.55
Ronald Reagan Rep. 0.0974 0.5256 0.9089
Ronald Reagan Rep. 0.1821 0.5311 0.9758
George H. W. Bush Rep. 0.0772 0.5015 0.7918
Bill Clinton Dem. 0.0556 0.5523 0.6877
Bill Clinton Dem. 0.0851 0.4908 0.7045
George W. Bush Rep. -0.0051 0.51 0.5037
George W. Bush Rep. 0.0246 0.5527 0.5316
Barack Obama Dem. 0.0727 0.5823 0.6784
Barack Obama Dem. 0.0386 0.5487 0.6171

Clearly, the percentage of electoral college votes a candidate depends nonlinearly on the voter turnout percentage and popular vote margin (%) as this non-parametric regression shows:

electoralmap.png

We therefore chose to perform a nonlinear regression using neural networks, for which our structure was:

nnetplot

As is turns out, this simple neural network structure with one hidden layer gave the lowest test error, which was 0.002496419 in this case.

Now, looking at the most recent national polls for the upcoming election, we see that Hillary Clinton has a 6.1% lead in the popular vote. Our neural network model then predicts the following:

Simulation Popular Vote Margin Percentage of Voter Turnout Predicted Percentage of Electoral College Votes (+/- 0.04996417)
1 0.061 0.30 0.6607371
2 0.061 0.35 0.6647464
3 0.061 0.40 0.6687115
4 0.061 0.45 0.6726314
5 0.061 0.50 0.6765048
6 0.061 0.55 0.6803307
7 0.061 0.60 0.6841083
8 0.061 0.65 0.6878366
9 0.061 0.70 0.6915149
10 0.061 0.75 0.6951424

One sees that even for an extremely low voter turnout (30%), at this point Hillary Clinton can expect to win the Electoral College by a margin of 61.078% to 71.07013%, or 328 to 382 electoral college votes. Therefore, what seems like a relatively small lead in the popular vote (6.1%) translates according to this neural network model into a large margin of victory in the electoral college.

One can see that the predicted percentage of electoral college votes really depends on popular vote margin and voter turnout. For example, if we reduce the popular vote margin to 1%, the results are less promising for the leading candidate:

Pop.Vote Margin Voter Turnout % E.C. % Win E.C% Win Best Case E.C.% Win Worst Case
0.01 0.30 0.5182854 0.4675000 0.5690708
0.01 0.35 0.5244157 0.4736303 0.5752011
0.01 0.40 0.5305820 0.4797967 0.5813674
0.01 0.45 0.5367790 0.4859937 0.5875644
0.01 0.50 0.5430013 0.4922160 0.5937867
0.01 0.55 0.5492434 0.4984580 0.6000287
0.01 0.60 0.5554995 0.5047141 0.6062849
0.01 0.65 0.5617642 0.5109788 0.6125496
0.01 0.70 0.5680317 0.5172463 0.6188171
0.01 0.75 0.5742963 0.5235109 0.6250817

One sees that if the popular vote margin is just 1% for the leading candidate, that candidate is not in the clear unless the popular vote exceeds 60%.

 

2016 Real-Time Election Predictions

Further to my original post on using physics to predict the outcome of the 2016 US Presidential elections, I have now written a cloud-based app using the powerful Wolfram Cloud to pull the most recent polling data on the web from The HuffPost Pollster, which “tracks thousands of public polls to give you the latest data on elections, political opinions and more”.  This app works in real-time and applies my PDE-solver / machine learning based algorithm to predict the probability of a candidate winning a state assuming the election is held tomorrow.

The app can be accessed by clicking the image below: (Note: If you obtain some type of server error, it means Wolfram’s server is busy, a refresh usually works. Also, results are only computed for states for which there exists reliable polling data. )

 

Will Donald Trump’s Proposed Immigration Policies Curb Terrorism in The US?

In recent days, Donald Trump proposed yet another iteration of his immigration policy which is focused on “Keeping America Safe” as part of his plan to “Make America Great Again!”. In this latest iteration, in addition to suspending visas from countries with terrorist ties, he is also proposing introducing an ideological test for those entering the US. As you can see in the BBC article, he is also fond of holding up bar graphs of showing the number of refugees entering the US over a period of time, and somehow relates that to terrorist activities in the US, or at least, insinuates it.

Let’s look at the facts behind these proposals using the available data from 2005-2014. Specifically, we analyzed:

  1. The number of terrorist incidents per year from 2005-2014 from here (The Global Terrorism Database maintained by The University of Maryland)
  2. The Department of Homeland Security Yearbook of Immigration Statistics, available here . Specifically, we looked at Persons Obtaining Lawful Permanent Resident Status by Region and Country of Birth (2005-2014) and Refugee Arrivals by Region and Country of Nationality (2005-2014).

Given these datasets, we focused on countries/regions labeled as terrorist safe havens and state sponsors of terror based on the criteria outlined here .

We found the following.

First, looking at naturalized citizens, these computations yielded:

Country

Correlations

Percent of Variance Explained 

Afghanistan

0.61169

0.37416

Egypt

0.26597

0.07074

Indonesia

-0.66011

0.43574

Iran

-0.31944

0.10204

Iraq

0.26692

0.07125

Lebanon

-0.35645

0.12706

Libya

0.59748

0.35698

Malaysia

0.39481

0.15587

Mali

0.20195

0.04079

Pakistan

0.00513

0.00003

Phillipines

-0.79093

0.62557

Somalia

-0.40675

0.16544

Syria

0.62556

0.39132

Yemen

-0.11707

0.01371

In graphical form:

The highest correlations are 0.62556 and 0.61669 from Syria and Afghanistan respectively. The highest anti-correlations were from Indonesia and The Phillipines at -0.66011 and -0.79093 respectively. Certainly, none of the correlations exceed 0.65, which indicates that there could be some relationship between the number of naturalized citizens from these particular countries and the number of terrorist incidents, but, it is nowhere near conclusive. Further, looking at Syria, we see that the percentage of variance explained / coefficient of determination is 0.39132, which means that only about 39% of the variation in the number of terrorist incidents can be predicted from the relationship between where a naturalized citizen is born and the number of terrorist incidents in The United States.

Second, looking at refugees, these computations yielded:

Country

Correlations

Percent of Variance Explained

Afghanistan

0.59836

0.35803

Egypt

0.66657

0.44432

Iran

-0.29401

0.08644

Iraq

0.49295

0.24300

Pakistan

0.60343

0.36413

Somalia

0.14914

0.02224

Syria

0.56384

0.31792

Yemen

-0.35438

0.12558

Other

0.54109

0.29278

In graphical form:

We see that the highest correlations are from Egypt (0.6657), Pakistan (0.60343), and Afghanistan (0.59836). This indicates there is some mild correlation between refugees from these countries and the number of terrorist incidents in The United States, but it is nowhere near conclusive. Further, the coefficients of determination from Egypt and Syria are 0.44432 and 0.31792 respectively. This means that in the case of Syrian refugees for example, only 31.792% of the variation in terrorist incidents in the United States can be predicted from the relationship between a refugee’s country of origin and the number of terrorist incidents in The United States.

In conclusion, it is therefore unlikely that Donald Trump’s proposals would do anything to significantly curb the number of terrorist incidents in The United States. Further, repeatedly showing pictures like this:

at his rallies is doing nothing to address the issue at hand and is perhaps only serving as yet another fear tactic as has become all too common in his campaign thus far.

(Thanks to Hargun Singh Kohli, Honours B.A., LL.B. for the initial data mining and processing of the various datasets listed above.)

Note, further to the results of this article, I was recently made aware of this excellent article from The WSJ, which I have summarized below:

Some Thoughts on The US GDP

Here are some thoughts on the US GDP based on some data I’ve been looking at recently, mostly motivated by some Donald Trump supporters that have been criticizing President Obama’s record on the GDP and the economy. 

First, analyzing the real GDP’s average growth per year, we obtain that (based on a least squares regression analysis)

According to these calculations, President Clinton’s economic policies led to the best average GDP growth rate at $436 Billion / year. President Reagan and President Obama have almost identical average GDP growth rates in the neighbourhood of $320 Billion / year. However, an obvious caveat is that President Obama’s GDP record is still missing two years of data, so I will need to revisit these calculations in two years! Also, it should be noted that, historically, the US GDP has grown at an average of about $184 Billion / year. 

The second point I wanted to address is several Trump supporters who keep comparing the average real GDP annual percentage change between President Reagan and President Obama. Although they are citing the averages, they are not mentioning the standard deviations! Computing these we find that:


Looking at these calculations, we find that Presidents Clinton and Obama had the most stable growth in year-to-year real GDP %. Presidents Bush and Reagan had highly unstable GDP growth, with President Bush’s being far worse than President Reagan’s. Further, Trump supporters and most Republicans seem quick to point out the mean of 3.637% figure associated with President Reagan, but the point is this is +/- 2.55%, which indicates high volatility in the GDP under President Reagan, which has not been the case under President Obama. 

Another observation I would like to point out is that very few people have been mentioning the fact that the annual real US GDP % is in fact correlated to that of other countries. Based on data from the World Bank, one can compute the following correlations: 


One sees that the correlation between the annual growth % of the US real GDP and Canada is 0.826, while for Estonia and The UK is roughly close to 0.7. Therefore, evidently, any President that claims that his policies will increase the GDP, is not being truthful, since, it is quite likely that these numbers also depend on those for other countries, which, I am not entirely  convinced a US President has complete control over!

My final observation is with respect to the quarterly GDP numbers. There are some articles that I have seen in recent days in addition to several television segments in which Trump supporters are continuously citing how better Reagan’s quarterly GDP numbers were compared to Obama’s. We now show that in actuality this is not the case. 

The problem is that most of the “analysts” are just looking at the raw data, which on its face value actually doesn’t tell you much, since, as expected, fluctuates. Below, we analyze the quarterly GDP% data during the tenure of both Presidents Reagan and Obama, from 1982-1988 and 2010-2016 respectively, comparing data from the same length of time. 

For Reagan, we obtain: 


For Obama, we obtain:


The only way to reasonably compare these two data sets is to analyze the rate at which the GDP % has increased in time. Since the data is nonlinear in time, this means we must calculate the derivatives at instants of time / each quarter. We first performed cubic spline interpolation to fit curves to these data sets, which gave extremely good results: 


We then numerically computed the derivative of these curves at each quarter and obtained: 

The dashed curves in the above plot are plots of the derivatives of each curve at each quarter. In terms of numbers, these were found to be: 


Summarizing the table above in graphical format, we obtain: 


As can be calculated easily, Obama has higher GDP quarterly growth numbers for 15/26 (57.69%) quarters. Therefore, even looking at the quarterly real GDP numbers, overall, President Obama outperforms President Reagan. 

Thanks to Hargun Singh Kohli, B.A. Honours, LL.B. for the data collection and processing part of this analysis.