Deep Learning the Stock Market, from the Perspective of a HFT Developer

A couple of days ago, I spoke with Tom Conerly, an alum of my high school who now works as a developer at the secretive trading research firm Jump Trading. As part of my research project on Word2Vec for market predictions, I want to speak with people who have actual working experience in quantitative trading, developing trading algorithms, or general asset management. I am still looking for more people to talk to, so if you have any suggestions or references, please let me know!
Anyway, back to the interview. Since trading research companies operate on the premise of having slight strategic advantages over their competitors, I didn't expect to get any concrete findings about Jump Trading's algorithms. Rather, I wanted to get a general idea of how Jump Trading works, and whether or not they have faith in using deep learning for making predictions. Here are the questions and answers from my interview, with the answers paraphrased since I wasn't allowed to record the conversation:
S.A: What is life truly like as a newcomer in a trading firm? Is it actually like the horror stories people write about?
T.C: Jump Trading is divided into separate, small trading teams which all work on unique trading strategies. Hence, the hours aren't crazy, and it comes off more as a research job, without much variance. Investment banking is definitely different, as it is more of a high-stress environment involving watching real-time stock prices and trades go through. That is where one could see more hazing and working people hard, but not so much at a research company like Jump Trading.
S.A: Working at a company focused on algorithmic trading strategies, is there a stronger emphasis on mathematical modeling, such as statistical probabilities, or on deep learning strategies when creating your algorithms?
T.C: Again, Jump Trading is not a bunch of traders watching markets in real time or using software to make trades by hand. There is less 'human' involvement in the trading [we] do, as the people work on developing and testing the most efficient, profitable HFT strategies. In this development, there is definitely a strong emphasis on mathematical modeling; machine learning is used, but only simpler models such as linear regression. With deep learning, there is a continuous trade-off between how powerful the model is and the negative impact of stock market noise. This is why deep learning is uncommon in HFT.
S.A: What’s your opinion on using deep learning to recognize patterns in the behavior of markets? Do you think this could be a profitable strategy? Some say the behavior of the markets don’t follow easily-recognizable patterns, others say they work in identical cycles. 
T.C: [My] intuition is that this is not the right direction to go, again because the stock market tends to be incredibly noisy. If you are trading at high frequency, which often means holding assets for less than a minute, deep learning won't be particularly helpful. However, deep learning could be a viable hedge-fund strategy, as hedge funds tend to work on a more macro scale and hold assets for much longer periods of time. There could well be a quant trading company doing this right now.
S.A: What’s one factor you guys consider most heavily when gauging the value of an asset (say: stock, bond, future, currency)?
T.C: When [we] look at an asset, [we] don't take into account revenue, profits, or general fundamental analysis, as [we] believe all of these factors are already reflected in the asset's price. The analysis done is more directed towards the real-time state of the market, and one primary way of doing this is by analyzing book pressure (how the number of buyers at a given point in time compares to the number of sellers). But this is not to say this is the only way to gauge the value of assets, as there are many different ways to approach trading. In [our] case, though, the focus is on smaller profits in larger quantities.
S.A: What’s the most important factor to consider when creating a trading algorithm?
T.C: It really depends what data you're building your trading algorithm on. For [us], like I said, book pressure and recent vs. expected prices are important, while we don't pay too much attention to what comes out of company earnings reports.
S.A: How much do market cycles influence your firm’s trading strategies?
T.C: They basically don't. There are some trading conditions which might affect you, but it really is more of a small-scale approach for [us]. Market cycles and macro trends are more hedge-fund oriented, where investors go long/short on bigger bets over longer time frames. In trading over a shorter time period, on the other hand, you have more valuable data to train on. For example, if you are trying to use data to make long-term predictions, you don't really have much digitized data from before the 1990s, which limits the scope and ability you have to train your neural net.
But, on the other hand, hedge-funds can take all kinds of approaches toward making investments, giving them more varied sources of data. 


What can we take away from this? 

Though high frequency trading differs in many ways from using deep learning for financial pattern recognition, it is interesting (and definitely valuable) to hear from these different perspectives. This interview aligns well with my previous post, where I talk about the potential shortcomings of deep learning for trading algorithms. Mr. Conerly seems to agree with the idea that it isn't profitable to replace human trading experts with powerful deep learning tools, such as Word2Vec. But, on the other hand, Mr. Conerly did acknowledge that some secretive hedge/quant funds may very well be using these types of AI tools for their projections. He adds that HFT uses a lot more data than macro-analysis, since it analyzes prices on a second-to-second (more like millisecond-to-millisecond) basis. This makes me think of the pros/cons of testing my Word2Vec net on macro-scale data vs. micro-scale data. Micro-scale analysis, according to Mr. Conerly, would give you more data to work with (and a more accurate neural net as a result), but the predictions would be less significant and meaningful since they're tailored to a smaller scale. Macro-scale analysis would yield bigger, more significant predictions (larger margin for error, bigger margin for profit) but would presumably have less data available for training.
Another thing I found really interesting and applicable to my work is Mr. Conerly's point about how they [developers at Jump Trading] don't take fundamental data, or time, into account when investing. His point was that fundamental data doesn't need to be analyzed because this data is already reflected in the stock's price (the efficient market hypothesis), and that specific time intervals are irrelevant when predicting whether a stock will go up or down. Disregarding fundamental data (earnings, P/E ratios, company debt, ROE, etc.) and time intervals in a training set would save enormous amounts of computational power, money, and time, so this would be ideal. But this brings along another question: just how efficient is the efficient market hypothesis?
These questions that are starting to emerge are more subjective and uncertain than the ones I asked in my last post, so again, I am not anticipating clear answers. Once again, it is seeming as if these are questions I’ll have to answer on my own, through my research. 
Looking forward, it is now clear that I need to find some actual data sets and start training, predicting (back testing for now), and taking notes of the results. I will need to develop a more explicit research plan, which I will post sometime this week. 

The Shortcomings of Neural Networks for Trading Predictions

As someone who is devoting a large portion of their senior year (and very likely time beyond that) to researching potential applications of deep learning in trading, I wasn't thrilled to learn about the recent shortcomings of quantitative traders. Let's begin with Marcos López de Prado, a frequently cited algorithmic trader who recently published Advances in Financial Machine Learning. One thing that de Prado talks about is the idea of 'red-herring patterns' that are extrapolated by machine learning algorithms. These types of algorithms are, by design, created to analyze large bodies of data and identify patterns within this data. In fact, this idea of noticing patterns is one of the main assumptions I am basing my work on (using Word2Vec embeddings to identify past financial patterns and apply them to real-time data for more accurate predictions). But what happens when these algorithms identify patterns that aren't real? An aggressive neural network (in my case: one which adjusts vector weights heavily while learning from data) is prone to making these types of mistakes. Think of this example: a stock happens to go up a couple of percentage points every Thursday for three weeks in a row. A (poorly written) neural network would deduce that this stock will go up by at least a percentage point or two every Thursday in the future. Now, this is easily avoidable by training a trading algorithm on larger sets of data, but even large data sets are prone to these types of red herrings. Once a trading algorithm latches onto a pattern, it can backfire horribly when that pattern eventually breaks.
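To make this concrete for myself, here is a minimal sketch (in Python, with synthetic, made-up returns) of how a model fit on only three weeks of data can latch onto a Thursday "pattern" that is pure coincidence. The data and the logistic-regression stand-in for a neural net are my own assumptions, purely to illustrate the red-herring problem:

```python
# Minimal sketch of a "red-herring" pattern: a model trained on three weeks of
# synthetic returns learns a Thursday effect that exists only by chance.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Three weeks of daily returns (Mon-Fri), pure noise except that we force the
# three Thursdays to be up -- the coincidence described in the paragraph above.
days = np.tile(np.arange(5), 3)                         # 0=Mon ... 4=Fri
returns = rng.normal(0, 0.01, size=15)                  # random daily returns
returns[days == 3] = np.abs(returns[days == 3]) + 0.02  # Thursdays happen to be up

X = np.eye(5)[days]                 # one-hot day-of-week features
y = (returns > 0).astype(int)       # label: did the stock go up that day?

model = LogisticRegression().fit(X, y)

# The model is now very confident about "up on Thursday", even though nothing
# about Thursdays caused these moves in the synthetic data.
thursday = np.eye(5)[[3]]
print("P(up | Thursday):", model.predict_proba(thursday)[0, 1])
```

With only 15 data points, the model becomes extremely confident about Thursdays even though nothing real is driving them, which is exactly the kind of false pattern de Prado warns about.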
This brings the idea of Black Swans into light. The theory of Black Swans was popularized by Nassim Taleb in his aptly titled book The Black Swan: The Impact of the Highly Improbable. The general gist of this theory is that the most profoundly impactful events are often the ones we least expect, due to our fallacious tendencies in analyzing statistics (I will go into more detail on these topics in a future blog post, once I am done reading the whole book). Taleb argues that one of our biggest shortcomings in analyzing data is creating 'false narratives', which are more convenient and easier to sell to clients. These false narratives often omit crucial data (silent data), which backfires once the narrative breaks.
But, on the other end, a more passive neural network (one which adjusts vector weights more gradually) can sometimes come to no meaningful conclusions, which means wasted time and computational energy. I want to create a Word2Vec model which can detect patterns, but I also don't want it to actively follow patterns with no longevity.
So, what does one do? How aggressive/passive should I make my Word2Vec neural network? 

Another theory which I encountered over the weekend is the idea of survivorship bias. In training neural networks, how do we treat data from companies which have failed? If we are analyzing stock price data for various important stocks over time, what do we do with data from once-important stocks which are now defunct, such as Lehman Brothers? I initially thought it would be best to throw this data out, since it is no longer applicable, but it turns out this strategy can have negative consequences. If we only train our network on stocks which have survived, then we will miss out on crucial data about when stocks go bankrupt. So, how do we properly treat this type of data?
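Here is a tiny, hypothetical illustration of why throwing that data out is dangerous: if the backtest universe contains only the survivors, the measured average return looks much better than what an investor holding the full universe (Lehman included) would actually have experienced. The tickers and returns below are invented:

```python
# Toy illustration of survivorship bias: averaging returns over only the
# stocks that survived overstates the performance of the full universe.
import pandas as pd

returns_2008 = pd.DataFrame({
    "ticker":   ["AAPL", "XOM", "JPM", "LEH"],   # LEH = Lehman Brothers
    "return":   [-0.35, -0.15, -0.25, -1.00],    # made-up yearly returns
    "survived": [True, True, True, False],       # Lehman went bankrupt
})

survivors_only = returns_2008.loc[returns_2008["survived"], "return"].mean()
full_universe  = returns_2008["return"].mean()

print(f"Average return, survivors only: {survivors_only:.1%}")  # looks better
print(f"Average return, full universe:  {full_universe:.1%}")   # the real picture
```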


All of these seemingly insignificant flaws in trading algorithms can evoke catastrophic mistakes. This concept is summed up by quantitative investment officer Nigol Koulajian: "You can have one little pindrop that can basically make you lose over 20 years of returns." This 'little pindrop' which Koulajian mentions is the eventual divergence from the false patterns identified by neural networks. I personally think it would take more than a little pindrop to erase 20 years of returns, but the idea still stands. So, this warrants the question: how do we avoid the little pindrop? My (far-fetched?) theory is that you can use neural networks to estimate worst-case scenarios in the same way they are designed to estimate best-case scenarios, and then work to avoid them.
In broader terms, Bloomberg reports that the Eurekahedge index which tracks the returns of hedge funds known for using machine learning has underperformed the S&P 500 year after year. The harsh truth (right now) is that simply investing in the S&P 500 will return ~13% yearly, while machine-learning-based hedge funds return ~9% yearly.
(Chart: Eurekahedge Hedge Fund Index)
(The keen observer will notice that despite all the noise, the index has been steadily going up over the past 7 years)
These are some of the questions I pose to the few who read what I am writing, and they are the types of questions I will ask through my personal research interviews (Good News! I have my first interview scheduled this upcoming Tuesday, and, interviewee permitting, I will post a summary of our talk later in the week).
In my personal opinion, the recent underperformance of trading algorithms in general is not a bad sign. This is still a relatively new field, meaning that more research needs to be done and new discoveries need to be made. I think of it this way: if trading algorithms were already working perfectly, then what would be the point of a newcomer (like me) coming in and doing research on them? If it ain't broke, don't fix it.

Significance of Gold’s Golden Week + Brainstorming Investment Strategies

Categorizing my Work

This past week, I have been delving deeper into my research on word embeddings and trading algorithms. My research revealed many new things to me (some negative, some positive, but I consider all lessons learned to be positive), which I will describe in greater detail later on. The first point I wanted to share is how I plan on structuring my work, as I currently have (what I believe to be) a sufficient amount of background information to get meaningful work done on the coding front.
As mentioned in previous posts, there are two primary components to my research work:

  1. Word2Vec for creating embeddings
  2. Trading Algorithm strategies

I will be working on these two components in parallel throughout the year. This is preferable to working on single topics for large blocks of time, as working simultaneously on Word2Vec and trading algorithms helps with finding useful correlations. For instance, last week I read a super interesting article on zerohedge.com which pointed out a recurring trend during China's Golden Week holiday.

Interesting Applications for Neural Networks

Every year, China observes Golden Week, a week-long national holiday at the start of October. During this time, the Chinese stock exchanges close, as people travel home for festivities. Interestingly enough, writers at Zerohedge noticed that every year during this week, prices of precious metals fall considerably, only to increase sharply as soon as the Chinese markets reopen.

To test this theory, I went and looked at recent gold prices. China’s Golden Week of 2018 began October 2nd, and went to October 9.
Not surprisingly, gold prices lost about $22 an ounce during this week, but have jumped $40 an ounce since then.
I mention this example because it would be a perfect place to implement neural networks. This is a clearly recurring pattern which occurs during the same time period (China's Golden Week happens on the same dates every year, but what about holidays/events whose dates change year to year?), and a recurrent neural network should be able to figure this out on its own. I'm thinking of using this data as an entry point into incorporating AI into my trading algorithm, as it is a clearly defined task: train the network on this data (plus other, unrelated gold-price data to make the set more varied), and ask it to predict what will happen to gold prices starting October 1, 2019.

Questions to Consider

This may sound simple, but there are a lot of factors which come into play. For instance, how would we incorporate times (dates) into the neural network? What's the purpose of having a neural network recognize a financial pattern without knowing when it will appear? What exact data would we use for training the network? Gold prices and dates are necessary, but how would we convert these into inputs for an RNN? Would it make sense to use Word2Vec for this?
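For the date question specifically, one idea I might experiment with (only a sketch, not a settled answer) is encoding the day of year cyclically, so the network sees that dates near each other, including across year boundaries, are similar. Assuming a simple list of dates and invented gold prices, the input sequence for an RNN could look roughly like this:

```python
# Sketch: turning dates into cyclical features an RNN could ingest alongside prices.
# Assumes a simple list of (date, price) pairs; the prices here are invented.
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "date":  pd.to_datetime(["2018-09-28", "2018-10-01", "2018-10-05", "2018-10-10"]),
    "price": [1192.0, 1189.0, 1203.0, 1194.0],   # hypothetical gold prices ($/oz)
})

# Encode day-of-year as a point on a circle so the model "sees" that dates
# near each other (and near year boundaries) are similar.
day_of_year = data["date"].dt.dayofyear
data["doy_sin"] = np.sin(2 * np.pi * day_of_year / 365.25)
data["doy_cos"] = np.cos(2 * np.pi * day_of_year / 365.25)

# Use normalized price changes rather than raw prices, then stack everything
# into the (timesteps, features) array an RNN layer would expect for one sequence.
data["pct_change"] = data["price"].pct_change().fillna(0)
sequence = data[["pct_change", "doy_sin", "doy_cos"]].to_numpy()
print(sequence.shape)   # (4 timesteps, 3 features)
```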
These are all questions that I need to start experimenting with, and trying to answer on my own. I’ve tried researching the answers online, but there is little to no consensus on answers, as people genuinely don’t know what the best approach is (and those who DO know definitely won’t share with the rest of the world). As I’ve come to realize through my research on trading algorithms in general, there are infinitely many different approaches you can take, and equally as many trading strategies.
Some other interesting trading strategies that I’ve researched are:
1. Mean-Reversion
A fairly basic investing strategy which assumes that the worst-performing stocks one week will perform best the next week, and vice versa (a rough sketch of this idea appears after this list).
I am currently writing a sample algorithm of this strategy using Quantopian's Python IDE.
2. Sentiment analysis 
The theory that one can use investor sentiment (calculated using NLP neural networks applied to Twitter/Stocktwits data) to make accurate predictions about which stocks will go up or down (positive sentiment -> buy, negative sentiment -> sell).
The Word2Vec search I’m currently writing will hopefully be able to do some basic sentiment analysis.
3. Selling/buying based on large-cap hedge fund involvement
Apparently, a super effective strategy in recent years has been to buy or sell stocks based on how underweight or overweight they are. An underweight stock is a stock which has a large short positioning by large-scale hedge funds relative to the stock's 'weight' (influence) in the S&P 500. An overweight stock is just the opposite. It appears that buying underweight stocks while shorting overweight stocks is a strong investing strategy, one which has consistently generated positive returns since the 2008 crash. One doesn't really need a neural network to do this, but I thought it was worth noting.
But there are many more investing strategies to consider. The main question is: which ones would be most applicable to the Word2Vec model I want to implement?
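As a concrete reference point for strategy 1, here is the rough sketch I mentioned above. It is not the Quantopian algorithm I am writing, just the core ranking idea with invented weekly returns:

```python
# Rough sketch of the weekly mean-reversion idea from strategy 1:
# rank stocks by last week's return, buy the losers, short the winners.
# The returns are invented; this is not the actual Quantopian implementation.
import pandas as pd

last_week_returns = pd.Series({
    "AAPL":  0.04,
    "MSFT":  0.02,
    "GE":   -0.05,
    "F":    -0.03,
    "XOM":   0.01,
})

n = 2  # how many names to take on each side
ranked = last_week_returns.sort_values()

longs  = ranked.head(n).index.tolist()   # worst performers -> expect a bounce
shorts = ranked.tail(n).index.tolist()   # best performers  -> expect a pullback

print("Buy: ", longs)    # ['GE', 'F']
print("Sell:", shorts)   # ['MSFT', 'AAPL']
```

A real version would obviously need universe selection, position sizing, and transaction costs, but the core logic is just this sort.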

Reflecting back on the week

I think my strategy of implementing Word2Vec-created embeddings into a trading algorithm has potential, but only I can figure out whether this is true. This type of ambiguous work is hard for me to grasp right now, as none of my questions have ready-made answers; one must figure them out for oneself. But I don't want to get bogged down in the research, so I will try to choose one definite strategy (fundamental dataset) this upcoming week so I can get to tangible work.
There is always more learning to be done, and I will continue to research online, through books (I'm about halfway through The Black Swan by Nassim Taleb; I will write a reflection post once I'm done, as there are some super interesting parallels between my current work and what Mr. Taleb writes about) and through interviews.
 
 


 

Drawing Inspiration from Market2Vec + Outline of my Original Trading Algorithm

Recently, I stumbled upon an article written by Tal Perry, founder of lighttag.io, which detailed his attempt to use Word2Vec on market data. He called it Market2Vec, and the basis behind his work was to create stock embeddings, rather than word embeddings, using the Word2Vec framework. The data he used was open/close and high/low prices for 1000 different stocks, and he transformed this data into a single 300-dimensional input vector. (*Right off the bat, I notice that I need to look into how one would condense a vector with thousands of dimensions into one with <=300 dimensions, 300 being a typical embedding size for Word2Vec models.) Using these condensed input vectors, he was able to pass them through a recurrent neural network which outputs the probability of future activity of these stocks. (There is a lot more going on here, and I am simply brushing over it to get his idea across; I would recommend reading his article for the details.) This 'probability' needs to be defined, and it can follow pretty much any definition tailored to the data you want to model. For example, you could train the neural network to output a '1' if a certain index goes up 1% in a certain amount of time, and a '0' if it doesn't. Then the neural network, using the data you provide it, would return the probability of a 1 or a 0 happening in a given context. This is incredibly useful, as the approach taken by Mr. Perry can be applied to almost any input vector and used to predict changes in any data set (say, the S&P 500 rather than the VIX, which is what he tried to predict).
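Regarding the starred question above (how to condense thousands of raw price features into 300 or fewer dimensions), one simple option I could try is principal component analysis. I don't know whether this is what Mr. Perry actually did; the sketch below just uses random numbers in place of real open/close/high/low data to show the mechanics:

```python
# Sketch of one way to condense a wide market snapshot (open/close/high/low for
# ~1000 stocks = ~4000 raw features) into a 300-dimensional vector per day.
# Random numbers stand in for real price data; PCA is only one option among many
# (an autoencoder or a learned embedding layer would be alternatives).
import numpy as np
from sklearn.decomposition import PCA

n_days, n_features = 500, 4000              # 500 trading days, 4 features x 1000 stocks
raw = np.random.randn(n_days, n_features)   # placeholder for normalized OHLC data

pca = PCA(n_components=300)
condensed = pca.fit_transform(raw)          # shape: (500, 300)

print(condensed.shape)
print("variance explained:", pca.explained_variance_ratio_.sum())
```

Each 300-dimensional row could then serve as one timestep of input to the recurrent network described above.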
Knowing this, I have a general idea of the building blocks I will base my original trading algorithm on.

  1. Market2vec (Fundamental data set)
  2. NLP and investor sentiment (Partner data set, Word2Vec)
  3. Connecting trends, cycles (Partner data set, Word2Vec)

Part 1 is essentially what I just talked about above, which is using fundamental data sets (such as opening/closing prices for hundreds of stocks) to make certain predictions generated by Word2Vec embeddings. This is my current focus, and what I am working on right now.
For part 2, I have heard mixed opinions on the effectiveness of analyzing investor sentiment to predict changes in stock price. For example, one comment I received is that sentiment oftentimes comes after significant shifts in market prices, at which point the data is meaningless since it comes in too late. Also, the amount of data and statistical analysis required to reach accurate predictions based on investor sentiment is huge, and the calculations wouldn’t be fast enough in real-time to invest appropriately. However, I do think it’s an interesting topic to explore, and is worth trying. One thought I had is to focus on figuring out sentiment preceding a company’s earnings release, rather than aimlessly analyzing sentiment on an undefined spectrum (which would require more data and would be less meaningful). If the calculated sentiment in regards to an earnings release is overwhelmingly positive, it would make sense to buy stock of the company before the earnings are announced, and vice-versa.
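To keep myself honest about what this idea would actually require, here is a bare-bones sketch of the decision rule. The sentiment score and thresholds are invented; producing that score from Twitter/Stocktwits text is the hard part that the NLP model would have to handle:

```python
# Bare-bones sketch of the pre-earnings sentiment rule described above. The
# sentiment score is invented; in practice it would come from an NLP model run
# over Twitter/Stocktwits posts mentioning the company before its earnings date.
def earnings_signal(avg_sentiment: float,
                    buy_threshold: float = 0.6,
                    sell_threshold: float = -0.6) -> str:
    """Map an average pre-earnings sentiment score in [-1, 1] to an action."""
    if avg_sentiment >= buy_threshold:
        return "BUY before earnings"
    if avg_sentiment <= sell_threshold:
        return "SELL/short before earnings"
    return "NO TRADE (sentiment not strong enough)"

print(earnings_signal(0.72))    # overwhelmingly positive chatter
print(earnings_signal(-0.10))   # mixed chatter -> stay out
```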
Part 3 is a more ambitious goal of mine, which I will define in greater detail in future posts.
 
A final thing which I have been learning about is z-scores, which came up while studying a generic mean-reversion algorithm (a simple trading algorithm which takes the highest- and lowest-performing stocks from last week and predicts that in the upcoming week they will perform the opposite of how they did last week; so, low-performing stocks from last week should perform well this week, and vice versa). The idea of z-scores is to standardize changes in stock prices and make this data more meaningful. For example, if Amazon's stock goes up $5, it is inconsequential, as $5 is such a small percentage of Amazon's stock price. However, if a penny stock goes up $5 in a day, it would be the best day of someone's life, giving unprecedented returns. The same goes for volatility: z-scores have to take the regular volatility of a stock into account. If a stock whose price rarely shifts goes up a lot in a day, it means a lot more than the same increase in a stock whose price is prone to volatility.
To make data more uniform and easier to work with, we calculate z-scores, which take both volatility and percentages into account, by subtracting the mean from the raw score and then dividing by the standard deviation (z = (x − μ) / σ). This is an incredibly useful tool to know about, as it allows us to compare price changes across all kinds of assets.
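Here is what that calculation looks like in practice. The two return series below are invented, one for a quiet large-cap and one for a volatile penny stock, but the z-score step itself is exactly the formula above:

```python
# Sketch of the z-score normalization described above: z = (x - mean) / std.
# Daily moves are expressed relative to each stock's own typical volatility,
# so moves in a mega-cap and in a penny stock become directly comparable.
import pandas as pd

# Hypothetical daily returns for two very different stocks.
returns = pd.DataFrame({
    "AMZN":  [0.002, -0.001, 0.003, 0.001, 0.015],   # quiet large-cap
    "PENNY": [0.10,  -0.08,  0.12, -0.05,  0.15],    # volatile penny stock
})

z_scores = (returns - returns.mean()) / returns.std()
print(z_scores.round(2))
# Each day's move is now measured in units of that stock's own volatility,
# which is what lets a mean-reversion algorithm compare the two columns.
```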

http://www.lifestyletrading101.com/word2vec-deep-learning/

 
 
 
 

Word2Vec for improving Trading Algorithms: Initial Overview

(Figure: NeuralNet Map, a plot of word embeddings)
Word2Vec is a publicly available machine learning model which allows users to create shallow neural networks. Word2Vec gets its name (words -> vectors) from the fact that these neural networks can be trained on large data sets to eventually create a unique, weighted, multi-dimensional vector for each word in the data set. The weights of each word's vector are established by analyzing the contexts in which the word appears (given a window size) and adjusting the vector based on the vectors of the words it appears alongside.
In the image above, you can see a plot of the words found in a certain data set. (I ran the program for creating this plot, but the data is not mine, nor is it the data I want to use for my project; I used it solely as an initial example.) The neural network created on this data only used 10,000 training steps, which is not much at all. The idea with neural networks is that the more you train them, on more and more data, the more accurate and meaningful the results are. Despite this small number of steps, you can still see accurate word connections in the graph above. For example, take a look at the light-blue dot on the far left labelled 'island'. Right next to it (the nearest dot should be the word most closely associated) is the purple dot labelled 'sea'. Island is not a synonym of sea, but the two words are related by associated meaning.
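For reference, here is roughly what training such a model looks like using the gensim library (one common Word2Vec implementation; I am not claiming this is the exact code behind the plot above). The three-sentence corpus is just a stand-in for a real dataset of tokenized finance text:

```python
# Rough sketch of training a Word2Vec model with gensim. The tiny corpus below
# stands in for a real dataset of tokenized sentences from finance forums,
# Wikipedia, Investopedia, etc.
from gensim.models import Word2Vec

sentences = [
    ["the", "dotcom", "bubble", "burst", "in", "2000"],
    ["inflation", "erodes", "the", "value", "of", "cash"],
    ["the", "housing", "bubble", "led", "to", "the", "financial", "crisis"],
]

model = Word2Vec(
    sentences,
    vector_size=100,   # dimensionality of each word vector
    window=5,          # context window size mentioned above
    min_count=1,       # keep every word (the corpus is tiny)
    sg=1,              # skip-gram variant
    epochs=50,
)

vector = model.wv["bubble"]          # the learned 100-dimensional vector
print(vector.shape)                  # (100,)
print(model.wv.most_similar("bubble", topn=3))
```

On a corpus this small the similarities are meaningless, but the same call on millions of sentences is what produces plots like the one above.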
This brings me to my bigger point: Word2Vec can be used not only to find related words, but also to identify bigger connections. What this leads me to believe is that Word2Vec can be used in trading algorithms, by analyzing current events/trends and seeing if they map to previously identified ones. Since we know the outcome of previous events, we can use this knowledge to predict what might happen as the current event's outcome. Interestingly enough, this idea has been experimented with before, as I was able to find two papers looking at Word2Vec for trading algorithms (only two, because Word2Vec was only released to the public in 2013 and is therefore relatively new technology). One was written by Tal Perry and the other by Alexandr Honchar. Reading their research was insightful, and I would definitely want to interview at least one of them to learn more in depth about their approaches and experiences.
This revelation gives me a clear understanding of the course I will be taking this year, in my independent research work. Below is the general back-end plan I have for my work:

  1. Write Word2Vec search
  2. Improve search by adding more focused datasets. 
  3. Test my Word2Vec search for its capability for noticing trends
  4. Study trading algorithms
  5. Implement Word2Vec, Black Swan principle, hedging → algos
  6. See how non-Word2Vec trading algorithms compare to standard trading algorithms.

So far, I am well into part 1 of my plan, as I have successfully written a search algorithm which uses Word2Vec to find passages, words, and events relating to a given query. The user provides a query and the number of desired results, and my program returns the closest vectors to this query, ordered by relevancy. One interesting thing I have observed through my first round of testing is that oftentimes, results ranked lower in relevancy are strikingly useful, and are often far more relevant than results ranked above them. This could mean that I need to rework some of my code, or just test the neural net on a broader data set. My intuition is that the latter is true, so I am currently working on finding useful training datasets. So far, I have been looking at Wikipedia articles, Quora posts, Investopedia pages, etc.
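The ranking step itself is conceptually simple. Below is a simplified sketch of the idea (not my actual search code): embed the query by averaging its word vectors, embed each passage the same way, and sort passages by cosine similarity. The two-dimensional toy vectors exist only so the sketch runs on its own; real vectors would come from the trained Word2Vec model:

```python
# Simplified sketch of the ranking idea behind the search (not my actual code):
# embed a query by averaging its word vectors, embed each passage the same way,
# then return passages ordered by cosine similarity to the query.
import numpy as np

def embed(tokens, word_vectors):
    """Average the vectors of the tokens we have embeddings for."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return np.zeros(next(iter(word_vectors.values())).shape)
    return np.mean(vecs, axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def search(query_tokens, passages, word_vectors, top_n=5):
    """Rank tokenized passages by similarity to the tokenized query."""
    q = embed(query_tokens, word_vectors)
    scored = [(cosine(q, embed(p, word_vectors)), p) for p in passages]
    return sorted(scored, key=lambda x: x[0], reverse=True)[:top_n]

# Tiny fake embeddings so the sketch runs on its own; real ones would come from
# the trained Word2Vec model, and passages would be tokenized article snippets.
word_vectors = {
    "dotcom":    np.array([1.0, 0.2]),
    "bubble":    np.array([0.9, 0.3]),
    "inflation": np.array([0.1, 1.0]),
    "crash":     np.array([0.8, 0.4]),
}
passages = [["dotcom", "bubble", "crash"], ["inflation", "rises"]]
print(search(["dotcom", "bubble"], passages, word_vectors, top_n=2))
```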
 
 
I am happy with how things are going on the programming end of things, but I still haven’t secured any interviews, which I don’t want to delay any further. My list of potential interviewees has gotten up to about 17, with my goal being 25+.  If anyone knows of people working in original trading algorithms or Word2Vec AI who would be willing to talk for a few minutes (or even just answer some questions over email), please let me know in the comments below.
 

Finding Interviews + Rough CS Plan for the Year

 
As of now, the most pressing matter for my Capstone Project is to set up interviews for the year. The purpose of these interviews is to get insights into topics pertaining to trading algorithms and Word2Vec-type programming. I have organized my potential interviewees into three main themes, each theme being a different perspective on the work I plan on doing. Below is the list of the first group of people I have found:
1. People with close experience of actual trading, who can give a perspective into this field: ex-stock traders, financiers.
 

2. Upper-management-level investors: ex-retirement-fund managers, people who oversee portfolios, angel investors, investment bankers.
 
 

  • Jim Seidman, retired investment adviser, software executive.
  • Federico M Dominguez Garcia Diego, founded his own AI-based hedge fund in London, has been working with automated trading since 1998. Argues that hedge funds are better off without human traders, as AI is more precise, fast, efficient, and only improves over time. (https://www.linkedin.com/in/fededominguez?trk=author_mini-profile_title)
  • Laurent Bernut, https://www.quora.com/profile/Laurent-Bernut. Ex-hedge-fund analyst, current algorithmic trader, who also shares the belief that machines are much preferable to human traders.

3. People with a more technical perspective: quants, people who have written or have experience with trading algorithms, programmers, mathematicians.
 
 

  • Borislav Agapiev, specialist in Word2Vec, computer science, search engines.
  • Michael Halls-Moore, ex-quantitative researcher, now runs blog about research findings, advice, tutorials, etc…
  • Nikola Bozinovic, tech entrepreneur from Serbia who founded Frame, a cloud software company which was recently acquired. He was responsible for much of the coding and framework in creating his company, so this would be an ideal person to speak with.

 
Of course, this list is not exhaustive, and my goal is to get at least 20-25 people on this list (organized by theme) by the end of the week (9/30/18), and have all my outreach to them done by then as well.
I have been exploring potential questions for the interviews, as well. 

  1. Human traders vs. trading algorithms? Do you think human traders will have a place in the future, or are algorithms and technologies too far superior? 
  2. Mathematical prowess vs. technological advantages? Which is most important in trading, writing trading algorithms?
  3. How do you personally judge the value of assets, positions? What are the most important factors you look at when analyzing?
  4. What advice would you give to a younger person who wants to work in this field? Best/worst decisions? Possible setbacks to avoid? 

I will have to alter these questions in some way from person to person, and will add in some more specific questions based on who agrees to participate.
On the computer science end of things, I have continued work and research on my Word2Vec search engine. On a larger scale, I have written out a general plan of action for my independent computer science work this year, which goes as follows:

  1. Write Word2Vec search
  2. Improve data by adding more datasets
  3. Test capability for noticing trends
  4. Study trading algorithms
  5. Implement Word2Vec → algos
  6. See how non-Word2Vec trading algorithms compare to standard trading algorithms.

I will go into more detail on numbers 5 & 6 in my next post, but as can be seen at the top of my list, my biggest priority right now (aside from scheduling interviews) is creating a solid Word2Vec word embedding dataset. Once this is done and tested, I can move into the trading algorithm portion of my work, where I will bring everything (my interview findings, my stock research, my newfound knowledge about financial instruments, and discoveries from the books I am reading) together.


Researching trading through books (9/19/18)

A couple of days ago, I finished reading Michael Lewis's book Boomerang. I have always been a fan of Lewis's work, mainly because of how he packages complex topics (such as the subprime mortgage crisis of 2008 in his book The Big Short) into captivating, easy-to-follow stories. The focus of Boomerang is looking at various banking strategies across the world, comparing and contrasting these approaches, and analyzing the effects they have had on their respective countries. What I found especially interesting is how Lewis ties these banking strategies to the history of these countries, explaining how national banks' behavior when financing is easily attainable says a lot about the mindsets and cultures of the people who reside there. I was fascinated by Lewis's explanation of how German banks allowed themselves to be screwed over by the subprime bonds sold from Wall Street, with Lewis blaming the German tendency to play by the rules without considering potential corruption underlying those rules. Lewis writes: "At bottom, he says, the Germans were blind to the possibility that the Americans were playing the game by something other than the official rules. The Germans took the rules at their face value: they looked into the history of triple-A-rated bonds and accepted the official story that triple-A-rated bonds were completely risk-free" (163).
The reason why I include this quote is because, prior to reading Lewis’s book, I didn’t realize the subprime mortgage market was created to dupe overseas banks as well. In my previous research on this subject, I came to the conclusion that subprime mortgages were devised as a method of massively increasing the volume of mortgages in the U.S, knowing that these loans would never be repaid and that the common person would have to pay for the inevitable damages. This really altered my view on finances, and international banking in general.
Towards the denouement of the book, Lewis flips the script by drawing parallels between the shortcomings of foreign banking strategies and local finances in the United States. Lewis mentions the city of Vallejo, California, which went bankrupt in 2008, with 80% of its debts "…wrapped up in the pay and benefits of public safety workers" (201). The workers' unions for public safety workers, such as police officers and firefighters, had extorted the city government to receive the maximum possible benefits and salaries. When the city refused or was unable to meet these requests due to insolvency, the public safety workers simply left and moved to another city, where they repeated the process. Incredibly, the majority of cities in California, the most financially significant state in the U.S. (GDP of nearly $2.5 trillion), were in deep debt, mainly for this same reason (along with countless other problems). This made me think about the future direction California could head in: either the state changes its policies to prevent extortion by such workers' unions, or it continues indebting itself and inflating its GDP in the process.
The reason I mention all of this is because I have decided that reading finance-related books over my senior year would be a great way to extend my Capstone project, while also learning more about the field that interests me. One could say that reading various books, which may seem extraneous to my overarching goal of programming my own trading algorithm, is pointless. However, I think there is genuine value in this, as I can learn new perspectives, theories, and even investment opportunities through such books.
I have compiled a short list of books I want to read throughout the year:

  1. The Black Swan, by Nassim Nicholas Taleb
  2. The Intelligent Investor, by Benjamin Graham
  3. A Template for Understanding Big Debt Crises, by Ray Dalio
  4. Quantitative Trading, by Ernie Chan

Ray Dalio’s Debt-Driven Business Cycle (9/13/2018)

These past few days, I've continued research on various different themes. One super interesting recent article I found on CNBC mentions an interview with hedge-fund manager Ray Dalio (a potential interviewee, maybe…). Dalio, an outspoken believer in cyclic models of economic trends, mentions that we are currently in the "7th inning" of the economic cycle, suggesting that debt levels are nearing a ceiling and will eventually lead to the bursting of our current financial bubble. Another interesting finding I got from this was Dalio's general guide to the debt-driven business cycle:

  • The Early Part of the Cycle
  • The Bubble
  • The Top
  • The Depression
  • The Beautiful Deleveraging
  • Pushing on a String/Normalization

What Dalio is saying is that our bubble has grown past normal levels, and we are getting closer to the so-called “Top”, which can only last for so long.
A second interesting thing I stumbled upon this week was Renaissance Technologies, a self-proclaimed "quantitative investment management company trading in global financial markets, dedicated to producing exceptional returns for its investors by strictly adhering to mathematical and statistical methods." The reason I mention this company is because it is another hedge fund, but its trading algorithms rely solely on statistical and mathematical models. Renaissance is one of the highest-performing hedge funds in history, whose exclusive Medallion Fund has generated an astounding 72% average yearly return since it was established in 1993. Even more fascinating to me is that RenTech's Medallion Fund generated 98% profits in 2008, the year when stocks across the world famously plummeted and the U.S. entered the Great Recession. Furthermore, the fact that RenTech relies solely on mathematical models implemented through trading algorithms interests me, as this is what I want to do in my independent CompSci project in the second half of the year. I will definitely keep researching this, along with Ray Dalio's Bridgewater Associates (the largest hedge fund in the world), throughout the year.
In terms of specific goals for these upcoming days, I have decided that I need, more than anything, a concrete plan laid out for this semester (and the whole year). I have found in the past that I work much more effectively when I first create a plan to follow, so this is a habit I want to uphold this year too. These next days, I will try to make a plan and will run it by Dave for approval, and will move forward from there.
 

Starting work on trading algorithm research, Word2Vec (9/11/18)

To the dismay of many, school has officially begun, but I am in a surprisingly good mood right now. I have started work on my computer science independent project, whose focus has diverged a bit from my original plan. Rather than beginning work on writing my own trading algorithm, I have decided to extend an independent project I began over the summer, which involves using Word2Vec word embeddings to create a small-scale search engine. In layman's terms, Word2Vec is a machine learning model which creates vectors for all words, producing similar vectors for words with similar meanings. This is really interesting, as it shows computers can deduce the meanings of words based solely on the context of the data which you provide them. The data I am currently using is Q&A data from finance forums online, so the results I will be getting are obviously going to be tilted towards financial terms. What I mean by this is that the Word2Vec model I will be running won't be able to define words (or word pairs) such as "banana" or "Puerto Rico", but should be able to define terms such as "dotcom bubble" or "inflation".
In an ideal world, my queries would yield full passages, examples, or historical parallels, so that when the user searches "dotcom bubble", they could see a map of passages relating to this query, ranked by how closely they associate with it. One criticism I've received is that such a program already exists, and it's called Google. But Google does not use Word2Vec or vector mapping as its primary means of search. Google has its PageRank algorithm, which has proved tremendously successful, but which does have its share of flaws. What I'm trying to do with my small-scale search engine is see how my results compare with the search results given by Google. In addition to this, I will learn more about cause-and-effect relationships in finance along the way, serving as a great segue into my second independent project: writing my own trading algorithm.
In terms of stocks and finance, the SPDR ETF is up slightly (<+1%), continuing the upward trend we saw in the markets beginning mid-July 2018. At SPDR's peak on August 29, we saw a rise of +7.60% from July 3. My initial intuition was that this sudden, sharp rise at the end of the summer would be counteracted by a sharp decline in September, as has happened many times before (see: the September Effect). There has been some decline, as I expected, but not nearly as much as I anticipated, and definitely not enough to nullify the gains made from July to August. In my honest opinion, I expect to see the SPDR posting steady losses through September, especially with the impending arrival of winter and Category 4 Hurricane Florence, which is expected to make landfall sometime this week.
The reason I choose to follow the SPDR ETF as a starting point is because this fund is designed to track the S&P 500, perhaps the most important American stock index. The S&P 500 is composed of 500 of the largest publicly traded companies in the U.S., including the likes of Facebook, Amazon, Apple, and Google. The combined market cap of the S&P 500 is estimated at around $24 trillion.
 

The Arguments Behind America’s Legalized Robbery System

 
In recent times, it has begun to seem as if law officials in modern-day America enjoy a wide array of benefits and freedoms which allow them to act in a fashion which many consider to be nefarious and detrimental to the citizens of the nation. It seems as if a news report covering a case of unjustified police brutality emerges every week, and the topic has become so common with journalists that the New York Times has a section on its website dedicated solely to articles concerning injudicious acts committed by law officials. Some experts, including Cedric Alexander, the DeKalb County, Georgia, deputy COO of public safety and president of the National Organization of Black Law Enforcement Executives (2), argue that police brutality has not, in fact, become more prevalent over the years, but that the public believes so due to the virality of cell-phone videos released by eyewitnesses and similar videos covering media pages. However, there is still one controversial law which, in the past few years, has made it increasingly difficult for law enforcement officials to improve the way American citizens perceive them.
On April 25, 2000 (1), a bill titled the Civil Asset Forfeiture Reform Act was enacted. Despite its mundane name, civil forfeiture is actually an extremely intriguing and provocative practice which has caught the attention of citizens from around the nation. The bill allows police officers and other law officials to seize currency or other assets from civilians without having to accuse them of having committed a crime. In essence, this is a procedure which gives police the right to seize large sums of money from citizens who, in the majority of cases, are completely innocent and blameless. It is no wonder that civil forfeiture has been described as "…legalized robbery by Law Enforcement" by Ezekiel Edwards, the director of the ACLU Criminal Law Reform Project. Naturally, the practice has been controversial for the entirety of its existence, but it recently became the subject of widespread scrutiny after an episode of Last Week Tonight aired to several million viewers and went viral after being posted online.
As if matters weren't already bad enough, according to John Oliver, the majority of assets seized by police officers are kept by the police department that seized them. What this entails is that a highway patrol sergeant could pull over a vehicle, proceed to interrogate the driver with questions regarding currency in his possession, conclude by taking all of the driver's money and handing it over to his police station, and justify all of these actions by claiming that "there was significant reason to believe that the money could have been used to commit crimes in the future." In many situations, innocent, law-abiding citizens can have their entire lives ruined by such incidents. According to an article in the New Yorker (3), a couple with two young sons were on a road trip towards Linden, Texas, when they were abruptly pulled over by a local sheriff. After talking to them for a short while, the officer asked the parents if there were any drugs being transported in the vehicle, to which the two answered "no." The officer was not convinced by this answer, so he and his partner decided to search the car. Although their search was fruitless, the two sheriffs took the couple and their children to the local police station, where they were told that their money would have to be confiscated and handed over to the station because the couple "fit the profile of drug couriers," and that the children had been purposefully placed in the car to act as "decoys." The only way the couple could have gotten their money back would have been to go to trial, which would have cost them an additional, considerable amount of money and time, which the two did not have.
On top of this mountain of secretive, seemingly nefarious police activity lies one last component which makes the entire matter appear even more unseemly. According to several different sources, including an article on Business Insider (5) and one in the Washington Post (6), police agencies are free to use the majority of the money they seize during civil forfeiture raids. Although they are supposed to use this money wisely to purchase necessary items which cannot be acquired within the budget provided, insufficient regulation has led to many questionable purchases made with seized assets. For example, officers in the Amarillo, Texas, police force used $637 to obtain a coffee machine for their department offices. In another case, a police department spent an astonishing $5,300 to purchase challenge coin medallions for those working there. There have been other instances where more reasonable and justifiable purchases were made, such as the acquisition of a $227,000 armored personnel carrier.
Now, every story has two sides, and although I may have made civil forfeiture seem like a completely unnecessary and awful practice which only serves to harm, it turns out that asset forfeiture was put into place with many good intentions in mind. Additionally, those who are in favor of keeping the bill have used powerful arguments to support their claims. To provide an insight into the rationale of pro-civil-forfeiture advocates, here are a few of the many arguments they give that show the positive aspects of legalized asset forfeiture. Stefan D. Cassella, a noted asset forfeiture and money laundering expert (7), states that civil forfeiture makes it easier for government and police officials to seize unlawful items from criminals, claiming, "With forfeiture laws, we can separate the criminal from his profits." This is a reasonable argument, as it is true that criminals should not be allowed to keep possession of items that they have used to commit heinous acts. Mr. Cassella continues by declaring that "forfeiture undeniably provides both a deterrent against crime and as a measure of punishment for the criminal." He justifies this claim by arguing that "Many criminals fear the loss of their vacation homes, fancy cars, businesses and bloated bank accounts far more than the prospect of a jail sentence." In most cases, this is in fact true. A large majority of criminals perform illegal acts to obtain large sums of money easily, which they then use to purchase expensive items. If criminals knew that there was a large chance that their goods would be taken from them, they would likely be discouraged from committing crimes in the future.
Another instance showcasing the thought process of a civilian strongly in favor of keeping civil forfeiture laws is a page on the popcenter.org ("problem-oriented policing") website (8). The author, whose name is not stated on the website, makes arguments similar to those of Mr. Stefan Cassella. He or she begins by stating that asset forfeiture plays a major role in crime reduction by deterring criminals who plan on breaking the law sometime in their lives. According to the writer, asset forfeiture is advantageous "Because incarceration (or the threat of such) does not deter all offenders. Forfeiture is intended to pick up where traditional punishments leave off." The author continues by claiming that the legalization of civil forfeiture has led to an increase in drug-related arrests, which is beneficial to everybody because an increase in drug arrests should result in a decrease in drug-related crimes. To conclude the argument, the author declares that a third, often overlooked advantage of asset forfeiture is its positive impact on police budgets. The author phrases this by saying that "The obvious advantage of asset forfeiture is its potential to boost an agency's bottom line." It is true that in many cases, law officials and police officers are denied access to valuable equipment due to a lack of funds, and money obtained through asset forfeiture can be extremely helpful in solving this problem.
After weighing both the benefits and consequences of keeping civil forfeiture laws in America, it becomes clear why so many people are in favor of the practice and why so many others are vehemently opposed to it. From my 15-year-old perspective, it appears that the Civil Asset Forfeiture Reform Act was created and enacted in the hope of making the lives of all American citizens safer, with the dream of creating more sophisticated law enforcement tactics which involve stopping crimes before they happen rather than waiting until it is too late. Nearly 16 years after the bill was written, the goal of civil forfeiture is still very much relevant, and the practice has an opportunity to become one which can truly benefit our nation. However, the entire purpose of civil forfeiture is being undermined by two major issues: an overabundance of power accompanied by an insufficient amount of supervision. If we are going to give police officers around the country the ability to take money from potentially innocent people before accusing them of having committed a crime, then we must have strict regulations enforced by supervisors who oversee the actions taken by these officers. An example of this would be having someone in charge of verifying all of the purchases made by police officers who are using money that was seized during an asset forfeiture case. This would prevent officers from buying ridiculous and unnecessary items using the life savings of a potentially innocent citizen. Another example of what could be done would be to have a specially trained agent in every police department whose job is to approve every single asset forfeiture case conducted by an officer working in their department. This would help ensure that every time money was being seized from someone, there would have to be a good reason to do so.
In the majority of instances, people who are suddenly given large amounts of power and control start to make problems out of nothing just so they can exercise their strength, which eventually leads to them causing trouble for innocent citizens. This is why civil forfeiture has become such a controversial practice: police officers put in these kinds of situations without proper oversight are bound to make mistakes which could end up devastating the lives of those caught on the wrong end of things. As the wise man Abraham Maslow once said: "When you have a hammer, everything starts to look like a nail."

Sources Cited:
1.“H.R. 1658 (106th): Civil Asset Forfeiture Reform Act of 2000.”Www.govtrack.us. Civic Impulse, LLC, 2004. Web. 10 Dec. 2015. <https://www.govtrack.us/congress/bills/106/hr1658>.
 
2.McLaughlin, Eliott. “We’re Not Seeing More Police Shootings, Just More News Coverage.” Www.cnn.com. Cable News Network. Turner Broadcasting System, Inc., 21 Apr. 2015. Web. 11 Dec. 2015. <http://www.cnn.com/2015/04/20/us/police-brutality-video-social-media-attitudes/>.
 
3.Stillman, Sarah. “Taken – The New Yorker.” The New Yorker. 12 Aug. 2013. Web. 11 Dec. 2015. <http://www.newyorker.com/magazine/2013/08/12/taken>.
 
4.Campbell, Chris. “The American Nightmare That Is Civil Asset Forfeiture.” Laissez Faire The American Nightmare That Is Civil Asset Forfeiture Comments. 20 June 2012. Web. 11 Dec. 2015. <http://lfb.org/the-american-nightmare-that-is-civil-asset-forfeiture/>.
 
5.Fuchs, Erin. “Here Are The Ridiculous Things Cops Bought With Cash ‘Seized’ From Americans.” Business Insider. Business Insider, Inc, 14 Oct. 2014. Web. 11 Dec. 2015. <http://www.businessinsider.com/heres-what-police-bought-with-civil-forfeiture-2014-10>.
 
6.O'Harrow, Robert, Jr. "Asset Seizures Fuel Police Spending." Washington Post. The Washington Post, 11 Oct. 2014. Web. 11 Dec. 2015. <http://www.washingtonpost.com/sf/investigative/2014/10/11/asset-seizures-fuel-police-spending/>.
 
7.Cassella, Stefan. “Forfeiture Is Reasonable, and It Works.” : Publications : The Federalist Society. The Federalist Society, 1 May 1997. Web. 11 Dec. 2015. <http://www.fed-soc.org/publications/detail/forfeiture-is-reasonable-and-it-works>.
 
8.“Benefits of Forfeiture.” Center for Problem-Oriented Policing. Center for Problem-Oriented Policing, 2015. Web. 11 Dec. 2015. <http://www.popcenter.org/responses/asset_forfeiture/5>.