Categorizing my Work
This past week, I delved deeper into my research on word embeddings and trading algorithms. The research taught me many new things (some negative, some positive, though I count every lesson learned as a positive), which I will describe in more detail later on. First, I want to share how I plan to structure my work, since I now have (what I believe to be) enough background information to get meaningful work done on the coding front.
As mentioned in previous posts, there are two primary components to my research work:
- Word2Vec for creating embeddings
- Trading Algorithm strategies
I will be working on these two components in parallel throughout the year. This is preferable to working on a single topic for large blocks of time, as switching between Word2Vec and trading algorithms helps me spot useful connections between them. For instance, last week I read a super interesting article on zerohedge.com which pointed out a recurring trend during China’s Golden Week holiday.
Interesting Applications for Neural Networks
Every year, China celebrates Golden Week, a week-long national holiday in early October built around National Day (October 1). During this time, the Chinese stock exchanges close, as people travel home for the festivities. Interestingly enough, writers at Zerohedge noticed that every year, during this week, prices of precious metals fall considerably, only to rise sharply as soon as the Chinese markets reopen.
Here is a visualization, with all pictures taken from here:
To test this theory, I looked at recent gold prices. China’s Golden Week of 2018 began on October 2 and ran through October 9.
Not surprisingly, gold lost about $22 per ounce during this week, but has jumped about $40 per ounce since then.
I mention this example because it would be a perfect place to apply neural networks. This is a clearly recurring pattern which occurs during the same period each year (China’s Golden Week falls on the same dates every year, but what about holidays and events whose dates change from year to year?), and a recurrent neural network should be able to pick it up on its own. I’m thinking of using this data as an entry point for incorporating AI into my trading algorithm, as it is a clearly defined task: train the network on this data (plus other, unrelated gold-price data to make the set more varied), and ask it to predict what will happen to gold prices starting October 1, 2019.
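One common way to feed a price history to a recurrent network is to cut it into fixed-length windows, each paired with the value that follows it. The sketch below is only an illustration of that idea: the function name make_windows, the choice of daily returns as the input, and the toy prices are all assumptions, not real gold data or a settled design.

```python
from typing import List, Tuple


def make_windows(prices: List[float], lookback: int) -> Tuple[List[List[float]], List[float]]:
    """Turn a price series into (input window, next-day return) training pairs."""
    # Use daily returns instead of raw prices so the scale is comparable across years.
    returns = [(prices[i] - prices[i - 1]) / prices[i - 1] for i in range(1, len(prices))]
    inputs, targets = [], []
    for end in range(lookback, len(returns)):
        inputs.append(returns[end - lookback:end])  # the previous `lookback` returns
        targets.append(returns[end])                # the return the network should predict
    return inputs, targets


# Made-up prices, purely for illustration (not real gold quotes).
toy_prices = [100.0, 101.0, 99.0, 98.0, 100.0, 102.0]
X, y = make_windows(toy_prices, lookback=3)
```

Each row of X would be one input sequence for the RNN and the matching entry of y its training target. A real experiment would need a much longer history covering several Golden Weeks.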
Questions to Consider
This may sound simple, but there are a lot of factors which come into play. For instance, how would we incorporate times (dates) into the neural network? What’s the purpose of having a neural network recognize a financial pattern without knowing when it will appear? What exact data would we use for training the network? Gold prices and dates are necessary, but how would we convert these into inputs for an RNN? Would it make sense to use Word2Vec for this?
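One possible answer to the date question is to turn each calendar date into a few numbers: a sine/cosine pair that encodes the position in the year, a flag for whether the date falls inside the holiday, and a countdown to the holiday start. The countdown also covers holidays whose dates move from year to year, since only the start date passed in changes. This is a sketch under assumptions: date_features is a hypothetical helper, and the window used in the example is the 2018 range quoted above.

```python
import calendar
import math
from datetime import date
from typing import List


def date_features(day: date, holiday_start: date, holiday_end: date) -> List[float]:
    """Encode a date as [sin, cos, in_holiday, days_until_holiday]."""
    days_in_year = 366 if calendar.isleap(day.year) else 365
    # Map the day of the year onto a circle so Dec 31 and Jan 1 end up close together.
    angle = 2 * math.pi * (day.timetuple().tm_yday - 1) / days_in_year
    in_holiday = 1.0 if holiday_start <= day <= holiday_end else 0.0
    # Positive before the holiday starts, zero on the first day, negative afterwards.
    days_until = float((holiday_start - day).days)
    return [math.sin(angle), math.cos(angle), in_holiday, days_until]


# Holiday window as quoted in the post for 2018.
start, end = date(2018, 10, 2), date(2018, 10, 9)
during = date_features(date(2018, 10, 5), start, end)
before = date_features(date(2018, 9, 25), start, end)
```

These features could be appended to each day's price input, so the network sees both what the price did and where in the calendar it happened.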
These are all questions that I need to start experimenting with and trying to answer on my own. I’ve tried researching the answers online, but there is little to no consensus, as people genuinely don’t know what the best approach is (and those who DO know definitely won’t share it with the rest of the world). As I’ve come to realize through my research on trading algorithms in general, there are countless approaches you can take, and just as many trading strategies.
Some other interesting trading strategies that I’ve researched are:
1. Mean reversion
A rather basic investing strategy where we assume that the worst-performing stocks one week will perform best the next week, and vice versa.
I am currently writing a sample algorithm for this strategy using quantopian.com’s Python IDE.
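The core of this strategy can be sketched in plain Python, without relying on any particular platform's API: rank stocks by last week's return, go long the worst performers, and short the best. The function name weekly_reversal_signals, the tickers, and the returns below are all made up for illustration.

```python
from typing import Dict


def weekly_reversal_signals(last_week_returns: Dict[str, float], n: int) -> Dict[str, int]:
    """Return +1 (buy) for the n worst performers, -1 (short) for the n best, 0 otherwise."""
    if 2 * n > len(last_week_returns):
        raise ValueError("n is too large for the number of stocks")
    # Sort tickers from worst to best return over the past week.
    ranked = sorted(last_week_returns, key=last_week_returns.get)
    signals = {ticker: 0 for ticker in ranked}
    for ticker in ranked[:n]:
        signals[ticker] = 1    # last week's losers: expect a rebound
    for ticker in ranked[-n:]:
        signals[ticker] = -1   # last week's winners: expect a pullback
    return signals


# Hypothetical tickers and returns, not real market data.
toy_returns = {"AAA": -0.05, "BBB": 0.02, "CCC": 0.07, "DDD": -0.01}
signals = weekly_reversal_signals(toy_returns, n=1)
```

With n=1 this buys only the single worst performer and shorts the single best. A real version would also need position sizing and a weekly rebalancing schedule.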
2. Sentiment analysis
The theory that one can use investor sentiment (calculated by applying NLP neural networks to Twitter/Stocktwits data) to make accurate predictions about which stocks will go up or down (positive sentiment -> buy, negative sentiment -> sell).
The Word2Vec search I’m currently writing will hopefully be able to do some basic sentiment analysis.
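A very basic form of embedding-based sentiment is to average the word vectors of a message and check whether the result sits closer to a "positive" seed word or a "negative" one. The sketch below assumes that approach. The two-dimensional vectors are invented stand-ins for trained Word2Vec embeddings, the seed words are arbitrary, and the HOLD outcome for neutral or unknown messages is an added assumption.

```python
import math
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def average_vector(words: List[str], embeddings: Dict[str, List[float]]) -> Optional[List[float]]:
    """Average the vectors of the known words; None if no word is known."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(vectors[0]))]


def sentiment_signal(message: str, embeddings: Dict[str, List[float]],
                     positive_seed: str = "good", negative_seed: str = "bad") -> str:
    """Map a message to BUY, SELL, or HOLD based on which seed word it is closer to."""
    avg = average_vector(message.lower().split(), embeddings)
    if avg is None:
        return "HOLD"
    score = cosine(avg, embeddings[positive_seed]) - cosine(avg, embeddings[negative_seed])
    if score > 0:
        return "BUY"
    if score < 0:
        return "SELL"
    return "HOLD"


# Invented 2-D vectors standing in for trained Word2Vec embeddings.
toy_embeddings = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "rally": [0.9, 0.1],
    "crash": [-0.8, 0.2],
    "gold": [0.0, 1.0],
}
```

For example, sentiment_signal("gold rally", toy_embeddings) leans toward the positive seed, while "gold crash" leans toward the negative one. Real tweets would need proper tokenization and embeddings trained on financial text.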
3. Selling/buying based on large-cap hedge fund involvement
Apparently, a super effective strategy in recent years has been to buy or sell stocks based on how underweight or overweight they are. An underweight stock is one in which large hedge funds hold a large short position relative to the stock’s ‘weight’ (influence) in the S&P 500. An overweight stock is just the opposite. It appears that buying underweight stocks while shorting overweight stocks is a strong investing strategy, one which has consistently produced positive returns since the 2008 crash. One doesn’t really need a neural network to do this, but I thought it was worth noting.
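Following the definition above, one rough way to rank stocks is by the ratio of hedge fund short positioning to index weight: the highest ratios count as underweight (buy), the lowest as overweight (short). The function name positioning_signals and all numbers below are hypothetical, and how short positioning is actually measured is left open here.

```python
from typing import Dict


def positioning_signals(short_position: Dict[str, float],
                        index_weight: Dict[str, float], n: int) -> Dict[str, int]:
    """Return +1 for the n most underweight stocks, -1 for the n most overweight, 0 otherwise."""
    # Short positioning relative to index weight; skip stocks with no usable weight.
    ratio = {t: short_position[t] / index_weight[t]
             for t in short_position if index_weight.get(t, 0) > 0}
    if 2 * n > len(ratio):
        raise ValueError("n is too large for the number of stocks")
    # Highest ratio first: heavily shorted relative to weight means underweight.
    ranked = sorted(ratio, key=ratio.get, reverse=True)
    signals = {ticker: 0 for ticker in ranked}
    for ticker in ranked[:n]:
        signals[ticker] = 1    # underweight: buy
    for ticker in ranked[-n:]:
        signals[ticker] = -1   # overweight: short
    return signals


# Hypothetical positions and weights, expressed as fractions.
toy_short = {"AAA": 0.06, "BBB": 0.01, "CCC": 0.02}
toy_weight = {"AAA": 0.02, "BBB": 0.02, "CCC": 0.02}
positions = positioning_signals(toy_short, toy_weight, n=1)
```

As noted above, this is a simple ranking, so no neural network is involved.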
There are many more investing strategies to consider, though. The main question is: which ones would be most applicable to the Word2Vec approach I want to implement?
Reflecting back on the week
I think my strategy of incorporating Word2Vec-created embeddings into a trading algorithm has potential, but only I can figure out whether this is true. This type of ambiguous work is hard for me to grasp right now, as none of my questions have definitive answers; one must figure them out for oneself (an every-man-for-himself type of situation). But I don’t want to get bogged down in the research, so I will try to choose one definite strategy (fundamental dataset) this upcoming week so I can get to tangible work.
There is always more learning to be done, and I will continue to research online, through books, and through interviews. I’m about halfway through The Black Swan by Nassim Nicholas Taleb, and I will write a reflection post once I’m done, as there are some super interesting parallels between my current work and what Mr. Taleb writes about.