Today's article deals with historical data. It is an extensive topic that concerns not only stock pair traders but all traders in general, whether they trade stocks, futures, options, or forex. Poor data can have a significant impact on back-test results and ultimately lead to unpleasant losses.
Sources of historical stock data
Historical stock data is provided by a number of websites, brokers, and specialised companies. Free sources include Google Finance, Yahoo Finance, MSN, and the websites of the stock exchanges themselves. Paid sources include EODdata.com, IQFeed, eSignal, and TradeStation. Historical data is usually also provided by the broker who maintains your trading account.
Specifics of historical stock data
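Historical stock data is special in that it may be (and usually has been) affected by splits and dividends. A split changes the number of shares and the price per share in inverse proportion: after a 2:1 split, each share becomes two shares at half the price. A dividend is a payment that a company distributes to its shareholders, typically in cash.
Dividend payment and split dates can be found in public sources, such as www.google.com/finance. Splits and dividends are usually marked in the price chart.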
Impact of dividend on price
Dividend payments influence stock prices and price data in two ways:
- The price of the stock tends to rise before the record date (the date on which a shareholder must be on record to be entitled to the dividend payment), because investors want to hold a long position in order to receive the dividend. Traders in a short position, on the other hand, try to close it so that they do not have to pay the dividend. The result is a short-term excess of demand over supply. For us, as stock pair traders, that is an excellent opportunity to enter a position.
- Some data providers adjust historical stock prices to reflect income from the dividends paid. In that case, the dividend amount is distributed linearly over the period between two dividend payments. This price is labelled “Adjusted” and does not correspond to the actual price at which the stock traded in the past. It is a “virtual” price that is suitable for back-testing ‘BuyAndHold’ type strategies.
Data for backtests
For back-testing any trading strategy, we need data that corresponds to reality as closely as possible, that is, data that best simulates subsequent live trading. For us as stock pair traders, both of the phenomena described above are important.
- A price split appears in the calculation model as a gap: an extreme jump that generates an extreme RelStDev value and hence a false entry signal. The split renders the calculation model useless for Period days. After Period days, the split “disappears” from the processed data (from the sliding averages) and the back-test can continue.
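The effect on a sliding window can be illustrated with a minimal sketch. It assumes that RelStDev is a relative standard deviation (standard deviation divided by mean) of the pair price ratio over a window of Period values; the exact definition used by the calculation model is not given here, so the function name, the sample ratio series, and the threshold are all hypothetical.

```python
from statistics import mean, pstdev

def rolling_rel_stdev(values, period):
    """Relative standard deviation (stdev / mean) over a sliding window.

    Assumed definition, for illustration only.
    """
    out = []
    for end in range(period, len(values) + 1):
        window = values[end - period:end]
        out.append(pstdev(window) / mean(window))
    return out

PERIOD = 10

# Hypothetical pair ratio: flat at 2.0, then one stock splits 2:1 on day 30,
# so the unadjusted ratio halves overnight.
ratio = [2.0] * 30 + [1.0] * 30

rsd = rolling_rel_stdev(ratio, PERIOD)

# Windows that mix pre-split and post-split prices show an inflated value.
# The 0.05 threshold is an arbitrary illustrative cut-off.
affected = [start for start, value in enumerate(rsd) if value > 0.05]

print("first affected window starts at day", affected[0])
print("number of affected windows:", len(affected))
```

Under this definition, every window that contains both pre-split and post-split prices is distorted, which is Period - 1 consecutive windows; once the window has moved entirely past the split, the values return to normal.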
StockPairBuilder features a built-in mechanism that watches for splits in the data and automatically filters out transactions influenced by them. That means that even uncleansed data containing splits can be used for back-testing, as the Builder knows how to deal with them.
Still, it is better to work with split-adjusted data, because correlation calculations are significantly distorted by any split and rendered useless!
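How StockPairBuilder detects splits internally is not described here. As a rough sketch of the general idea only, one could flag days on which the overnight price ratio is close to a common split ratio and then discard trades whose calculation window or holding period touches such a day. The function names, the list of ratios, and the tolerance below are assumptions made for illustration.

```python
# Price-before / price-after for a few common splits (hypothetical list).
# 0.5 stands for a 1:2 reverse split, where the price doubles.
COMMON_RATIOS = (2.0, 3.0, 1.5, 4.0, 0.5)

def suspected_splits(closes, tolerance=0.03):
    """Return indices where the overnight price ratio is near a common split ratio."""
    hits = []
    for i in range(1, len(closes)):
        r = closes[i - 1] / closes[i]
        if any(abs(r - c) / c <= tolerance for c in COMMON_RATIOS):
            hits.append(i)
    return hits

def trade_is_tainted(entry_idx, exit_idx, split_indices, period):
    """Conservative check: a split inside the entry window or the holding period."""
    return any(entry_idx - period < s <= exit_idx for s in split_indices)

# Hypothetical closing prices with an unadjusted 2:1 split on day 3.
closes = [100.0, 101.0, 99.0, 50.0, 51.0, 50.5]
splits = suspected_splits(closes)

print(splits)
print(trade_is_tainted(5, 8, splits, period=10))    # entry window covers day 3
print(trade_is_tainted(20, 25, splits, period=10))  # split already out of the window
```

A heuristic like this cannot tell a real split from a genuine price collapse of similar size, so in practice flagged days would still need to be checked against published split dates.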
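To see why a split ruins the correlation, consider a small sketch with two hypothetical stocks that move together. Stock A has an unadjusted 2:1 split in the middle of the series. Back-adjusting here simply divides all prices before the split by the split ratio; the helper names and the sample prices are assumptions for illustration.

```python
def back_adjust_for_split(closes, split_idx, ratio):
    """Divide all prices before split_idx by ratio (2.0 for a 2:1 split)."""
    return [p / ratio if i < split_idx else p for i, p in enumerate(closes)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Stock B rises steadily; stock A tracks it but splits 2:1 on day 5.
stock_b = [50.0, 51.0, 52.0, 53.0, 54.0, 55.0, 56.0, 57.0, 58.0, 59.0]
stock_a_raw = [100.0, 102.0, 104.0, 106.0, 108.0, 55.0, 56.0, 57.0, 58.0, 59.0]

stock_a_adj = back_adjust_for_split(stock_a_raw, split_idx=5, ratio=2.0)

print("correlation with raw data:     ", round(pearson(stock_a_raw, stock_b), 3))
print("correlation with adjusted data:", round(pearson(stock_a_adj, stock_b), 3))
```

With the raw series the two stocks appear negatively correlated, although they move in lockstep once the split is accounted for.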
Types of data from providers
Different historical data providers take a different approach to the phenomenon of splits and dividends. In practice, all combinations of data treatment can be seen:
- Adjusted splits, adjusted dividends
- Adjusted splits, unadjusted dividends
- Unadjusted splits, adjusted dividends
- Unadjusted splits, unadjusted dividends
Trustworthy data providers publish on their websites a description of the method they use to cleanse and adjust their data. Unfortunately, they are not always 100% thorough, and sometimes not all splits and dividends are cleansed. It can happen that one split is adjusted whereas another is not. Surprisingly, these problems occur even in data from paid services. Generally speaking, frequently traded stocks with a normal or higher price and a high volume have better cleansed and more reliable data than “penny stocks”, that is, inexpensive stocks with a low volume.
The StockPairTrading package includes the StockPairTrading_DataEditor program. It is a tool that allows for fast and clear comparisons of up to three independent sources of data. With it, you can create your own, tested database of historical data. We will pay more attention to the program in one of our future articles.
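The idea of cross-checking several sources can be sketched in a few lines. This is not how StockPairTrading_DataEditor works internally; it is only a minimal illustration that assumes each source is available as a mapping from ISO date strings to closing prices. The source names, sample prices, and tolerance are hypothetical.

```python
def find_disagreements(sources, tolerance=0.005):
    """Return (date, closes) pairs where sources differ by more than tolerance.

    sources: dict mapping a source name to a dict of {iso_date: close}.
    Only dates present in every source are compared.
    """
    common_dates = set.intersection(*(set(series) for series in sources.values()))
    flagged = []
    for date in sorted(common_dates):
        closes = {name: series[date] for name, series in sources.items()}
        low, high = min(closes.values()), max(closes.values())
        if (high - low) / low > tolerance:
            flagged.append((date, closes))
    return flagged

# Three hypothetical sources; source_c has not adjusted the first day for a 2:1 split.
sources = {
    "source_a": {"2012-01-03": 50.0, "2012-01-04": 51.0, "2012-01-05": 52.0},
    "source_b": {"2012-01-03": 50.0, "2012-01-04": 51.0, "2012-01-05": 52.0},
    "source_c": {"2012-01-03": 100.0, "2012-01-04": 51.0, "2012-01-05": 52.01},
}

for date, closes in find_disagreements(sources):
    print(date, closes)
```

With three sources, a simple majority usually shows which one is the outlier; with only two, a flagged date merely tells you that one of them needs to be checked by hand.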
In today’s article, we discussed in detail the issue of historical stock prices. It is a very (!) important aspect of back-testing. Due attention must be paid to source selection and to subsequent validation of the data. A back-test carried out with poor-quality data is of no value and can even be misleading. A stock whose data is significantly influenced by splits or is incomplete is not trustworthy, and it is better to eliminate it from the database altogether, thereby preventing distortion of the back-test results.
Petr Tmej and Petr Slepička