Saturday, 31 March 2012

Back-Testing: Advice and Caveats for the Beginner

In this article, I look at the dos and don'ts of back-testing. Now that we have our 'scraped' price data, we are in a position to develop trading strategies to apply to it. As previously mentioned, I tend to test new strategies initially in Metastock and then take the formulae/indicators from there and build them into my back-testing application as new methods.

There are a number of basic caveats to back-testing that you need to be aware of, some of which I have come a real cropper on:
  • First, and potentially most dangerous, is allowing your system to 'see' future data - by this I mean you must not let your 'tests' access any data dated later than the 'current' date of the simulation. This can be really subtle, far from obvious and difficult to debug. The best way to get around this problem is to be really disciplined in your coding and to isolate data based on age. I do this by creating 'subsets' of data with Linq, bounded by the earliest date in the back-testing model and the 'current' date. As the 'subset' is a Linq table object that contains all the price data I need (High, Low, Opening, Close and Volume), all I subsequently need to do is separate out the bits I need, such as a date array and a closing price array.
  • Second, and a potential danger to those of us working with 'free' data, is Survivor Bias. This is where stocks have dropped out of the indices over the years for performance reasons, so your initial dataset is already biased in favour of 'survivors'. There is no easy way round this unless you want to fork out for a 'clean' dataset - I believe that if your testing is thorough enough and your sample sizes are large enough, then this will not necessarily be a problem.
  • Third, and most important, avoid 'curve fitting'. By this I mean that if you add countless parameters to your model, you should not be surprised if you get excellent returns in back-testing. The art of model development is definitely 'less is more' - you should aim to reduce your parameters to an absolute minimum, so that your model will perform in the widest range and variety of markets. The sign of a good model is how few and how simple its parameters are. You should aim to continually test and reduce your parameters until you see no observable change in your results. This is a hard point to make, but crucial, especially for us amateurs. I suggest you read Ernie Chan's opinions on simplifying trading models, either in his blog or in his book.
  • Fourth, compounding. I made this mistake for a while: my back-testing model would use the returns of previous trades to fund future ones. This looks great and does help you to see the effects of compounding; however, it does not help in testing or verifying the success or otherwise of your model. You need to strip out such effects from your initial model testing so that you are testing only the soundness or otherwise of your parameters rather than the vagaries of market timing.
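The date-bounded 'subset' idea from the first caveat can be sketched roughly as follows. This is an illustration in Python rather than my actual Linq code, and the names (`Bar`, `visible_bars`, `split_columns`) and the sample prices are made up for the example. It also assumes the close on the 'current' date is already known when the test runs; if your strategy trades before the close, the window should stop a day earlier.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Bar:
    """One day of price data (hypothetical structure for this sketch)."""
    day: date
    open: float
    high: float
    low: float
    close: float
    volume: int


def visible_bars(bars, start, current):
    """Return only the bars dated start..current inclusive, oldest first.

    Anything dated after `current` is dropped, so later code cannot
    accidentally read future data.
    """
    window = [b for b in bars if start <= b.day <= current]
    return sorted(window, key=lambda b: b.day)


def split_columns(window):
    """Separate out the date array and the closing price array."""
    dates = [b.day for b in window]
    closes = [b.close for b in window]
    return dates, closes


# Made-up sample data, deliberately stored newest first.
bars = [
    Bar(date(2012, 3, 30), 10.6, 11.0, 10.5, 10.9, 1500),
    Bar(date(2012, 3, 29), 10.1, 10.7, 10.0, 10.6, 1300),
    Bar(date(2012, 3, 28), 10.4, 10.5, 10.0, 10.1, 1200),
    Bar(date(2012, 3, 27), 10.2, 10.6, 10.1, 10.4, 1100),
    Bar(date(2012, 3, 26), 10.0, 10.5, 9.8, 10.2, 1000),
]

# Pretend 'today' in the simulation is 28 March 2012.
window = visible_bars(bars, start=date(2012, 3, 26), current=date(2012, 3, 28))
dates, closes = split_columns(window)
```

Every indicator is then calculated from `dates` and `closes` only, never from the full `bars` list, which keeps the discipline in one place.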
Finally, make sure to look out for the obvious. If your model starts to perform really well in back-testing, assume you have something wrong. Take your model apart and test it bit by bit. In the early days of my model building I realised I was feeding historical data into the model in reverse! Compartmentalise your development and test each piece in isolation - strip out all of your parameters and then add them back in one by one to see what impact, if any, each has on your model.
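A cheap guard against the reversed-data mistake is to check the date order before the model sees anything. The following is a minimal sketch in Python, not my own code; the function name `assert_oldest_first` is made up for the example, and it assumes one bar per date with the oldest bar first.

```python
from datetime import date


def assert_oldest_first(dates):
    """Raise ValueError unless the dates are strictly increasing."""
    for earlier, later in zip(dates, dates[1:]):
        if later <= earlier:
            raise ValueError(f"dates out of order: {earlier} then {later}")


good_feed = [date(2012, 3, 28), date(2012, 3, 29), date(2012, 3, 30)]
assert_oldest_first(good_feed)  # passes silently

# The same data fed in reverse should be rejected.
reversed_feed_detected = False
try:
    assert_oldest_first(list(reversed(good_feed)))
except ValueError:
    reversed_feed_detected = True
```

Run at the top of each test, a check like this turns a subtle ordering bug into an immediate, obvious failure.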

Oh, and make sure you back up and carefully document each stage of the model building. I use source control to help me do this - it is invaluable for going back and looking at what you have or haven't done in the past.


  1. Very interesting blog I must say! I'm currently writing a platform for systematic trading in C# and was wondering if you ever tried to implement a portfolio handler (to track trades and performance)? If so, how would you go about it?