Testing the Randomized OOS (Part 4)
Posted by Mark on February 7, 2019 at 07:33 | Last modified: June 15, 2020 10:50
I ran into a snafu last time in trying to think through validation of Randomized OOS. Today, let’s try to get back to basics.
The argument for Randomized OOS seems strong as a test of OOS robustness for different market environments. By analyzing where the original backtest fits within the simulated distribution (DV #1), I should be able to get a sense of how fair the OOS period is and whether it contributes positively or negatively to OOS results. Also, if all simulations are above zero (DV #2) then I feel more confident this strategy is likely to be profitable during the time period studied.
At the same time, Randomized OOS is a reflection of IS results. The better the IS performance, the greater the chance of better scores on DV #1 and DV #2. I could look at the total equity curve and separately evaluate IS vs. OOS, but I think the stress test may portray this more clearly.
For a strategy to be viable, I want the actual backtested OOS equity to fall in the lower 2/3 of the simulated distribution, along with a Yes on DV #2. I also want to see decent IS performance, though that is probably redundant if I am looking at Randomized OOS. My study, then, is to determine whether strategies that pass Randomized OOS are more likely to go on to produce profitable results in the future (similar to this third paragraph).
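To make the two DVs and the pass criteria concrete, here is a minimal Python sketch. It is not how the software computes anything. It assumes I can export the final OOS PNL of the actual backtest and of each simulation; the function names, the "fraction of simulations below the actual result" definition of DV #1, and the sample numbers are all my own placeholders.

```python
def randomized_oos_scores(actual_oos_pnl, simulated_oos_pnls):
    """Return (dv1, dv2) for one strategy.

    dv1: fraction of simulated OOS results strictly below the actual
         OOS result (assumed definition of "where the backtest fits").
    dv2: True only if every simulated OOS result is above zero.
    """
    if not simulated_oos_pnls:
        raise ValueError("need at least one simulated result")
    below = sum(1 for s in simulated_oos_pnls if s < actual_oos_pnl)
    dv1 = below / len(simulated_oos_pnls)
    dv2 = all(s > 0 for s in simulated_oos_pnls)
    return dv1, dv2


def passes_randomized_oos(actual_oos_pnl, simulated_oos_pnls, max_rank=2 / 3):
    """Pass if the actual OOS result sits in the lower 2/3 of the
    simulated distribution AND all simulations are above zero."""
    dv1, dv2 = randomized_oos_scores(actual_oos_pnl, simulated_oos_pnls)
    return dv1 <= max_rank and dv2


# Made-up numbers, purely for illustration.
sims = [1200.0, 800.0, 1500.0, 400.0, 950.0, 1100.0]
dv1, dv2 = randomized_oos_scores(900.0, sims)
verdict = passes_randomized_oos(900.0, sims)
```

With these placeholder values, two of six simulations finish below the actual result and none are negative, so the strategy would pass. How to treat ties and the exact 2/3 boundary is a judgment call I have not settled.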
Perhaps the highest-level study I can do with the software is to build the best strategies and see what percentage proceed to do well* afterward. Since the software builds strategies based on IS results, I could save time by testing on IS and looking to see what percentage of best strategies do well OOS. This could serve as a benchmark for what percentage of best strategies that also meet stress testing criteria go on to do well. The big challenge is to find strategies that pass the stress tests. This is also the most time-consuming activity.
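The benchmark comparison described above could be tallied along these lines. This is only a sketch under my own assumptions: each strategy is a dict with a hypothetical `future_pnl` field and a `passed_stress` flag, and "do well" is taken to mean PNL above a threshold (zero by default), which is just one possible operational definition.

```python
def success_rate(outcomes):
    """Fraction of True values, or None if there is nothing to count."""
    outcomes = list(outcomes)
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


def compare_to_benchmark(strategies, min_pnl=0.0):
    """Return (benchmark_rate, filtered_rate).

    benchmark_rate: share of all best strategies that do well afterward.
    filtered_rate:  the same share among only those strategies that
                    also met the stress-testing criteria.
    """
    def did_well(s):
        # Assumed definition of "do well": later PNL above min_pnl.
        return s["future_pnl"] > min_pnl

    benchmark = success_rate(did_well(s) for s in strategies)
    filtered = success_rate(
        did_well(s) for s in strategies if s["passed_stress"]
    )
    return benchmark, filtered


# Made-up records, purely for illustration.
strategies = [
    {"future_pnl": 500.0, "passed_stress": True},
    {"future_pnl": -200.0, "passed_stress": False},
    {"future_pnl": 300.0, "passed_stress": True},
    {"future_pnl": -50.0, "passed_stress": True},
]
benchmark_rate, filtered_rate = compare_to_benchmark(strategies)
```

If the filtered rate is not meaningfully higher than the benchmark rate across a large enough sample, the stress test is not adding much. A different fitness function could be swapped in by changing `did_well`.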
That search, though, is probably already shorter now that I have rejected the Noise Test. Related future studies include exploring the merits of MC simulation and MC drawdown.
* Operational definition required. “Do well” could mean positive PNL or some minimal score on other fitness functions.