Option Fanatic: Options, stock, futures, and system trading, backtesting, money management, and much more!

What’s the Problem with Walk-Forward Optimization?

I discussed Walk-Forward Optimization (WFO) with regard to trading system development in the fifth paragraph here. My testing thus far has left me somewhat skeptical about the whole WF concept.

I wrote a mini-series about WFO many years ago and explained how it fits into the whole system development paradigm (see here). WFO has many supporters and has been called “the gold standard of trading system validation.”

I have found WFO to be a very high hurdle to clear. I was especially frustrated because, multiple times, an expanded feasibility test (i.e. the second example here, as opposed to the seventh paragraph here) passed whereas WFO generated poor results. WFO essentially takes trades at different times from different standard optimizations, which as a whole did pretty well (thereby passing expanded feasibility). How, then, could the entire sequence end up losing money?
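A toy sketch may make the puzzle less mysterious. The Python below is only an illustration under stated assumptions, not TradeStation’s walk-forward algorithm: the function names (`walk_forward_windows`, `walk_forward`), the window lengths, and the per-bar P&L numbers are all hypothetical. Each parameter value is profitable over the full sample, yet the stitched out-of-sample (OOS) sequence loses, because the parameter that wins each in-sample (IS) window happens to do poorly in the window that follows.

```python
from typing import Dict, List, Tuple


def walk_forward_windows(n_bars: int, is_len: int, oos_len: int) -> List[Tuple[range, range]]:
    """Return (in-sample, out-of-sample) index ranges, rolled forward by oos_len."""
    windows = []
    start = 0
    while start + is_len + oos_len <= n_bars:
        in_sample = range(start, start + is_len)
        out_sample = range(start + is_len, start + is_len + oos_len)
        windows.append((in_sample, out_sample))
        start += oos_len
    return windows


def walk_forward(pnl_by_param: Dict[int, List[float]], is_len: int, oos_len: int) -> List[float]:
    """Stitch together OOS P&L, trading each OOS segment with the parameter
    value that had the highest total P&L in the preceding IS window."""
    n_bars = len(next(iter(pnl_by_param.values())))
    stitched: List[float] = []
    for in_sample, out_sample in walk_forward_windows(n_bars, is_len, oos_len):
        best = max(pnl_by_param, key=lambda p: sum(pnl_by_param[p][i] for i in in_sample))
        stitched.extend(pnl_by_param[best][i] for i in out_sample)
    return stitched


# Hypothetical per-bar P&L for two parameter values (10 and 20).
# Each alternates between good and bad 4-bar blocks, out of phase with the other.
pnl_by_param = {
    10: [2, 2, 2, 2, -1, -1, -1, -1, 2, 2, 2, 2, -1, -1, -1, -1],
    20: [-1, -1, -1, -1, 2, 2, 2, 2, -1, -1, -1, -1, 2, 2, 2, 2],
}

full_sample_totals = {p: sum(pnl) for p, pnl in pnl_by_param.items()}
stitched_oos = walk_forward(pnl_by_param, is_len=4, oos_len=4)

print("Full-sample totals:", full_sample_totals)
print("Stitched OOS total:", sum(stitched_oos))
```

With this contrived data, both standard optimizations are net profitable while every OOS segment is a loser. Real data will be far noisier, but the mechanism is the same: WFO rewards parameters whose recent IS performance persists, which is a different property from being profitable over the whole sample.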

The easy explanation is that feasibility and WFO use different pass criteria. In the feasibility phase, I merely require profitability. The TradeStation criteria for passing the WFO phase are:


Although the particular numbers may be changed, this should give a good idea of what a viable strategy might look like: consistently profitable, no huge drawdowns, and relatively short periods of time in between new equity highs.

These criteria are much more stringent than feasibility’s “X% iterations profitable.” This explanation should have satisfied me.
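To make the difference in stringency concrete, here is a minimal sketch comparing the two kinds of test. It assumes the stricter screen checks net profit, maximum drawdown, and the longest stretch without a new equity high, per the general description above. The thresholds, trade values, and function names are hypothetical and are not TradeStation’s actual criteria.

```python
from typing import List


def equity_curve(trade_pnls: List[float]) -> List[float]:
    """Cumulative P&L after each trade, starting from zero."""
    curve, total = [], 0.0
    for pnl in trade_pnls:
        total += pnl
        curve.append(total)
    return curve


def max_drawdown(curve: List[float]) -> float:
    """Largest peak-to-valley drop in the equity curve (initial peak is 0)."""
    peak, worst = 0.0, 0.0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


def longest_flat_period(curve: List[float]) -> int:
    """Largest number of consecutive trades without a new equity high."""
    peak, current, longest = 0.0, 0, 0
    for value in curve:
        if value > peak:
            peak, current = value, 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


def passes_feasibility(net_profits: List[float], min_pct_profitable: float) -> bool:
    """Feasibility-style check: at least X% of optimization iterations are profitable."""
    profitable = sum(1 for p in net_profits if p > 0)
    return 100.0 * profitable / len(net_profits) >= min_pct_profitable


def passes_strict(trade_pnls: List[float], max_dd_limit: float, max_flat_limit: int) -> bool:
    """Stricter check: profitable AND limited drawdown AND frequent new equity highs."""
    curve = equity_curve(trade_pnls)
    return (
        curve[-1] > 0
        and max_drawdown(curve) <= max_dd_limit
        and longest_flat_period(curve) <= max_flat_limit
    )


# Hypothetical numbers for illustration only.
iteration_profits = [7.0, 2.0, -1.0, 4.0]          # net profit per optimization iteration
trades = [5.0, -4.0, -4.0, 6.0, -3.0, 4.0, 3.0]    # one strategy's trade-by-trade P&L

print(passes_feasibility(iteration_profits, min_pct_profitable=70))
print(passes_strict(trades, max_dd_limit=5.0, max_flat_limit=3))
```

In this example the trade list is net profitable, so it would clear a profitability-only screen, but it fails the stricter check because of its drawdown and the long wait for a new equity high.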

Due to my mounting frustration, however, I couldn’t help but start to rationalize why WFO might be unnecessary for a viable trading system. Here are my thoughts from a few months ago:

     > …aside from generating OS data, which I agree is essential, I think WF
     > screens for an additional characteristic that may not be necessary for
     > real-time profitability. People talk about how managers and asset classes
     > that are the best (worst) during one period end up worse (better) in
     > subsequent periods. WF would reject such mean-reverting strategies due
     > to poor OS performance. Each manager or asset class may be okay to trade,
     > though, as one component of a diversified, noncorrelated portfolio despite
     > the phenomenon of mean reversion… this trainability, for which WFO
     > screens, being altogether unnecessary.

I think it’s an interesting argument: one that can only be settled by sufficient testing.

What’s the alternative without WFO? Probably an expanded feasibility test followed by Monte Carlo simulation.
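As a rough sketch of what the Monte Carlo step could look like, the Python below reshuffles the order of a trade list many times and records the maximum drawdown of each reshuffled sequence. This is only one of several common Monte Carlo variants (resampling with replacement is another), and the trade values, run count, seed, and function names are assumptions made for illustration.

```python
import random
from typing import List


def max_drawdown_of_trades(trade_pnls: List[float]) -> float:
    """Largest peak-to-valley drop in cumulative P&L (initial peak is 0)."""
    peak = equity = worst = 0.0
    for pnl in trade_pnls:
        equity += pnl
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def monte_carlo_drawdowns(trade_pnls: List[float], n_runs: int = 1000, seed: int = 0) -> List[float]:
    """Shuffle the trade order n_runs times; return the max drawdowns, sorted ascending.

    Shuffling without replacement keeps net profit fixed and varies only the path.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    results = []
    for _ in range(n_runs):
        shuffled = trade_pnls[:]
        rng.shuffle(shuffled)
        results.append(max_drawdown_of_trades(shuffled))
    return sorted(results)


# Hypothetical trade-by-trade P&L.
trades = [5.0, -4.0, -4.0, 6.0, -3.0, 4.0, 3.0, -2.0, 7.0, -5.0]

drawdowns = monte_carlo_drawdowns(trades)
pct95 = drawdowns[int(0.95 * len(drawdowns)) - 1]  # approximate 95th percentile
print("Median max drawdown:", drawdowns[len(drawdowns) // 2])
print("95th-percentile max drawdown:", pct95)
```

The idea is that the backtested sequence is just one possible ordering of the trades. The simulated distribution gives a sense of how bad the drawdown could plausibly have been with the same trades in a different order.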

At this point, I have no practical reason to reject the notion of WFO, especially keeping in mind that I may have been conducting WFO altogether wrong with the coarse grid (see last paragraph here).