Does Technical Analysis Work? Here’s Proof! (Part 2)
Posted by Mark on April 9, 2021 at 07:04 | Last modified: March 10, 2021 13:33
Today I continue with commentary and analysis of Janny Kul’s TDS article with the same title.
Kul explains p-hacking:
> If we run multiple permutations over and over and we just stop when
> we reach one that looks favourable, this lands us in a situation
> statisticians call p-hacking.
>
> Much like a series of coin tosses, there is a chance, however small,
> that we continually land on heads.
Indeed. I have since learned how to run Bonferroni and Šidák corrections for multiple comparisons in Python.
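For reference, here is a minimal sketch of the two corrections in plain Python. It is not necessarily how the corrections were run for this blog. The overall significance level of 0.05 and the count of 40 comparisons are assumptions chosen for illustration, and the function names are made up for this example. The Šidák formula assumes the tests are independent.

```python
# Per-test significance thresholds that hold the family-wise error rate
# (FWER) at alpha across m comparisons.

def bonferroni_alpha(alpha: float, m: int) -> float:
    """Bonferroni: split the overall alpha evenly across m tests."""
    return alpha / m

def sidak_alpha(alpha: float, m: int) -> float:
    """Sidak: exact per-test alpha when the m tests are independent."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)

def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive in m independent tests."""
    return 1.0 - (1.0 - alpha) ** m

alpha = 0.05  # assumed overall significance level
m = 40        # assumed number of comparisons (illustrative only)

print(f"Uncorrected FWER:          {familywise_error_rate(alpha, m):.3f}")
print(f"Bonferroni per-test alpha: {bonferroni_alpha(alpha, m):.5f}")
print(f"Sidak per-test alpha:      {sidak_alpha(alpha, m):.5f}")
```

With these assumed numbers, running 40 uncorrected tests at 0.05 gives roughly an 87% chance of at least one false positive, which is the mirage Kul warns about. Bonferroni is slightly stricter than Šidák, and both shrink the per-test threshold to a little over 0.001.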
Kul continues by saying we need a better test for comparison to avoid what could be a mirage of significance caused by multiple comparisons. One possibility is to compare with a buy-and-hold group, but:
> The problem… is that some instruments are inflationary (like Gold
> and Stocks…) and some aren’t (like USD — in an inflationary
> environment the dollar would likely depreciate).
>
> This isn’t a fair test because if a technical indicator is… right 51% of
> the time, we may be able to reasonably deduce there’s Alpha, but
> if we compare it against stock, well we’d expect stocks to be positive
> more than 51% of the time given the economy grows over time
> (historically on a daily basis the S&P 500 is +ve 55% of the time).
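Kul's point about the baseline can be made concrete with a one-sided binomial test. The sketch below is only an illustration: the 500-day sample and the 255 correct calls (51%) are hypothetical numbers, `binom_tail` is a helper written for this example, and treating daily outcomes as independent is a simplifying assumption.

```python
from math import comb

def binom_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), i.e. a one-sided p-value."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n_days = 500  # hypothetical number of trading days
hits = 255    # hypothetical: indicator is right 51% of the time

# Null hypothesis 1: the indicator is a coin flip (50% baseline).
p_vs_coin = binom_tail(hits, n_days, 0.50)

# Null hypothesis 2: the indicator is no better than the 55% daily
# up-rate Kul quotes for the S&P 500.
p_vs_stock = binom_tail(hits, n_days, 0.55)

print(f"p-value vs 50% baseline: {p_vs_coin:.3f}")
print(f"p-value vs 55% baseline: {p_vs_stock:.3f}")
```

Under these assumptions a 51% hit rate is not convincing even against a coin flip at this sample size, and it looks worse than chance against a 55% baseline. The choice of baseline drives the conclusion.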
Kul is essentially claiming inflation to be a confounding variable (see fourth paragraph here) when looking for alpha. I don’t know that I agree. One internet source states long-term historical inflation to be ~3.2%. Regardless of the exact number, it’s positive and it happens most years. Any TA strategy that does not exceed this is not worth trading, in my opinion, regardless of whether inflation actually boosts baseline buy-and-hold performance.*
For whatever reason, stocks generally melt higher to such a degree that most long equity strategies I studied outperformed over the long term. I believe (not yet studied) real estate melts higher. Gold seems to melt higher, but my studies did not show consistent outperformance. Contrary to Kul’s inflation hypothesis, I found oil (a commodity priced in USD) to face increased headwinds when traded long (see third paragraph here). This may be due to the particular 4-year interval of oil prices, though: I need to look at a longer-term chart for verification.
Kul then goes on to say a better approach is the (in Python parlance) train_test_split method, which is to say use in-sample (IS) and out-of-sample (OOS) periods for comparison:
> [Acceptable performance would be] over 50% right in both the train
> period and test period (i.e. do both produce positive P&L) or we
> require some arbitrary threshold like 0.8x of the outperformance
> from the train period to conclude whether a particular indicator
> “works” or not.
>
> The easiest way… to test this is… to run a simulation of every
> indicator (x4) on every instrument (x10) for, say, the first 6 months
> of 2018 so that’s 40 P&L scenario’s across x3 charts. Then we take
> the top 10 best performing combinations (or we could even take all
> of the ones that have produced positive P&L) and run them for
> another 3 months then look at the performance.
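The procedure Kul describes can be sketched as follows. This is a toy version under stated assumptions: `simulate_pnl` is a placeholder that returns pure noise where a real backtest would go, the indicator and instrument names are hypothetical, and scaling the 0.8x threshold per month is one interpretation of Kul's wording.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

indicators = [f"indicator_{i}" for i in range(4)]     # hypothetical names
instruments = [f"instrument_{j}" for j in range(10)]  # hypothetical names

def simulate_pnl(indicator: str, instrument: str, months: int) -> float:
    """Placeholder backtest returning pure-noise P&L; swap in a real one."""
    return sum(random.gauss(0.0, 1.0) for _ in range(months))

# Train period: every indicator on every instrument for 6 months (40 runs).
train = {(ind, ins): simulate_pnl(ind, ins, 6)
         for ind in indicators for ins in instruments}

# Keep the 10 best-performing combinations from the train period.
top10 = sorted(train, key=train.get, reverse=True)[:10]

# Test period: run only the selected combinations for another 3 months.
test = {combo: simulate_pnl(*combo, 3) for combo in top10}

# Criterion 1: positive P&L in both the train and test periods.
positive_both = [c for c in top10 if train[c] > 0 and test[c] > 0]

# Criterion 2 (assumed per-month scaling): test P&L per month is at
# least 0.8x the train P&L per month.
keeps_80pct = [c for c in top10
               if train[c] > 0 and test[c] / 3 >= 0.8 * train[c] / 6]

print(f"Passed criterion 1: {len(positive_both)} of {len(top10)}")
print(f"Passed criterion 2: {len(keeps_80pct)} of {len(top10)}")
```

Because the placeholder P&L is random noise, any combination that passes here does so by luck alone. That makes this a useful sanity check on how permissive each criterion is before applying it to real backtest results.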
I think this is all legit, but the true brilliance comes next.
* — For starters, one way to study this would be to look for differences in annual stock
returns between inflationary and deflationary years.