Sep 17, 2012

Backtesting ETFs

Institutional Investor magazine recently published an article titled "Study Finds Many ETF Indexes Misleading". For me, and I'm sure for other readers of this blog, this hits home, especially when it comes to the performance of VIX-related ETFs. Over the last two years I have written about several ETFs and ETNs and tried to provide some guidance on their future performance based on past data. While some were relatively easy to reconstruct and explain, other products were more complicated.

For example, in Jan 2011 I posted my analysis of XVIX with quite optimistic projections. Its performance in the year and a half since publication was a disappointing -8%, with a maximum drawdown of ~22%. The balancing rule for the ETF was simple: -0.5 * VXX + VXZ, rebalanced daily. But after everything is taken into account (including the 0.85% fee from the issuer), the volatility risk premium, or term structure, trade simply did not work out.
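To make the rebalancing rule concrete, here is a minimal Python sketch of a portfolio that is reset to -0.5 * VXX + VXZ every day. It is an illustration only, not the issuer's actual index methodology: the function names are hypothetical, the 0.85% fee is assumed to accrue evenly over 252 trading days, and the sample returns are made up rather than real VXX/VXZ data.

```python
def daily_rebalanced_returns(vxx_returns, vxz_returns,
                             w_vxx=-0.5, w_vxz=1.0,
                             annual_fee=0.0085, days_per_year=252):
    """Net daily returns of a portfolio reset to fixed weights each day.

    Assumption: the annual fee is charged in equal daily slices.
    """
    if len(vxx_returns) != len(vxz_returns):
        raise ValueError("return series must have the same length")
    daily_fee = annual_fee / days_per_year
    # With daily rebalancing, each day's return is simply the
    # weighted sum of the component returns, minus the fee slice.
    return [w_vxx * r_vxx + w_vxz * r_vxz - daily_fee
            for r_vxx, r_vxz in zip(vxx_returns, vxz_returns)]


def cumulative_return(daily_returns):
    """Compound a list of daily returns into one total return."""
    nav = 1.0
    for r in daily_returns:
        nav *= 1.0 + r
    return nav - 1.0


if __name__ == "__main__":
    # Made-up daily returns, for illustration only.
    vxx = [0.02, -0.01, 0.03]
    vxz = [0.01, 0.00, 0.01]
    net = daily_rebalanced_returns(vxx, vxz)
    print(f"cumulative net return: {cumulative_return(net):.4%}")
```

Feeding in actual daily returns for the two legs would give a rough reconstruction that can be compared against the published product performance.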

Another forecast I made turned out better: XIV and other daily inverse ETFs performed quite well over the last 1.5 years. XIV rose 71%, although with a 49% drawdown between March 2012 and June 2012.
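For readers who want to check drawdown figures like these themselves, maximum drawdown is the largest peak-to-trough decline in a price series. A minimal Python sketch follows; the function name is hypothetical and the price path is invented for illustration, not actual XIV data.

```python
def max_drawdown(prices):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    if not prices:
        raise ValueError("prices must not be empty")
    peak = prices[0]
    worst = 0.0
    for p in prices:
        peak = max(peak, p)                    # running high-water mark
        worst = max(worst, (peak - p) / peak)  # decline from that mark
    return worst


if __name__ == "__main__":
    # Invented price path: ends higher overall but halves along the way.
    path = [100.0, 120.0, 60.0, 90.0, 171.0]
    print(f"total return: {path[-1] / path[0] - 1:.0%}")
    print(f"max drawdown: {max_drawdown(path):.0%}")
```

The example shows how a series can finish well above its starting point while still having suffered a severe drawdown on the way.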

Predicting the future is fundamentally challenging, and even strategies that seem robust do not always perform as expected. Having said that, I want to raise another point about trading styles. The strategies mentioned above, and others like Mebane Faber's GTAA ETF (whose performance has also disappointed so far), are rule-based. My intuition tells me that these types of strategies are the most susceptible to data mining bias / overfitting. Strategies based on valuation, and particularly on relative valuation, seem to be more robust to model error. However, I have no idea how to quantify either one.


  1. Andrew F 1/18/2013

    I agree that GTAA's performance has been disappointing. I'm not sure that the underperformance is due to data mining, though. Faber's system seems highly generalizable, and he's applied it to out-of-sample data with reasonable success.

    I suspect that its recent underperformance is due to the volatile sideways market we've been in for the last 2 or so years. Maybe trending markets were an anomaly that we won't see in the future, but I doubt it.

  2. Good point Andrew - 2 years may be too short a period to judge a low-frequency trading model. But I want to highlight the difference in robustness between strategies - any thoughts on that?