Dr Schnaubelt writes:
Using cross-validation for time-series applications comes at a great risk. While theoretically applicable, we find that random cross-validation often is associated with the largest bias and variance when compared to all other validation schemes. In most cases, blocked variants of cross-validation have a similar or better performance, and should therefore be preferred if cross-validation is to be used. If global stationarity is perturbed by non-periodic changes in autoregression coefficients, we find that forward-validation may be preferred over cross-validation. Within forward-validation schemes, we find that rolling-origin and growing-window schemes often achieve the best performance. A closer look at the effect of the perturbation strength reveals that there exist three performance regimes: For small perturbations, cross- and forward-validation methods perform similarly. For intermediate perturbation strengths, forward-validation performs better. For still higher perturbation strengths, last-block validation performs best.
While I had intuited and used some of this in practice, it is great that someone has thoroughly analyzed the topic. If you are using machine learning in finance and want to weigh in with your experience with these or other validation strategies, please leave a comment.
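To make the schemes concrete, here is a minimal sketch of how the main families of splits can be generated with scikit-learn. This is my own illustration, not code from the paper: the series length, fold counts, and window size are arbitrary, and a stricter blocked scheme would normally also leave a gap (embargo) between the training and test blocks.

```python
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit

n = 100                      # toy series length
X = np.arange(n).reshape(-1, 1)

# "Blocked" cross-validation: contiguous, unshuffled folds
# (random cross-validation would shuffle the indices instead).
blocked_cv = KFold(n_splits=5, shuffle=False)

# Growing-window forward-validation: the origin stays fixed,
# the training window grows, and the test block always lies ahead.
growing = TimeSeriesSplit(n_splits=5)

# Rolling-origin forward-validation: same idea, but the training
# window is capped so it rolls forward in time.
rolling = TimeSeriesSplit(n_splits=5, max_train_size=30)

# Last-block validation is simply a single split with the final
# block held out, e.g. train on X[:80] and test on X[80:].

for name, splitter in [("blocked", blocked_cv),
                       ("growing", growing),
                       ("rolling", rolling)]:
    print(name)
    for train_idx, test_idx in splitter.split(X):
        print(f"  train {train_idx[0]:3d}-{train_idx[-1]:3d}"
              f"  test {test_idx[0]:3d}-{test_idx[-1]:3d}")
```

Printing the index ranges makes the difference visible: only the forward-validation splitters guarantee that every test block lies strictly after its training data, which is the property the paper finds valuable once stationarity is perturbed.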