Dr Schnaubelt writes:
Using cross-validation for time-series applications comes at a great risk. While theoretically applicable, we find that random cross-validation often is associated with the largest bias and variance when compared to all other validation schemes. In most cases, blocked variants of cross-validation have a similar or better performance, and should therefore be preferred if cross-validation is to be used. If global stationarity is perturbed by non-periodic changes in autoregression coefficients, we find that forward-validation may be preferred over cross-validation. Within forward-validation schemes, we find that rolling-origin and growing-window schemes often achieve the best performance. A closer look at the effect of the perturbation strength reveals that there exist three performance regimes: For small perturbations, cross- and forward-validation methods perform similarly. For intermediate perturbation strengths, forward-validation performs better. For still higher perturbation strengths, last-block validation performs best.
While I had intuited and used some of this in practice, it is great that someone has thoroughly analyzed the topic. If you are using machine learning in finance and want to weigh in with your experience with these or other validation strategies, please leave a comment.
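To make the schemes concrete, here is a minimal sketch of how the main families of splits can be generated with scikit-learn. This is my own illustration, not code from the paper: the series length, fold counts, and window size are arbitrary, and a stricter blocked scheme would normally also leave a gap (embargo) between the training and test blocks.

```python
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit

n = 100                      # toy series length
X = np.arange(n).reshape(-1, 1)

# "Blocked" cross-validation: contiguous, unshuffled folds
# (random cross-validation would shuffle the indices instead).
blocked_cv = KFold(n_splits=5, shuffle=False)

# Growing-window forward-validation: the origin stays fixed,
# the training window grows, and the test block always lies ahead.
growing = TimeSeriesSplit(n_splits=5)

# Rolling-origin forward-validation: same idea, but the training
# window is capped so it rolls forward in time.
rolling = TimeSeriesSplit(n_splits=5, max_train_size=30)

# Last-block validation is simply a single split with the final
# block held out, e.g. train on X[:80] and test on X[80:].

for name, splitter in [("blocked", blocked_cv),
                       ("growing", growing),
                       ("rolling", rolling)]:
    print(name)
    for train_idx, test_idx in splitter.split(X):
        print(f"  train {train_idx[0]:3d}-{train_idx[-1]:3d}"
              f"  test {test_idx[0]:3d}-{test_idx[-1]:3d}")
```

Printing the index ranges makes the difference visible: only the forward-validation splitters guarantee that every test block lies strictly after its training data, which is the property the paper finds valuable once stationarity is perturbed.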