Would it be possible to implement Time Series K-Fold Cross-Validation?

For example:

Take a Time Series Training/Validation Interval which goes from 2017-01-01 to 2019-12-31, with a regularly spaced step of 1 month.

For example, in the first fold a model is trained between 2017-01-01 and 2017-02-01, and the error within that Time Frame is minimised (e.g. RMSE). However, in order to evaluate the error ourselves, we use out-of-sample validation data, which goes from 2017-02-01 to 2017-03-01.

The process is repeated iteratively:

In the next fold, a model is trained between 2017-01-01 and 2017-03-01, and the error within that Time Frame is minimised (e.g. RMSE). Again, to evaluate the error ourselves, we use out-of-sample validation data, which goes from 2017-03-01 to 2017-04-01.

And so on.
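The expanding-window scheme described above can be sketched as follows. This is only an illustration under the example's dates; `expanding_window_splits` is a hypothetical helper name, not an existing API:

```python
import pandas as pd

def expanding_window_splits(start, end, step="MS"):
    """Yield (train_start, train_end, valid_end) boundaries for an
    expanding-window time-series cross-validation.

    The training window always starts at `start` and grows by one step
    per fold; the validation window is the single step that immediately
    follows the training window.
    """
    # Regularly spaced boundaries ("MS" = month-start frequency).
    edges = pd.date_range(start, end, freq=step)
    for i in range(1, len(edges) - 1):
        yield edges[0], edges[i], edges[i + 1]

# First two folds for the 2017-01-01 .. 2019-12-31 interval above:
for train_start, train_end, valid_end in list(
        expanding_window_splits("2017-01-01", "2019-12-31"))[:2]:
    print(train_start.date(), train_end.date(), valid_end.date())
```

The first fold reproduces the example: training on 2017-01-01 to 2017-02-01, validating on 2017-02-01 to 2017-03-01; the second fold extends training to 2017-03-01 and validates on the following month.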

Some theoretical context:

https://robjhyndman.com/hyndsight/tscv/

https://www.sciencedirect.com/science/article/abs/pii/S0167947317302384

Some scikit-learn context:

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html
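For reference, scikit-learn's TimeSeriesSplit already produces this kind of expanding-window split, although over positional indices rather than dates; a minimal sketch:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Six observations standing in for six monthly periods; each fold
# trains on an expanding prefix and validates on the next block.
X = np.arange(6).reshape(-1, 1)
tscv = TimeSeriesSplit(n_splits=3)
for train_idx, valid_idx in tscv.split(X):
    print("train:", train_idx, "valid:", valid_idx)
```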

How would it be used?

If you are generating a point estimate, you could calculate t+1, use the t+1 target prediction mean to update the features, calculate t+2 from those, and so on for a reasonable time window. This would require calculated features which are "updated" with the target mean predictions, by treating them as already known data.
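The recursive point-estimate loop just described can be sketched as follows. This is a toy illustration: a lag-1 linear model stands in for the H2O tree model, and the series is artificial:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy series; a linear model stands in for the actual H2O tree model.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Single calculated feature: the previous value (lag 1).
X_train = y[:-1].reshape(-1, 1)
y_train = y[1:]
model = LinearRegression().fit(X_train, y_train)

# Recursive forecast: each predicted mean is fed back in as the
# "already known" lag feature for the next step.
horizon = 3
history = y[-1]
forecasts = []
for _ in range(horizon):
    pred = model.predict([[history]])[0]
    forecasts.append(pred)
    history = pred  # update the feature with the target mean prediction

print(forecasts)
```

On this perfectly linear toy series the recursion continues the trend, producing roughly 7, 8, 9.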

Another use case (which requires Time Series Cross-Validation + Group-Wise Cross-Validation):

We would have two time axes:

Our Date Axis: 2019-01-01, 2019-01-02, 2019-01-03, …, 2019-12-31.

Our Snapshot Date Axis (linked to a given Date Axis, that is, grouped by Date Axis): how, as of a given snapshot, we see our sales so far for our target Date. We could call this variable "Number of Days Out" (NDO), where NDO = Date - Snapshot, and we aim to predict the value at NDO = 0.

Our Target would be Sales at NDO = 0 for each Date.

This is a common scenario, as many enterprise database tables are versioned tables.
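A small sketch of the snapshot setup, with invented numbers, to make the NDO definition concrete:

```python
import pandas as pd

# Hypothetical versioned table: for each target Date we hold one row
# per Snapshot date, with the sales observed up to that snapshot.
df = pd.DataFrame({
    "date":     pd.to_datetime(["2019-01-03"] * 3 + ["2019-01-04"] * 3),
    "snapshot": pd.to_datetime(["2019-01-01", "2019-01-02", "2019-01-03",
                                "2019-01-02", "2019-01-03", "2019-01-04"]),
    "sales":    [10, 14, 15, 8, 11, 13],
})

# NDO = Date - Snapshot; the target is sales at NDO = 0 for each Date.
df["ndo"] = (df["date"] - df["snapshot"]).dt.days
target = df[df["ndo"] == 0].set_index("date")["sales"]
print(target)
```

Grouping by `date` then gives the groups for Group-Wise Cross-Validation, while the ordering of `date` itself drives the Time Series splits.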

In addition, further regression forecasts could be added as features to our H2O Tree Model, in order to benefit from its regularisation and to use it as an ensemble over a large number of forecasts. Candidates include:

facebook’s prophet: https://facebook.github.io/prophet/

google’s bsts: https://cran.r-project.org/web/packages/bsts/index.html

forecast’s / fable’s auto.arima() or ets(): https://cran.r-project.org/web/packages/forecast/index.html

python’s pmdarima: https://github.com/microsoft/forecasting/blob/master/examples/grocery_sales/python/00_quick_start/autoarima_single_round.ipynb

amazon’s gluonts deepar: https://github.com/awslabs/gluon-ts
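The external-forecasts-as-features idea could be sketched like this. To keep the example self-contained, a seasonal-naive forecast stands in for a prophet/bsts/ARIMA regressor, and scikit-learn's GradientBoostingRegressor stands in for the H2O tree model:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic monthly series with trend and yearly seasonality.
rng = np.random.default_rng(0)
n = 120
t = np.arange(n)
y = 10 + 0.1 * t + 2 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.2, n)

# External regressor: a seasonal-naive forecast (the value 12 steps
# back) standing in for a prophet/bsts/ARIMA prediction column.
seasonal_naive = np.roll(y, 12)

# Feed the external forecast to the tree model as an extra feature,
# alongside the time index, so the ensemble can regularise over it.
X = np.column_stack([t, seasonal_naive])[12:]
y_fit = y[12:]
model = GradientBoostingRegressor(random_state=0).fit(X[:-12], y_fit[:-12])
pred = model.predict(X[-12:])  # forecast the held-out final year
print(np.round(pred[:3], 2))
```

In practice each library in the list above would contribute one such forecast column, fitted on the training window of each fold only, to avoid leaking validation data.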

Currently I apply the previous strategies manually in my Time Series projects, but it is always nice to see them automated, so that others can benefit.