Do you have any reading recommendations on correcting forecast bias? For example, I use an ARIMA model to predict a time series. Is there a way, based on the backtesting results, to correct the bias of the forecast?

Welcome to the world of StackOverflow. You might already have seen that some moderators are "keen" on penalising posts that do not meet the StackOverflow standard of a Minimal, Complete, Verifiable Example of code (a.k.a. an MCVE-related question). You might opt to update / edit your question so as to meet such practice (ideally before any such adverse effect takes place). The best thing would be to read the StackOverflow do's & don'ts, so as to learn what community rules have been set and to find your own way of living within them. **Anyway, enjoy being a new contributing member of StackOverflow** – user3666197 Oct 31 '15 at 19:02
1 Answer
How to handle the ever-present Bias / Overfit struggle?
Using a tactical methodology: one principal approach is to systematically tune a Predictor (be it ARIMA or something else) via a two-step approach.

Split the available DataSET into two parts, so as to emulate a near "Future": "hide" the second part of the DataSET -- say about 20-30% of the observations -- from the process of [1] Training, and use it in a step [2] called CrossValidation of the predictions.

This methodology lets one search both the StateSPACE of a Predictor engine's configurations and the data-related bias/overfit. Some use only the former part of the minimiser search (lowest error / highest utility function), some only the latter (as in Leo Breiman's RandomForest modification of ensemble-based methods), and some use both.
- Train a pre-configured Predictor on `aTrainingSubPartOfAvailableDataSET`.
- Once such a configuration of the Predictor has been trained, cross-validate this configuration's ability to predict against `aCrossValidationSubPartOfAvailableDataSET`, which was not seen in the process of training (Step 1), so as to observe the Bias / Overfit artefacts and proceed towards the lowest Cross-Validation error / best-generalisation area of plausible configuration settings (a minimal sketch of both steps follows below).
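
A minimal sketch of those two steps, assuming Python's statsmodels `ARIMA` as the Predictor (statsmodels comes up later in this thread as a Python option); the synthetic series, the 80/20 split, the candidate `(p, d, q)` orders and the RMSE error measure are all illustrative assumptions, not anything prescribed by the answer:

```python
# Step [1]: train each candidate configuration on the training sub-part only.
# Step [2]: cross-validate it on the held-back ("hidden") sub-part and keep
#           the configuration with the lowest cross-validation error.
# Assumptions: statsmodels >= 0.12, a univariate pandas Series, RMSE as the metric.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = pd.Series(rng.normal(size=200)).cumsum()          # stand-in time series

split = int(len(y) * 0.8)                             # "hide" the last ~20 %
train, valid = y.iloc[:split], y.iloc[split:]         # aTrainingSubPart.. / aCrossValidationSubPart..

candidate_orders = [(1, 1, 0), (0, 1, 1), (1, 1, 1), (2, 1, 1)]   # illustrative StateSPACE
cv_error = {}
for order in candidate_orders:
    fitted = ARIMA(train, order=order).fit()          # Step [1]: training part only
    forecast = np.asarray(fitted.forecast(steps=len(valid)))
    cv_error[order] = np.sqrt(np.mean((valid.to_numpy() - forecast) ** 2))   # Step [2]

best_order = min(cv_error, key=cv_error.get)
print("per-order CV RMSE:", cv_error)
print("lowest CV error / best generalisation candidate:", best_order)
```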

Thanks! I am doing the cross-validation with a backtesting exercise (like a leave-one-out exercise), and then I run a simulation (like a test data set), so I divided the data into 3 parts. But I am wondering whether just using error measures is the correct approach; I should be able to forecast some of the forecast error. I am saying this ruling out confidence intervals. – donpresente Oct 31 '15 at 20:09
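
One rough sketch of that kind of expanding-window backtest, and of turning the collected backtest errors into a simple additive correction of the next forecast: statsmodels `ARIMA`, the fixed `(1, 1, 1)` order, the one-step horizon and the mean-error adjustment are all assumptions made for illustration, not the exact procedure discussed here, and whether a mean-error shift is an adequate bias correction has to be justified by the backtest itself:

```python
# Rolling-origin backtest: refit on an expanding window, forecast one step
# ahead, record (actual - forecast), then use the mean backtest error as a
# crude additive bias correction of the next out-of-sample forecast.
# Assumptions: statsmodels ARIMA, a univariate pandas Series, order (1, 1, 1).
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
y = pd.Series(rng.normal(size=120)).cumsum() + 0.05 * np.arange(120)

start = 90                                         # first forecast origin
errors = []
for origin in range(start, len(y)):
    fitted = ARIMA(y.iloc[:origin], order=(1, 1, 1)).fit()
    one_step = float(np.asarray(fitted.forecast(steps=1))[0])
    errors.append(float(y.iloc[origin]) - one_step)          # actual minus forecast

bias_estimate = float(np.mean(errors))             # average backtest error
final_fit = ARIMA(y, order=(1, 1, 1)).fit()
raw_forecast = float(np.asarray(final_fit.forecast(steps=1))[0])
corrected_forecast = raw_forecast + bias_estimate  # simple additive correction
print(f"estimated bias: {bias_estimate:+.3f}, raw: {raw_forecast:.3f}, "
      f"corrected: {corrected_forecast:.3f}")
```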
@donpresente **Oh yes, this is definitely possible.** Once your methodology keeps the process of **separation** fair between **`aTrainingSubPartOfAvailableDataSET`**, used for the initial training, and a part emulating Out-of-sample examples, used for validation so as to get the best learner (a generalisation-capable Predictor), one might employ **Hoeffding's Inequality**, which exactly limits the errors of such a trained Predictor's future predictions. – user3666197 Nov 23 '15 at 08:05
Is that bound tight? Doesn't it make the assumption that the errors are Gaussian? – donpresente Nov 23 '15 at 18:48
The Hoeffding bound formulates an upper bound on the probability that an out-of-sample-example prediction will result in an error greater than a certain "tolerable" threshold. It makes no assumption about the distribution thereof; the certainty that this probability progressively decreases is the "weapon" for tightening. – user3666197 Nov 23 '15 at 21:29
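
A tiny numerical illustration of that progressive tightening, assuming the textbook single-hypothesis form of the bound, 2 · exp(-2 · EPSILON² · N), for errors scaled into [0, 1]; neither the constants nor the EPSILON = 0.05 tolerance come from this thread:

```python
# How the Hoeffding upper bound on Pr(|E_in - E_oos| > EPSILON) shrinks as the
# number N of out-of-sample validation points grows, with no distributional
# assumption on the errors (only boundedness, here taken as [0, 1]).
import math

EPSILON = 0.05                                     # arbitrary "tolerable" threshold
for n in (100, 500, 1_000, 5_000, 10_000):
    bound = 2.0 * math.exp(-2.0 * EPSILON ** 2 * n)
    print(f"N = {n:>6}: Pr(|E_in - E_oos| > {EPSILON}) <= {bound:.3g}")
```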
Btw, could you, @donpresente, kindly explain whether your ARIMA implementation is an OSS-published or a home-brew one? – user3666197 Nov 25 '15 at 13:06
Sure. Using Arima() or auto.arima() from the forecast package. Is the correction that you make "plus or minus something (a quantity)", or is it more elaborate? Thanks for your help. – donpresente Nov 26 '15 at 09:09
I knew about the R package for ARIMA; I am interested in a Python or C port thereof. – user3666197 Nov 26 '15 at 12:04
The **Hoeffding bound** expresses an upper bound on the probability Pr( | E_in - E_oos | > EPSILON ) that a prediction error E_oos, for an out-of-sample example, could fall farther than a tolerable prediction error EPSILON from the known training error E_in elaborated during the learner's training process. – user3666197 Nov 26 '15 at 12:10
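
Written out, the bound referenced in this thread corresponds to the standard single-hypothesis Hoeffding statement (assuming the per-example errors are scaled into [0, 1] and N out-of-sample points are used; the explicit constants below come from the textbook form, not from the comments themselves):

```latex
\Pr\!\left( \left| E_{\mathrm{in}} - E_{\mathrm{oos}} \right| > \epsilon \right)
\;\le\; 2\, e^{-2 \epsilon^{2} N}
```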
Have you used http://statsmodels.sourceforge.net/ for Python? I did not try it, because I use Python only for text mining or classification problems. – donpresente Nov 26 '15 at 12:52
Thanks for copying the bound. Yes, I see your point. But you make a correction to the forecast, right, given the fact that you bound the error? That was my initial question for you, about forecast correction. – donpresente Nov 26 '15 at 13:01