
I am doing predictive modelling of multivariate time series data in R using various models such as ARIMA, h2o.randomForest, glmnet, lm and a few others.

I created a function that selects a model of my choice and runs the prediction:

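# Each ModelN() fits one model and returns its result (bodies elided).
Model1 <- function() {
  ..
  return()
}
Model2 <- function() {
  ...
  return()
}
Model3 <- function() {
  ...
  return()
}
# Dispatch to the model selected by n (1, 2 or 3).
main <- function(n) {
  if (n == 1) {
    Model1()
  } else if (n == 2) {
    Model2()
  } else if (n == 3) {
    Model3()
  }
}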

Now I am supposed to automate these models so that each one reports RMSE and MAPE, computed between the predicted and observed values. I would like to give each model a score (e.g. out of 5) based on its performance. For example, if ARIMA gives a lower RMSE than the other models, it gets the highest score, the model with the second-lowest RMSE gets a slightly lower score, and so on.

Every time I run the models with different input data, I also want the mean score of each model. What I mean is:

1. The first run gives a score for each model; call these *s1*.
2. The second run gives a score for each model; call these *s2*.

I then want the mean of a model's scores across all the runs done so far with different inputs. It is more like a scoring and ranking method.

Are there any methods or packages in R that give an idea of how this is done, or any examples? Any suggestions would be very helpful. I have also posted this question on Cross Validated.

Thank you.

dhinar1991
  • Just to clarify: you have some dataframe / matrix with `n` observations for `m` time series for `j` motors `motor1, motor2, ... motorj`? You want to automate the task of 1) fitting multiple models to the data of `motor1, motor2, ... motorj` 2) Rank the models by goodness of fit with RMSE/MAPE and 3) return `j` lists with the model ranking for the `j` input motors? – Numb3rs Jul 07 '17 at 10:33
  • Yes. I can use `rank()` to rank the methods, but I want to give every model a score (e.g. out of 5) based on its RMSE/MAPE value, like a scorecard. – dhinar1991 Jul 07 '17 at 10:45

1 Answer


To the best of my knowledge there is no single package that does all of that for you without some work on your side. You will need to combine packages that each cover part of what you need.

Since you haven't provided any reproducible data and have only described a general idea of what you would like, I can only give you a broad outline of what to expect, how I would approach this, and which problems you will most likely encounter:

1. Model Fitting

You need to have a good idea of how you want to fit the models to the data. Take ARIMA for example: it has three orders (p, d, q). Which order do you want to fit to your data? A simple ARIMA(1,0,1) model? Or do you need a higher order? There are, again, methods for finding the best fit, e.g. by fitting different orders and then selecting the order with the best (lowest) AIC. This article from quantstart is a nice example for a univariate series, with code for estimating different orders and picking the most suitable one.
Check to what degree you need to set this up for your other models (h2o.randomForest, glmnet, lm, etc.) as well. Wrap this process in functions.
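
As a rough illustration of the order selection described above, here is a minimal sketch using only base R (`stats::arima` and `AIC`). The simulated series `y` and the grid bounds for (p, d, q) are assumptions made up for the example, not anything from your data. The `forecast` package's `auto.arima()` automates a similar search if you prefer not to write the loop yourself.

```r
# Sketch: choose an ARIMA order by lowest AIC (base R only).
set.seed(1)
y <- arima.sim(model = list(ar = 0.6), n = 200)  # simulated placeholder series

# Candidate (p, d, q) orders; the grid bounds are arbitrary assumptions.
orders <- expand.grid(p = 0:2, d = 0:1, q = 0:2)

# Fit every candidate; record NA when the optimiser fails for an order.
orders$aic <- apply(orders, 1, function(o) {
  tryCatch(
    AIC(arima(y, order = c(o[["p"]], o[["d"]], o[["q"]]))),
    error = function(e) NA_real_
  )
})

# Keep the order with the lowest AIC and refit it.
best <- orders[which.min(orders$aic), ]
best_fit <- arima(y, order = c(best$p, best$d, best$q))
```

Wrapping this in a function that takes the series as an argument would let you call it once per input data set.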

2. Model Selection

In Step 1 you fitted your various models to the time series data and obtained different results. Now you need to collect your goodness-of-fit criteria (RMSE/MAPE) in a list/vector. Either they are already part of the output object of the respective model, or you need to compute them yourself from the predicted and observed values. If the model does not report them, add the computation to the function that fits that model, and append the result to the list/vector mentioned above.
Rank the list by your desired criterion (for RMSE and MAPE, lowest is best) and return that list as output. You probably also want to return the results of the best fit, which you can do by appending the highest-ranked result to your output.
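
To make the ranking and the "score out of 5" idea concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: the helper names (`rmse`, `mape`, `score_models`, `score_run`), the made-up observed and predicted values, and the scoring rule itself (best model gets 5, each following rank gets one point less, never below 1). It is one possible scheme, not an established method.

```r
# Error metrics computed from observed and predicted values.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
mape <- function(obs, pred) mean(abs((obs - pred) / obs)) * 100  # breaks if obs contains 0

# Assumed scoring rule: lowest error gets max_score, next gets one less, floor at 1.
score_models <- function(errors, max_score = 5) {
  pmax(max_score - rank(errors, ties.method = "min") + 1, 1)
}

# Score all models of one run by RMSE (swap in mape() to score by MAPE).
score_run <- function(obs, preds) {
  errs <- sapply(preds, function(p) rmse(obs, p))
  score_models(errs)
}

# Hypothetical predictions of three models on two different input data sets.
obs1 <- c(10, 12, 14, 16)
run1 <- list(arima  = c(10.5, 11.5, 14.5, 15.5),
             glmnet = c(11, 13, 15, 17),
             lm     = c(8, 10, 12, 14))
obs2 <- c(20, 22, 24, 26)
run2 <- list(arima  = c(21, 23, 25, 27),
             glmnet = c(20.5, 21.5, 24.5, 25.5),
             lm     = c(17, 19, 21, 23))

# One row of scores per run, then the mean score of each model across runs.
scores <- rbind(run1 = score_run(obs1, run1),
                run2 = score_run(obs2, run2))
mean_scores <- colMeans(scores)
```

Each new input data set adds one row to `scores`, so `colMeans(scores)` always gives the running mean score per model. A rank-based score ignores how large the gap between models is; if that matters, you could instead rescale the errors themselves to the 1 to 5 range.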


Again, without specific code examples and the concrete problems you face, it's difficult to help you. Try setting up something concrete, and if you run into specific problems you can always ask for help here. Providing dummy data and the code you used will greatly improve your chances of getting answers.

Numb3rs