
I am building a multivariate model for direct time series forecasting, where the goal is to make 4- and 8-step-ahead forecasts using random forest and SVR. The results look very similar to my 1-step-ahead forecasts, and I am wondering whether my code is sensible.

Here is an example of 4-step-ahead forecasts using random forest together with the predict function.
As far as I understand, the only difference between the 1-step-ahead and the 4-step-ahead direct forecast is which row is passed to the predict function: the fourth row after the end of the training window instead of the first. In the following example that means:

test <- mydata_2diff[(i+4), ]

instead of 

test <- mydata_2diff[(i+1), ]


My code looks as follows:

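library(randomForest)  # randomForest()
library(tictoc)        # tic() / toc() timing

train_end <- 112 # End of the initial training set
j <- 1 # Row counter for the prediction matrix
k_max <- 10  # Number of RF estimations per forecast origin
# Prediction matrix: one row per forecast origin, one column per RF run
pred_rf_4Q_dir <- matrix(0, (nrow(mydata_2diff) - train_end - 3), k_max)

{
  tic()
  for (i in train_end:(nrow(mydata_2diff) - 4)) {

    train <- mydata_2diff[1:i, ]     # Expanding training window
    test <- mydata_2diff[(i + 4), ]  # Row 4 steps after the end of the training window

    for (k in 1:k_max) {

      rf_RPI <- randomForest(RPI ~ RGDP + CPI + STI + LTI + UE + SER + SPI + ARH,
                             data = train, ntree = 500, importance = TRUE)

      # predict.all = TRUE returns a list with "aggregate" and "individual" tree predictions
      pred_rf <- predict(rf_RPI, newdata = test, predict.all = TRUE)

      pred_rf_4Q_dir[j, k] <- pred_rf[["aggregate"]]

    }
    j <- j + 1
  }
  toc()
}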
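
For comparison, my reading of the usual description of direct forecasting is that the predictors at time t are paired with the target at time t + h, so the model is trained on the h-step relationship itself rather than on same-period rows. Below is a minimal base-R sketch of that alignment only. The helper name `make_direct_frame` and the toy data are hypothetical and are not part of my code above, and I am not sure this is the right way to set it up.

```r
# Hypothetical helper (an assumption, not from my original code):
# pair predictors observed at time t with the target observed at time t + h.
make_direct_frame <- function(data, target, h) {
  n <- nrow(data)
  stopifnot(h >= 1, h < n)
  # Predictors from rows 1 .. n - h
  out <- data[1:(n - h), setdiff(names(data), target), drop = FALSE]
  # Target led by h steps: rows 1 + h .. n
  out[[target]] <- data[[target]][(1 + h):n]
  rownames(out) <- NULL
  out
}

# Toy data standing in for mydata_2diff (values are made up for illustration)
toy <- data.frame(RPI = 1:10, RGDP = 101:110)
direct4 <- make_direct_frame(toy, target = "RPI", h = 4)
```

With a frame like this, the training rows would only use target values up to the forecast origin, and the row passed to predict would hold the predictors at the origin rather than those from four periods later.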

Is this approach correct or not?
I am grateful for any feedback.
Matthew.M