How to find the validation error as a function of the number of epochs on a fine scale using h2o.grid in R

Question

I have a very noisy dataset with 2000 observations and 42 features (financial data), and I'm performing binary classification. I'm tuning the network with h2o.grid and providing a validation set. I've set epochs=1000, and training should stop when the misclassification error fails to improve by at least 1% over 5 scoring events (stopping_rounds=5, stopping_tolerance=0.01). I'd like to know which value of epochs minimises the validation error.

hyper_params = list(rho = c(0.9,0.95,0.99),
                 epsilon = 10^(c(-10, -8, -6, -4)),
                 hidden=list(c(64, 64)),
                 activation=c("Tanh", "Rectifier", "RectifierWithDropout"))
grid = h2o.grid("deeplearning", x = predictors, y = response,
                training_frame = tempTrain, validation_frame = tempValid,
                grid_id="h2oGrid10", hyper_params = hyper_params,
                adaptive_rate = TRUE, stopping_metric="misclassification",
                variable_importances = TRUE, epochs = 1000,
                stopping_rounds=5, stopping_tolerance=0.01, max_w2 = 20)

According to this question, the solution should be the following:

gridErr = h2o.getGrid("h2oGrid10", sort_by="err", decreasing=FALSE)
best_model = h2o.getModel(gridErr@model_ids[[1]])
solution = rev(best_model@model$scoring_history$epochs)[1]

This gives solution=1000. However, the scoring_history output is rather ambiguous:

cbind(best_model@model$scoring_history$epochs,
      best_model@model$scoring_history$validation_classification_error)
      [,1]      [,2]
 [1,]    0       NaN
 [2,]   10 0.4971347
 [3,]  160 0.4813754
 [4,]  320 0.4770774
 [5,]  490 0.4799427
 [6,]  660 0.4727794
 [7,]  840 0.4713467
 [8,] 1000 0.4727794
 [9,] 1000 0.4713467

It seems that the global minimum of the validation error occurs at both 840 epochs and 1000 epochs. I've tried different settings, and the optimal number of epochs always equals the number of epochs I set initially. I'm also quite surprised to see such a large optimal number of epochs given the conservative values stopping_rounds=5 and stopping_tolerance=0.01, so I wonder whether I'm missing something important. How do I retrieve the optimal number of epochs, ideally on a finer scale (i.e. 1, 2, ... rather than 10, 160, ...)?
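For the coarse history shown above, the row with the lowest validation error can be picked out with base R. This is only a minimal sketch: the two vectors are copied by hand from the output above (on a live model they would come from best_model@model$scoring_history), and the variable names are made up for illustration.

```r
# Values copied from the scoring history printed above
epochs  <- c(0, 10, 160, 320, 490, 660, 840, 1000, 1000)
val_err <- c(NaN, 0.4971347, 0.4813754, 0.4770774, 0.4799427,
             0.4727794, 0.4713467, 0.4727794, 0.4713467)

# which.min() discards NaN/NA and returns the index of the FIRST minimum
best_row   <- which.min(val_err)
best_epoch <- epochs[best_row]

# All epochs that tie for the minimum (NaN comparisons give NA, which which() ignores)
tied_epochs <- epochs[which(val_err == min(val_err, na.rm = TRUE))]

best_epoch
tied_epochs
```

Taking the first minimum rather than the last row avoids reading off the final entry, which here merely repeats the error already reached at 840 epochs.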

EDIT: The answer is in slide 8 here. What happens is that, at the last iteration, the final model is overwritten with the best model found during training, which is why the row for 1000 epochs appears twice. I've played for a while with the parameter train_samples_per_iteration, but I'm still not able to observe the evolution of the validation error with the number of epochs on a finer scale. Any ideas?
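For reference, here is the kind of single-model run I would expect to score once per epoch. It is an untested sketch, not a confirmed solution: it assumes tempTrain, tempValid, predictors and response exist as above, epochs = 100 is just an illustrative value, and whether scoring really happens every epoch may depend on the H2O version and cluster setup.

```r
library(h2o)
h2o.init()  # connects to a running cluster, or starts a local one

# Assumption: tempTrain, tempValid, predictors, response are defined as in the question
m <- h2o.deeplearning(
  x = predictors, y = response,
  training_frame = tempTrain, validation_frame = tempValid,
  hidden = c(64, 64),
  epochs = 100,                     # illustrative value only
  train_samples_per_iteration = 0,  # special value 0: one epoch per iteration
  score_interval = 0,               # no minimum time (seconds) between scorings
  score_duty_cycle = 1,             # allow all the time needed for scoring
  score_validation_samples = 0,     # 0 = score on the whole validation frame
  stopping_rounds = 0,              # disable early stopping to see the full curve
  overwrite_with_best_model = FALSE # keep the last model, so no duplicated final row
)

# Validation error per scoring event
sh <- m@model$scoring_history
sh[, c("epochs", "validation_classification_error")]
```

Scoring this often slows training down noticeably, so it is probably better suited to one diagnostic run on the best grid configuration than to the whole grid.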

By the way, a grid comparing `activation` of "Rectifier" and "RectifierWithDropout" is not doing anything - they are identical when not using `hidden_dropout_ratios` (and if you do use `hidden_dropout_ratios` in your grid, then you *must* use "RectifierWithDropout"). — Darren Cook, Aug 29 '16 at 16:10
Good point. I've been trying to tune `hidden_dropout_ratios` for the past few hours, but I constantly get this error: `ERRR on field: _hidden_dropout_ratios: Must have 2 hidden layer dropout ratios.` For example, `hyper_params = list(activation=c("RectifierWithDropout"), input_dropout_ratio = c(0, 0.15, 0.3))` with the default parameters in `h2o.grid` gives me this error. Would you mind explaining how to set the values for the hidden dropout ratios properly? Thank you very much. — Elrond, Aug 29 '16 at 16:54
As you have two hidden layers, `hidden_dropout_ratios=c(0.15,0.3)`. (`input_dropout_ratio` is a single number, that decides the dropout between the input layer and your first hidden layer.) — Darren Cook, Aug 29 '16 at 17:10
BTW, do you either want to self-answer your original question, OR change your subject to be your new question, e.g. something like "how to force h2o to score after every epoch?" — Darren Cook, Aug 29 '16 at 17:12
I've posted a question related to hidden_dropout_ratios [here](http://stackoverflow.com/questions/39212635/how-to-tune-hidden-dropout-ratios-in-h2o-grid-in-r) and I would really appreciate your help! I'll edit this question as you suggest later on today. — Elrond, Aug 29 '16 at 18:19
@abvaekvni Just a reminder to edit this question's title to match your edit at the end; alternatively remove that edit and instead add a self-answer. — Darren Cook, Sep 04 '16 at 09:15
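The dropout advice in the comments above could be written as a grid specification roughly like this. It is a sketch under assumptions: the second `hidden_dropout_ratios` candidate, `c(0.5, 0.5)`, is only an illustrative value, and `hidden_layers` is a helper name introduced here. The idea is that each candidate is a vector with one ratio per hidden layer, wrapped in a `list()` in the same way as `hidden` in the question.

```r
hidden_layers <- c(64, 64)  # two hidden layers, as in the question

hyper_params <- list(
  activation = c("RectifierWithDropout"),
  hidden = list(hidden_layers),
  input_dropout_ratio = c(0, 0.15, 0.3),    # a single number per candidate
  hidden_dropout_ratios = list(c(0.15, 0.3),  # from the comment above
                               c(0.5, 0.5))   # illustrative second candidate
)

# Sanity check: every candidate has one ratio per hidden layer
stopifnot(all(lengths(hyper_params$hidden_dropout_ratios) == length(hidden_layers)))
```

The resulting list would then be passed as `hyper_params` to `h2o.grid` in the same way as in the question.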
