At first I thought this was a random failure, but the same error occurs every time I re-run the script.
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix, :
Unexpected CURL error: Recv failure: Connection reset by peer
I'm running a grid search on a medium-size dataset (roughly 40000 x 30) with a Gradient Boosting Machine model. The largest number of trees (ntrees) in the grid is 1000. The error usually occurs after the script has been running for a couple of hours. I set max_mem_size to 30 GB.
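A minimal sketch of how such a memory limit is typically passed when starting the cluster from R, assuming h2o.init is used. The nthreads value is illustrative and is not taken from the script below.

```r
library(h2o)

# Start (or connect to) a local H2O cluster with a 30 GB JVM heap.
# max_mem_size takes a string such as "30g"; nthreads = -1 (use all cores)
# is an assumption for illustration only.
h2o.init(max_mem_size = "30g", nthreads = -1)
```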
for (k in 1:nrow(par.grid)) {
  hg = h2o.gbm(training_frame = Xtr.hf,
               validation_frame = Xt.hf,
               distribution = "huber",
               huber_alpha = HuberAlpha,
               x = 2:ncol(Xtr.hf),
               y = 1,
               ntrees = par.grid[k, "ntree"],
               max_depth = depth,
               learn_rate = par.grid[k, "shrink"],
               min_rows = par.grid[k, "min_leaf"],
               sample_rate = samp_rate,
               col_sample_rate = c_samp_rate,
               nfolds = 5,
               model_id = p(iname, "_gbm_CV")
  )
  cv_result[k, 1] = h2o.mse(hg, train = TRUE)
  cv_result[k, 2] = h2o.mse(hg, valid = TRUE)
}