According to the h2o documentation, I can set keep_cross_validation_predictions = T to get the cross validation predictions from my automl model.

But I cannot get it to work.

Using this example from the documentation

library(h2o)

h2o.init()

# Import a sample binary outcome train/test set into H2O
train <- h2o.importFile("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
test <- h2o.importFile("https://s3.amazonaws.com/erin-data/higgs/higgs_test_5k.csv")

# Identify predictors and response
y <- "response"
x <- setdiff(names(train), y)

# For binary classification, response should be a factor
train[,y] <- as.factor(train[,y])
test[,y] <- as.factor(test[,y])

# Run AutoML for 20 base models (limited to 1 hour max runtime by default)
aml <- h2o.automl(x = x, y = y,
                  training_frame = train,
                  max_models = 20,
                  keep_cross_validation_predictions = TRUE,
                  seed = 1)

After running the model, I tried

h2o.cross_validation_predictions(aml)
h2o.cross_validation_predictions(aml@leader)

h2o.cross_validation_holdout_predictions(aml)
h2o.cross_validation_holdout_predictions(aml@leader)

but none of these calls work.

Edit: I am using the latest stable release, 3.24.0.2.

spore234
  • Thank you for your question @spore234. Could you please specify which version of `h2o` you were using? – Deil May 17 '19 at 12:15

1 Answer

@spore234 My guess is that your leader is a Stacked Ensemble model, and this type of model is not supposed to have any cross-validation predictions.

We should probably provide a meaningful warning for this case.

Let me also point out that the following line:

h2o.cross_validation_predictions(aml)

will throw a meaningful error, because the user is supposed to pass an H2OModel object, but aml is an instance of the H2OAutoML class.
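If the leader is indeed a Stacked Ensemble, one possible workaround is to ask a base model from the leaderboard for its holdout predictions instead. The following is only a sketch under a few assumptions: ensemble model ids start with "StackedEnsemble" (AutoML's usual naming), the leaderboard contains at least one base model, and the run used keep_cross_validation_predictions = TRUE as in the question. The helper names first_base_model_id and cv_holdout_for_best_base_model are hypothetical, not part of h2o.

```r
# Pick the first leaderboard entry that is not a Stacked Ensemble.
# Assumption: ensemble ids start with "StackedEnsemble".
first_base_model_id <- function(model_ids) {
  base_ids <- grep("^StackedEnsemble", model_ids, value = TRUE, invert = TRUE)
  if (length(base_ids) == 0) {
    stop("No non-ensemble model found in the leaderboard")
  }
  base_ids[1]
}

# Hypothetical helper: fetch the combined holdout predictions of the
# best-ranked base model of an H2OAutoML object.
cv_holdout_for_best_base_model <- function(aml) {
  # The leaderboard is an H2OFrame; pull the model_id column into R.
  model_ids <- as.character(as.data.frame(aml@leaderboard$model_id)[, 1])
  # h2o.getModel() returns an H2OModel, which is what the
  # cross-validation accessors expect.
  base_model <- h2o::h2o.getModel(first_base_model_id(model_ids))
  h2o::h2o.cross_validation_holdout_predictions(base_model)
}
```

With the aml object from the question, `cv_preds <- cv_holdout_for_best_base_model(aml)` should return an H2OFrame with one row per training row. The leaderboard is sorted by its ranking metric, so the first non-ensemble id is the best-ranked base model.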

Deil
    I think I managed to get the CV predictions like this: `model_ids <- as.data.frame(m@leaderboard$model_id)[,1]; se <- h2o.getModel(grep(model_ids[1], model_ids, value = TRUE)[1]); metalearner <- h2o.getModel(se@model$metalearner$name)`, then I calculated the AUC manually – spore234 May 20 '19 at 08:58