I'm trying to run h2o.deeplearning twice, using the checkpoint parameter to continue training on a second training set (same parameters except for the number of epochs). I'm getting the following error:

Error: The columns of the training data must be the same as for the checkpointed model

even though both sets have the same columns. The relevant code is below:

model <- h2o.deeplearning(x = 2:785, y = 1, training_frame = train1, 
                      activation = "RectifierWithDropout", 
                      hidden = c(1024,1024,2048),
                      epochs = 10, 
                      l1 = 1e-5, 
                      input_dropout_ratio = 0.2,
                      train_samples_per_iteration = -1, 
                      classification_stop = -1)

model2 <- h2o.deeplearning(x = 2:785, y = 1, training_frame = train2, 
                      checkpoint = model@model_id,
                      activation = "RectifierWithDropout", 
                      hidden = c(1024,1024,2048),
                      epochs = 1000, 
                      l1 = 1e-5, 
                      input_dropout_ratio = 0.2,
                      train_samples_per_iteration = -1, 
                      classification_stop = -1)


> all(colnames(train1)==colnames(train2))
[1] TRUE

> dim(train1)
[1] 54447   785
> dim(train2)
[1] 5553  785

Thanks, Eli.

Amir
eli

2 Answers

This has been fixed on the master branch of H2O. The issue was that train1 and train2 each contained a different set of constant (all-zero) columns, so a different set of columns was automatically dropped from each frame. As a result, the algorithm concluded that the checkpointed model and the follow-up training run were using different sets of predictors.
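
The mismatch described above can be reproduced without H2O. The following is a minimal base R sketch, assuming two small made-up data frames (train1_toy and train2_toy are hypothetical stand-ins, not the asker's data) and a hypothetical helper, constant_cols, that lists single-valued columns. It only illustrates why identical column names can still lead to different predictor sets. It is not H2O's internal check.

```r
# Toy stand-ins for train1/train2 (hypothetical data, not the asker's frames).
train1_toy <- data.frame(label = c(0, 1, 0, 1),
                         px1 = c(0, 0, 0, 0),   # constant in train1_toy only
                         px2 = c(3, 0, 7, 1),
                         px3 = c(5, 2, 0, 9))
train2_toy <- data.frame(label = c(1, 0, 1, 0),
                         px1 = c(4, 0, 2, 0),
                         px2 = c(1, 6, 0, 2),
                         px3 = c(0, 0, 0, 0))   # constant in train2_toy only

# Hypothetical helper: names of columns that hold a single distinct value.
constant_cols <- function(df) {
  is_const <- vapply(df, function(col) length(unique(col)) == 1L, logical(1))
  names(df)[is_const]
}

const1 <- constant_cols(train1_toy)
const2 <- constant_cols(train2_toy)

# The column names match, as in the question ...
same_names <- all(colnames(train1_toy) == colnames(train2_toy))

# ... but different columns would be dropped from each frame.
only_const_in_1 <- setdiff(const1, const2)
only_const_in_2 <- setdiff(const2, const1)

print(same_names)
print(only_const_in_1)
print(only_const_in_2)
```

If either setdiff result is non-empty, the two frames would lose different columns when constants are ignored. For real H2OFrame objects, converting them with as.data.frame() before calling the helper should work, memory permitting.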

See the JIRA ticket for more information on the fix. You can get the update by installing H2O from source or you can wait until the next nightly release, available here.

Erin LeDell

This might be an overly strict check that also requires the same columns to be non-constant in both frames. Try disabling ignore_const_cols (pass ignore_const_cols = FALSE to both h2o.deeplearning calls) to get around the issue.

I filed a JIRA here.

Erin LeDell