After training a model with H2O's AutoML tool, I can see the variable importance with saved_model.varimp_plot(). I am curious about the feature engineering that H2O claims to do.

I'm trying the simple code samples from the H2O documentation.

import h2o
h2o.init()

train_data = h2o.import_file("../full_data.csv")
test_data = h2o.import_file("../201810_pca.csv")

from h2o.automl import H2OAutoML
y = "Label"
x = ['feature0','feature1','feature2','feature3','feature4','feature5','feature6','feature7','feature8','feature9','feature10',
'feature11','feature12','feature13','feature14','feature15','feature16','feature17','feature18','feature19','feature20',
'feature21','feature22','feature23','Amount','DateTime']


aml = H2OAutoML(max_models = 100, max_runtime_secs=100000, seed = 1)
aml.train(x = x, y = y, training_frame = train_data)

lb = aml.leaderboard
lb.head()
lb.head(rows=lb.nrows) # Entire leaderboard

preds = aml.predict(test_data)
h2o.save_model(aml.leader, path = "./Saved_Models")


saved_model = h2o.load_model("./Saved_Models/XGBoost_2_AutoML_20191018_174201")

training_frame = your_model.actual_params['training_frame'] #The part gives error
print(training_frame)

How do I see which features are being used in the trained model? I'd like to see if H2O is extracting and adding new features or not.

I've used my_training_frame = your_model.actual_params['training_frame'], as suggested in another question, but it gives an error: "TypeError: 'property' object has no attribute '__getitem__'".

Ege
  • @edge can you please post the full code snippet? Did you provide a validation frame when you called `.train()`, by any chance? Please also note that DAI and H2O-3 AutoML are separate products. AutoML does not do feature engineering for you; that is something in the DAI product. Thanks! – Lauren Oct 24 '19 at 18:09
  • @Lauren I didn't know that, thanks for the information. Do you know of any documentation on this topic where I can learn more about the feature engineering? – Ege Oct 25 '19 at 08:56

1 Answer

Quick note: H2O.ai has a few products. The open source platform is called H2O-3, and it contains the AutoML algorithm. AutoML does not currently do feature engineering for you. If you want automatic feature engineering, you might be thinking of H2O's product Driverless AI.

As for the error you are seeing, this is a bug and you can track the fix here.

Depending on what you pass to the .train() method, you may or may not hit this bug.
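
In the meantime, one way to check which columns a trained model uses, without touching actual_params, is to compare the names reported in the variable importance table against the predictor list x. The following is only a rough sketch: compare_columns is a hypothetical helper (not part of H2O), and the commented-out lines assume a running cluster and that saved_model.varimp() returns rows whose first element is the variable name.

```python
def compare_columns(input_columns, model_columns):
    """Compare the predictors passed to training with the names a model reports.

    Hypothetical helper, not part of the H2O API.
    Returns three lists: (used, extra, unused).
    """
    inputs = set(input_columns)
    reported = set(model_columns)
    # Names the model reports that were also passed in as predictors
    used = [c for c in model_columns if c in inputs]
    # Names the model reports that were NOT in the predictor list
    extra = [c for c in model_columns if c not in inputs]
    # Predictors that the model does not report at all
    unused = [c for c in input_columns if c not in reported]
    return used, extra, unused


# With a live H2O cluster, the model-side names could be collected like this
# (assumption: each varimp row starts with the variable name):
#   model_columns = [row[0] for row in saved_model.varimp()]
# Plain lists stand in for them here so the sketch runs on its own.
input_columns = ["feature0", "Amount", "DateTime"]
model_columns = ["Amount", "feature0"]

used, extra, unused = compare_columns(input_columns, model_columns)
print("used:", used)
print("extra:", extra)
print("unused:", unused)
```

If extra comes back empty, the model is reporting only columns you supplied. Names that do show up in extra are not necessarily engineered features: some algorithms expand categorical columns into per-level names, which is encoding rather than feature engineering.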

Lauren