I am using an H2ORandomForestEstimator. What is the default target metric that H2O models use for their predict() method? https://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/modeling.html#h2o.automl.H2OAutoML.predict

Is there a way to set this (e.g. to use one of the other metric-maximizing thresholds that can be seen in the results of the get_params() method)?

Currently I am doing something like this:

df_preds = mymodel.predict(df)
activation_threshold = mymodel.find_threshold_by_max_metric('f1', valid=True)
# adjust the predicted label for the desired metric's maximizing threshold
df_preds['predict'] = df_preds['my_positive_class'].apply(lambda probability: 'my_positive_class' if probability >= activation_threshold else 'my_negative_class')

lampShadesDrifter

1 Answer

There's no concept of a "target metric" when generating predictions, since you're just predicting the response for a row of data (there's no scoring here).

Edit: Thanks for clarifying your question. If you want to change how the threshold is generated, then what you're doing above is a good solution. If you have a suggestion for a utility function that would make this more straightforward, please file a JIRA ticket with your idea (it could definitely be improved).
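
For illustration, here is a minimal sketch of the relabeling step from the question, written against plain Python lists so it runs without an H2O cluster. The function name `relabel_by_threshold` and the example values are hypothetical. The `>=` comparison simply mirrors the question's code and is an assumption, not a statement of how H2O itself breaks ties at the threshold.

```python
from typing import Iterable, List


def relabel_by_threshold(
    probabilities: Iterable[float],
    threshold: float,
    positive_label: str = "my_positive_class",
    negative_label: str = "my_negative_class",
) -> List[str]:
    """Assign a class label to each positive-class probability."""
    # Assumption: `>=` mirrors the question's code; whether H2O uses
    # `>` or `>=` internally is not confirmed here.
    return [
        positive_label if p >= threshold else negative_label
        for p in probabilities
    ]


# Hypothetical stand-ins for df_preds['my_positive_class'] and for the value
# returned by mymodel.find_threshold_by_max_metric('f1', valid=True).
example_probabilities = [0.10, 0.42, 0.42, 0.90]
example_threshold = 0.42

labels = relabel_by_threshold(example_probabilities, example_threshold)
print(labels)
```

With a real model, the probabilities would come from the positive-class column of the prediction frame once it has been pulled into local memory, and the threshold from `find_threshold_by_max_metric` as in the question.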

Erin LeDell
  • I see, so you're saying that I'd have to do something like in the updated question, correct (since I see from the docs that models use whatever `stopping_metric` is set to https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/algo-params/stopping_metric.html)? BTW, should I be using `>` or `>=` (couldn't tell from the docs)? – lampShadesDrifter Jul 16 '21 at 03:34