Questions tagged [h2o]

Use this tag for questions about the H2O in-memory machine learning platform. Where relevant, add language tags like [r], [python], [scala], or [java].

Best Practices

Always post a Minimal, Complete and Verifiable Example (MCVE) and provide the H2O version number and client type (Python, R, Flow, etc).

If your question is not code related, do not post it to Stack Overflow (per Stack Overflow guidelines). If your question is algorithm related, post it to Cross Validated on Stack Exchange using the "h2o" tag. All other questions can be posted to the h2ostream Google group (please do not double-post).

1875 questions
5 votes, 1 answer

R H2O with 32-bit java

I am trying to use the H2O package in R with 32-bit Java. Unfortunately, my company's IT department does not allow me to install the 64-bit version of Java. How can I make H2O work with 32-bit Java, if that is possible at all? OS: Windows 7
dsauce
4 votes, 2 answers

Is R-Package h2o affected by log4j-vulnerability? (and how to solve)

A vulnerability in log4j became public. Among other packages, I am using the R shiny and h2o packages. I have already found out that shiny is not affected by the vulnerability, since it uses log4js (see…
Jonas
4 votes, 1 answer

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix, : Unexpected CURL error: getaddrinfo() thread failed to start

I am experiencing a persistent error while trying to use H2O's h2o.automl function. I am trying to repeatedly run this model. It seems to completely fail after 5 or 10 runs. Error in .h2o.__checkConnectionHealth() : H2O connection has been…
jjhold
4 votes, 3 answers

Different results on anomaly detection between pycaret and H2O

I'm working on detecting anomalies in the following data: It comes from a processed signal of a hydraulic system, and I know that the dots in the red boxes are anomalies that occur when the system fails. I'm using the first 3k records to train…
Luis Ramon Ramirez Rodriguez
4 votes, 1 answer

H2OTypeError: 'training_frame' must be a valid H2OFrame

"After running the following Code…" gbm = h2o.get_model(sorted_final_grid.sorted_metric_table()['model_ids'][0]) params = gbm.params new_params = {"nfolds":5, "model_id":None} for key in new_params.keys(): params[key]['actual'] =…
Jeff King
4 votes, 0 answers

h2o SHAP predict contributions with MOJO

As per the release announcement (https://www.h2o.ai/blog/h2o-release-3-26-yau/), SHAP values can be retrieved from a MOJO as well. However, is there no function such as h2o.mojo_predict_contributions or an equivalent? Once the model is imported:…
Learner_seeker
4 votes, 0 answers

H2O mojo_predict_df input columns warning

Getting a strange message from H2O ( h2o_3.26.0.2 ) when predicting using a MOJO file: Detected 14 unused columns in the input data set: {X8,X9,X10,X12,X1,X11,X2,X14,X3,X13,X4,X5,X6,X7} I know that it is not a missing variable issue, as then H2O…
Hanjo Odendaal
4 votes, 0 answers

Is there a function to get the learning curves for training and validation sets in h2o used with R?

I am using h2o and R for a binary classification problem. I was wondering if there is any way to create a learning curve in h2o? I coded some splits myself and I am plotting the curve alright, but I'd like to know if there is a quick recipe…
maop
4 votes, 2 answers

Is there a way to use decision trees with categorical variables without one-hot encoding?

I have a dataset with 200+ categorical variables (non-ordinal) and just a few continuous variables. I have tried to use one-hot encoding but that increases the dimensions by a lot and results in a poor score. It seems like the regular scikit-learn…
4 votes, 1 answer

Use pROC package with h2o

I'm doing a binary classification with a GBM using the h2o package. I want to assess the predictive power of a certain variable, and if I'm correct I can do so by comparing the AUC of a model with the specific variable and a model without the…
Zuaro
4 votes, 0 answers

Using the GPU backend in h2o.xgboost in a rocker based Docker container

I've been trying to get GPU support to work for xgboost via h2o in a rocker docker container with little success. Progress so far: GitHub, Docker Hub I have installed CUDA + nvidia-docker on the host machine and CUDA (9.0 - 9.2) in the container.…
Sam Abbott
4 votes, 2 answers

H2OResponseError in Grid Search Get Grid Sorting

When I run: data_h = h2o.H2OFrame(data) ### Edit: added asfactor() below to change integer target array. data_h["BPA"] = data_h["BPA"].asfactor() train, valid = data_h.split_frame(ratios=[.7], seed = 1234) features = ["bq_packaging_consumepkg",…
Roy Z
4 votes, 1 answer

Python - h2o: How to specify column types correctly?

I am trying to import a pandas dataframe into an h2o frame and specify the column types that I want. The problem is that I am eventually trying to do an .rbind() with two datasets, but sometimes, depending on the values of certain columns, h2o will force them…
Nate Thompson
4 votes, 2 answers

H2o Python: Combining XGB Holdout Predictions

When using: "keep_cross_validation_predictions": True "keep_cross_validation_fold_assignment": True in H2O's XGBoost Estimator, I am not able to map these cross validated probabilities back to the original dataset. There is one documentation…
Abhijeet Arora
4 votes, 2 answers

How to convert enum datatype into Numeric in H2O

I have imported my dataset into H2O Flow. One column is of categorical type, and I want to convert it to a numerical data type. If I were using pandas for this task, I would do it like this: df['category_column'] =…
Mohamed Thasin ah