2

Looking at an H2O MOJO model, is there a way to figure out the data types of the data it was trained on?

kivk02

1 Answer

2

You can get a list of all predictors and their categorical levels from both POJO and MOJO models. When you request the categorical levels (domain values) for a predictor, a "null" result means the predictor is treated as a number; otherwise it is an enum.

You can use Java code from the following article:

https://aichamp.wordpress.com/2017/08/30/getting-all-categorical-for-predictors-in-h2o-pojo-and-mojo-models/

FYI: There is still an open bug on this issue with POJO, so use MOJO instead.
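
The null check described above could be sketched roughly as follows. This is only an illustration, not the code from the linked article: it assumes the predictor names and their domain values have already been read from the model (for example via `getNames()` and `getDomainValues()` on the loaded model object), and it uses made-up arrays in their place so it runs without the H2O jar. `DomainKinds` and `classify` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of H2O: labels each predictor by its domain.
class DomainKinds {

    // names[i] is a predictor name; domains[i] is its array of categorical
    // levels, or null when the column has no levels.
    static Map<String, String> classify(String[] names, String[][] domains) {
        Map<String, String> kinds = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            String[] domain = i < domains.length ? domains[i] : null;
            // Per the answer: null domain -> treated as a number, else enum.
            kinds.put(names[i], domain == null ? "numeric" : "enum");
        }
        return kinds;
    }

    public static void main(String[] args) {
        // Made-up stand-ins for what a real model would return (assumption).
        String[] names = {"age", "color"};
        String[][] domains = {null, {"red", "green", "blue"}};

        classify(names, domains)
            .forEach((name, kind) -> System.out.println(name + ": " + kind));
    }
}
```

Note that this only separates columns with a domain from columns without one; it does not by itself tell apart the other types discussed in the comments below.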

AvkashChauhan
  • Thanks Avkash. Will the mojo.getDomainValues() function also work if the predictor is of some other data type, e.g. time, string, or UUID? I believe these are actually the data types of the hex dataframe (the H2O internal dataframe generated after data parsing). The second part of my question is: what are the different data types that columns in the hex dataframe can take (time, string, UUID, ...)? I believe numeric = real? Is this documented anywhere in the H2O documentation? – kivk02 Sep 19 '17 at 18:19
  • Because the model is built by H2O, getDomainValues understands all the supported data types used while building the model. H2O supports integer and real numbers, enums, strings, UUID, and time data types. – AvkashChauhan Sep 19 '17 at 18:47
  • @kivk02 Glad it worked for you. It's always good to accept the answer. – AvkashChauhan Sep 27 '17 at 01:46
  • Does the wrapper have a function to directly print out the data types? Right now I am able to print the domain values, which tells me whether a column is numeric (it prints null), but in other cases it prints the actual domain values for the column. So the question is: is there a function to directly print out the data types in the other cases too, such as string, enum, time, UUID, etc.? – kivk02 Oct 17 '17 at 20:52