I am trying to (substantially) accelerate some R code by moving to R+h2o.ai.
I am grouping by a single factor variable, but I get an error when I try to compute windowed quantiles, skewness, or kurtosis.
Is there a list of summary functions in h2o that are incompatible with the split-apply-combine approach? Does h2o.group_by only support SQL-like aggregates such as sum, count, or stdev?
This code fails:
for (i in col_idx_list) {
  proc_cols_list <- names(df.hex)[i]
  group_cols_list <- c("group_variable_factor")
  h2o.quantile(x = df.hex[, proc_cols_list])
  temp <- h2o.group_by(data = df.hex,
                       by = group_cols_list,
                       mean(proc_cols_list),
                       var(proc_cols_list),
                       skewness(proc_cols_list),
                       gb.control = list(na.methods = "ignore"))
  if (i == first_index) {
    df_summs <- temp
  } else {
    df_summs <- h2o.cbind(df_summs, temp[, 2:ncol(temp)])
  }
}
This code runs fine:
for (i in col_idx_list) {
  proc_cols_list <- names(df.hex)[i]
  group_cols_list <- c("group_variable_factor")
  h2o.quantile(x = df.hex[, proc_cols_list])
  temp <- h2o.group_by(data = df.hex,
                       by = group_cols_list,
                       mean(proc_cols_list),
                       var(proc_cols_list),
                       gb.control = list(na.methods = "ignore"))
  if (i == first_index) {
    df_summs <- temp
  } else {
    df_summs <- h2o.cbind(df_summs, temp[, 2:ncol(temp)])
  }
}
Error text (truncated for brevity):
ERROR: Unexpected HTTP Status code: 400 Bad Request (url = http://localhost:54321/99/Rapids)
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
ERROR MESSAGE:
No enum constant water.rapids.ast.prims.mungers.AstGroup.FCN.skewness
ERROR: Unexpected HTTP Status code: 404 Not Found (url = http://localhost:54321/3/Frames/RTMP_sid_8712_17?row_count=10)
ERROR MESSAGE:
Object 'RTMP_sid_8712_17' not found for argument: key
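For reference, here is a minimal base R sketch of the per-group result I am after, without h2o. The toy data frame, the column name `value`, and the `skewness_pop` helper are made up for illustration and are not from my real data. The skewness formula is the plain population-moment version, which may differ from what a given package computes.

```r
# Toy data; column names are hypothetical stand-ins for the real frame.
df <- data.frame(
  group_variable_factor = factor(c("a", "a", "a", "b", "b", "b")),
  value = c(1, 2, 6, 1, 2, 3)
)

# Population skewness: third central moment divided by the
# second central moment raised to the power 1.5 (NAs dropped first).
skewness_pop <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Split-apply-combine in base R: one output row per group.
groups <- split(df$value, df$group_variable_factor)
df_summs <- data.frame(
  group_variable_factor = names(groups),
  mean = sapply(groups, mean, na.rm = TRUE),
  var = sapply(groups, var, na.rm = TRUE),
  skewness = sapply(groups, skewness_pop),
  row.names = NULL
)
print(df_summs)
```

What I would like is the same one-row-per-group table computed on the h2o side, so the data does not have to be pulled back into R.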