Questions tagged [gbm]

The R package gbm, which implements Generalized Boosted Regression Models.

This package implements extensions to Freund and Schapire’s AdaBoost algorithm and Friedman’s gradient boosting machine.

It includes regression methods for least squares, absolute loss, t-distribution loss, quantile regression, logistic, multinomial logistic, Poisson, Cox proportional hazards partial likelihood, AdaBoost exponential loss, Huberized hinge loss, and Learning to Rank measures (LambdaMART).

Who's using gbm?

The gbm package is used in examples in Software for Data Analysis by John Chambers.

gbm is also used in Elements of Statistical Learning by Hastie, Tibshirani and Friedman.

Richard A. Berk also uses gbm in his book, Statistical Learning from a Regression Perspective.

Source: gradientboostedmodels

330 questions
4
votes
1 answer

GBM error in classification bernoulli distribution

When running the gbm function for a classification problem, I get the following error: Error in res[flag, ] <- predictions : replacement has length zero. I would like to know why I get this error and how to solve it. My data is about 77 numeric…
Heather Clark
  • 175
  • 10
4
votes
2 answers

H2O R Variable Importance Truncated List

I have a data set with over 400 features that I am estimating with GBM using H2O atop R. When I use the variable importance function (h2o.varimp) it only shows me the head and tail of the full ranked variable list. Is there a way to have the entire…
dj_ski_mask
  • 55
  • 1
  • 5
4
votes
2 answers

h2o model not fit in driver node's memory error

I ran a GBM model through R code in H2O and got the error below. The same code was running fine a couple of weeks ago. I am wondering whether this is an H2O-side error or a configuration issue on the user's system. water.exceptions.H2OModelBuilderIllegalArgumentException: Illegal…
Eric_IL
  • 171
  • 2
  • 10
4
votes
1 answer

How can I offset exposures in a gbm model in R?

I am trying to fit a gradient boosting machine (GBM) to insurance claims. The observations have unequal exposure so I am trying to use an offset equal to the log of exposures. I tried two different ways: Put an offset term in the formula. This…
4
votes
1 answer

(R) Plot dendrograms BRT models from gbm.step

(previously posted here, to the wrong sub, with not enough info, which was closed, I edited, the edits seem to have been deleted, & the post consigned to purgatory, so apologies for re-posting, I don't know whether the previous post can/should be…
dez93_2000
  • 1,730
  • 2
  • 23
  • 34
4
votes
2 answers

GBM model generating NA results

I'm trying to run a simple GBM classification model to benchmark performance against random forests and SVMs, but I'm having trouble getting the model to score correctly. It's not throwing an error, but the predictions are all NaN. I'm using the…
TomR
  • 546
  • 8
  • 19
4
votes
1 answer

What does `train.error` actually represent for gbm?

Consider the short R script below. It seems that boost.hitters$train.error does not match up with either the raw residuals or the squared errors of the training set. I could not find documentation on train.error at all, so I am wondering if anyone…
merlin2011
  • 71,677
  • 44
  • 195
  • 329
3
votes
1 answer

Something is wrong; all the RMSE metric values are missing; using caret train function

I am trying to fit a gbm model using the caret package. I know other people have had the same problem, but all the solutions provided in the comments of those questions have not worked for my error. Here is my reproducible…
GiorgiaA
  • 55
  • 4
3
votes
0 answers

Recipe vs Formula vs X/Y Interface reproducibility for gbm with caret

I have trained the same model on the iris data set to investigate the reproducibility of each method. It seems that there is a discrepancy between models when using all.equal() for the models trained with the recipes interface, but not with the…
JFG123
  • 577
  • 5
  • 13
3
votes
1 answer

Internal node predictions of xgboost model

Is it possible to calculate the internal node predictions of an xgboost model? The R package gbm provides a prediction for the internal nodes of each tree. The xgboost output, however, only shows predictions for the final leaves of the model. xgboost…
Zelazny7
  • 39,946
  • 18
  • 70
  • 84
3
votes
1 answer

Server Error Water.exceptions.H2OIllegalArgumentException While Implementing Grid Search using H2O

I am a newbie using H2O. I am trying to run H2OGridSearch with GBM to get my best hyper parameters. I am following the instructions given at H2O-AI Github repo. It worked well when I was trying Regression but now when I am trying classification it…
3
votes
1 answer

How do Gradient Boosted Trees calculate errors in classification?

I understand how gradient boosting works for regression when we build the next model on the residual error of the previous model - if we use for example linear regression then it will be the residual error as the target of the next model then sums…
3
votes
0 answers

Implausible variable importance for GBM survival: constant difference in importance

I have a question about a GBM survival analysis. I'm trying to quantify variable importances for my variables (n=453), in a data set of 3614 individuals. The resulting graph with variable importances looks suspiciously arranged. I have computed…
3
votes
1 answer

Classification Tree Diagram from H2O Mojo/Pojo

This question draws heavily from the solution to this question as a jumping off point. Given that I can use R to produce a mojo model object: library(h2o) h2o.init() airlinedf <-…
RealViaCauchy
  • 237
  • 1
  • 10
3
votes
1 answer

h2o error when run on a subset of the data but runs perfectly on the original data

The error that I am getting is this. The subset [~100k examples] of my data has exactly the same number of columns as the original dataset [400k examples]. But it runs perfectly on the original dataset and not on the subset. Traceback (most recent…
YNWA
  • 43
  • 7