I am trying to use xgb.plot.tree to inspect a model that I trained with xgb.train, but whenever I call xgb.plot.tree(model=xgb_model), RStudio crashes.

Is there a known issue with this function? My model looks fine, so I'm not sure what is happening. Here's my code, in case it helps.

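library(xgboost)

# test_data.xgbdm is an xgb.DMatrix built beforehand (construction not shown)
xgb_model <- xgb.train(params = list(objective = "binary:logistic", eta = 0.25, max.depth = 3, nthread = 3),
                       data = test_data.xgbdm,
                       nrounds = 2500,
                       early.stop.round = NULL)
xgb.plot.tree(model = xgb_model)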
smci
jgadoury
  • Can you make your example reproducible (start by loading libraries, include data, etc.; see http://stackoverflow.com/help/mcve)? Does RStudio crash when you use a built-in dataset? – Hack-R Sep 09 '16 at 00:11
  • You will have 2500 trees, each with a maximum depth of 3. Perhaps consider setting `n_first_tree`. This will allow you to see the first `n` trees. For example, `xgb.plot.tree(model = xgb_model, n_first_tree = 5)` will show the first 5 trees. Try this and let us know if it still crashes (does it actually crash or just stall because it's plotting?). It could perhaps be the large number of trees that need to be plotted. – jav Sep 09 '16 at 01:31
  • @jav Thanks, that did the trick. I didn't know that the function was attempting to plot all the trees; I assumed it plotted just the final one. If I plot the first one, is that considered the best tree? – jgadoury Sep 09 '16 at 14:34
  • @jgadoury In my opinion, it is generally difficult to say which tree is the best one. Remember that in a boosted tree model, each new tree is fit to the residuals left by the trees before it: the first tree is fit to the response, the second to the residuals of the first, and so on. You could add a watchlist on your training dataset (`watchlist = list(test_data.xgbdm)`) to see how the evaluation metric changes as the trees are fitted. What is your eventual aim here? – jav Sep 09 '16 at 15:01
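
The two suggestions from the comments (plot only a few trees, and watch the metric per round) can be sketched as follows. This is a minimal sketch under assumptions that are not in the original post: it uses the `agaricus.train` dataset bundled with xgboost instead of the asker's data, only 10 rounds instead of 2500, and the argument names of the xgboost 1.x R package (`trees`, `watchlist`, `max_depth`). The release discussed in the comments exposed `n_first_tree` instead of `trees`, and other releases may name these arguments differently, so check `?xgb.plot.tree` for the installed version.

```r
library(xgboost)

# Built-in example data shipped with xgboost (an assumption; not the asker's data)
data(agaricus.train, package = "xgboost")
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)

# Same objective, eta and depth as in the question, but only 10 rounds to keep it quick.
# The named watchlist prints the training metric after every boosting round.
bst <- xgb.train(params = list(objective = "binary:logistic", eta = 0.25,
                               max_depth = 3, nthread = 1),
                 data = dtrain,
                 nrounds = 10,
                 watchlist = list(train = dtrain))

# One row per node; the Tree column holds a 0-based tree index
tree_dt <- xgb.model.dt.tree(model = bst)
n_trees <- length(unique(tree_dt$Tree))

# Plot only the first tree instead of all of them.
# `trees` takes 0-based indices in xgboost 1.x (assumed version).
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  xgb.plot.tree(model = bst, trees = 0)
}
```

Counting the trees with `xgb.model.dt.tree` before plotting is a cheap way to see how much the plotting function would otherwise have to draw; with `nrounds = 2500` the unrestricted plot has thousands of nodes to lay out.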

0 Answers