Boosted regression trees - deviance values

Question

I am fitting a boosted regression tree (BRT) model with the gbm package in R, using the following formula:

height above ground ~ Age + season + habitat + timeofday

The height above ground is a continuous variable, and so is timeofday. Season and habitat are binary (two-level) variables.

I get a very high deviance and I don't know why... Can somebody help me with the parameters?

> M1 <- gbm.step(data=data, gbm.x = 2:5, gbm.y = 1,
+                family = "gaussian", tree.complexity = 4,
+                learning.rate = 0.01, bag.fraction = 0.50,
+                tolerance.method = "fixed",
+                tolerance = 0.01)


 GBM STEP - version 2.9 

Performing cross-validation optimisation of a boosted regression tree model 
for HAG and using a family of gaussian 
Using 15439 observations and 4 predictors 
creating 10 initial models of 50 trees 

 folds are unstratified 
total mean deviance =  55368.22 
tolerance is fixed at  0.01 
ntrees resid. dev. 
50    51050.65 
now adding trees... 
100   48935.65 
150   47805.14 
200   47193.43 
250   46841.71 
300   46631.33 
350   46498.56 
400   46418.58 
450   46371.7 
500   46336.54 
550   46317.53 
600   46309.25 
650   46300.57 
700   46296.82 
750   46297 
800   46299.11 
850   46297.7 
900   46298.34 
950   46292.32 
1000   46297.62 
1050   46295.78 
1100   46301.32 
1150   46306.59 
1200   46312.55 
1250   46314.67 
1300   46318.64 
1350   46321.38 
1400   46324.33 
1450   46322.9 
fitting final gbm model with a fixed number of 950 trees for HAG

mean total deviance = 55368.21 
mean residual deviance = 45913.34 

estimated cv deviance = 46292.32 ; se = 1366.501 

training data correlation = 0.413 
cv correlation =  0.406 ; se = 0.008 

elapsed time -  0.02 minutes

Hey, the deviance always depends on the scale of your response variable. Here the deviance is the mean of the squared residuals, i.e. mean(M1$residuals^2). — StupidWolf, Mar 02 '20 at 13:42
So if your response is on a lower scale... the deviance will be lower. What you should look at is how much the deviance has been reduced. In this case, it's from 55368 to 45913 which is not a lot — StupidWolf, Mar 02 '20 at 13:43
You can try to scale your response variable (with the scale function in R), but I doubt you can get more predictive power since you only have 4 predictors. — StupidWolf, Mar 02 '20 at 13:46
@StupidWolf Do you think a delta deviance of 9455 is acceptable to present? (Btw, I scaled it and now it is from 1 to 0.83). Do you want to write a complete answer? — JMarcelino, Mar 02 '20 at 14:16
Yes, I can write an answer. With scaling, you can see that it doesn't change much: the error goes down by a bit under 20% (from 1 to 0.83), but that might be very good for your data. — StupidWolf, Mar 02 '20 at 14:28
So it really depends on what you want to present, but the deviance value is not intuitive. What you can calculate is the mean absolute error, or the correlation between your prediction and observed... — StupidWolf, Mar 02 '20 at 14:30

score 2 · Accepted Answer · answered Mar 02 '20 at 14:49

With family = "gaussian", the deviance reported for a gbm is the mean squared error, so its value depends on the scale of your dependent variable.

For example:

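library(dismo)    # provides gbm.step
library(mlbench)  # provides the BostonHousing data
data(BostonHousing)
# Random split: 400 rows for training, the rest for testing
idx = sample(nrow(BostonHousing), 400)
TrnData = BostonHousing[idx,]
TestData = BostonHousing[-idx,]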

The dependent variable is the last column, "medv", so we run a gbm on the raw data:

gbm_0 = gbm.step(data=TrnData,gbm.x=1:13,gbm.y=14,family="gaussian")

mean total deviance = 84.02 
mean residual deviance = 7.871 

estimated cv deviance = 13.959 ; se = 1.909 

training data correlation = 0.952 
cv correlation =  0.916 ; se = 0.012

You can see that the mean residual deviance can also be calculated from your residuals (observed y minus predicted y):

mean(gbm_0$residuals^2)
[1] 7.871158
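
Because the raw deviance carries the units of the response, a scale-free summary can be easier to report. One option, sketched here on the assumption that you only have the printed gbm.step summaries, is the fraction of the total deviance that the model removes. The helper name deviance_explained is made up for illustration, and the numbers are copied from the outputs above.

```r
# Hypothetical helper (not part of gbm or dismo): fraction of the total
# deviance removed by the model, computed from the printed summary values.
deviance_explained <- function(total, residual) {
  1 - residual / total
}

# Boston housing run above: mean total deviance and mean residual deviance
boston_train <- deviance_explained(84.02, 7.871)
# Same run, using the cross-validated deviance instead of the training one
boston_cv <- deviance_explained(84.02, 13.959)
# HAG model from the question, training and cross-validated
hag_train <- deviance_explained(55368.21, 45913.34)
hag_cv <- deviance_explained(55368.21, 46292.32)

round(c(boston_train = boston_train, boston_cv = boston_cv,
        hag_train = hag_train, hag_cv = hag_cv), 3)
```

On these numbers the Boston model removes roughly 0.91 of the deviance on the training data and 0.83 under cross-validation, while the model in the question removes roughly 0.17 and 0.16.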

It is always good to evaluate on test data (TestData here), which the model has not been trained on. You can check how close the predictions are to the actual values using either the correlation or the MAE (mean absolute error):

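# Predict with 1000 trees; the number selected by gbm.step is stored in gbm_0$gbm.call$best.trees
pred = predict(gbm_0, TestData, n.trees = 1000)
# or pearson if you like
cor(pred, TestData$medv, method = "spearman")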
[1] 0.8652737
# MAE
mean(abs(TestData$medv-pred))
[1] 2.75325

Visualize it if you can. Here the correlation is good, and the MAE says your predictions are off by about 3 (in units of medv) on average.
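
One way to do that visual check is a scatter plot of predicted against observed values with a 1:1 line. The sketch below uses simulated vectors so that it runs on its own; obs and pred are placeholders, and with the objects from this answer you would use TestData$medv and pred instead.

```r
# Simulated stand-ins for observed values and model predictions (assumption:
# they exist only so the sketch runs without fitting a model).
set.seed(1)
obs <- runif(100, min = 5, max = 50)
pred <- obs + rnorm(100, sd = 3)

# Points close to the dashed 1:1 line are well predicted
plot(obs, pred, xlab = "Observed", ylab = "Predicted")
abline(a = 0, b = 1, lty = 2)

# The same two summaries used in the answer
mae <- mean(abs(obs - pred))
rho <- cor(pred, obs, method = "spearman")
```

A tight cloud around the line together with a small MAE tells the same story as the numbers, and it also shows whether the errors grow at one end of the range.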

Now if you change the scale of your dependent variable, the deviance changes, but your interpretation from the correlation or MAE stays the same (the MAE simply scales with the response):

TrnData$medv = TrnData$medv*2
TestData$medv = TestData$medv*2
gbm_2 = gbm.step(data=TrnData,gbm.x=1:13,gbm.y=14,family="gaussian")

mean total deviance = 336.081 
mean residual deviance = 30.983 

estimated cv deviance = 57.52 ; se = 10.254 

training data correlation = 0.953 
cv correlation =  0.911 ; se = 0.019 

elapsed time -  0.2 minutes

pred = predict(gbm_2,TestData,1000)    
cor(pred,TestData$medv,method="spearman")
[1] 0.8676821
mean(abs(TestData$medv-pred))
[1] 5.47673
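
The arithmetic behind this can be checked without fitting a gbm. The sketch below substitutes an ordinary linear model (lm) on simulated data purely as a stand-in, since its fit is deterministic: multiplying the response by 2 multiplies the mean squared error by 4 and the MAE by 2, while the correlation does not change. The function name, the variable names and the simulated data are assumptions for illustration.

```r
set.seed(42)
x <- rnorm(200)
y <- 3 * x + rnorm(200)

# Mean squared error, mean absolute error and correlation for a linear fit
# (lm is a stand-in here, not the boosted model from the answer)
fit_metrics <- function(y, x) {
  fit <- lm(y ~ x)
  res <- residuals(fit)
  c(mse = mean(res^2),
    mae = mean(abs(res)),
    cor = cor(fitted(fit), y))
}

m1 <- fit_metrics(y, x)      # original scale
m2 <- fit_metrics(2 * y, x)  # response multiplied by 2

m2 / m1  # ratios: mse 4, mae 2, cor 1
```

With gbm.step the ratios are only approximate (84.02 to 336.081 for the total deviance, 7.871 to 30.983 for the residual deviance), presumably because the two runs use different random bags and cross-validation folds.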
