
I have been using the dredge function in the MuMIn package to conduct model averaging on my global GAM (fitted with bam from the mgcv package), with a priori selected explanatory variables, two random effects, and a negative binomial distribution.

library(mgcv)

gsGlob <- bam(gs ~ species + season + sex + TL2 + year + s(ri, bs = "ad") +
                cloud + s(current2) + s(depth2) + DHW +
                salinity2 + SST.anomaly2 + s(SST.variability2) + wind2 +
                s(code, bs = "re") + s(station, bs = "re"),
              family = nb(), data = allVars_node_dat,
              na.action = "na.fail", discrete = TRUE)

I'm using pdredge from MuMIn to speed up the dredge:

library(snow)    # makeCluster() with type = "SOCK" needs the snow package
library(MuMIn)

mycluster <- makeCluster(5, type = "SOCK")

# data must be exported to the cluster - see 'Details' at
# https://rdrr.io/cran/MuMIn/man/pdredge.html
clusterExport(mycluster, "allVars_node_dat")

# required packages must also be loaded there
clusterEvalQ(mycluster, library(mgcv))

gsGlob_dredge <- MuMIn::pdredge(gsGlob, mycluster)

stopCluster(mycluster)   # shut the workers down once the dredge is finished

The top model has a vastly different AIC from the others, but has -23 degrees of freedom:

[Screenshot of the dredge model-selection table]

What does this mean? Should I ignore and remove the top model, since that doesn't seem right, and conduct the model averaging on the remaining models? Or is it OK to use it as the top model?

The full results from the dredge can be found here and the full dataset here.

mikejwilliamson
    Don't do it this way; use `select = TRUE` to put extra penalties on the smooths, use `paraPen` to penalise the parametric effects, and you should include the random effects regardless of how much variance they use, but you can formally look at this through variance components `gam.vcomp()` on the fitted model. At least this way you'd be using your electrons usefully. – Gavin Simpson Aug 23 '21 at 08:37
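Following up on that comment, here is a minimal sketch of what the extra penalization and variance-component check might look like. The formula simply reuses the global model from the question; `select = TRUE` and `gam.vcomp()` are the pieces being suggested, and this assumes a recent mgcv version in which `bam()` accepts `select`:

    ## Sketch only: fit the single global model with extra shrinkage penalties
    ## on the smooths instead of dredging over all subsets.
    library(mgcv)

    gsGlob_sel <- bam(gs ~ species + season + sex + TL2 + year + s(ri, bs = "ad") +
                        cloud + s(current2) + s(depth2) + DHW +
                        salinity2 + SST.anomaly2 + s(SST.variability2) + wind2 +
                        s(code, bs = "re") + s(station, bs = "re"),
                      family = nb(), data = allVars_node_dat,
                      select = TRUE,       # double-penalty shrinkage on the smooths
                      discrete = TRUE)

    summary(gsGlob_sel)    # terms shrunk to zero show effective df near zero
    gam.vcomp(gsGlob_sel)  # variance components, including the random effects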

1 Answer


I tried to make this a comment, but it was too long. I'm not an expert. However, I am fairly certain it has to do with the consolidation of the submodels. The iterative updating employed by bam creates multiple different values for things like the df. I am basing this on the articles by Wood et al. (2015) and Hauenstein et al. (2018). Hauenstein et al. ran into negative degrees of freedom in some random forest modeling and discuss the how and why of it in their section 4.1.
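If it helps to see where that number comes from, the df column in the dredge table is (as far as I understand) taken from the "df" attribute of logLik() on each refitted submodel, so you can refit the suspicious top model and inspect it directly. The terms below are purely illustrative; substitute whatever the top row of your table actually contains:

    ## Hypothetical refit of the top-ranked submodel (terms are placeholders)
    topMod <- bam(gs ~ species + s(ri, bs = "ad") +
                    s(code, bs = "re") + s(station, bs = "re"),
                  family = nb(), data = allVars_node_dat, discrete = TRUE)

    logLik(topMod)                # log-likelihood printed with its df
    attr(logLik(topMod), "df")    # the df value MuMIn reports in the table
    sum(topMod$edf)               # total effective df of the fitted smooth terms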

My take on the negative degrees of freedom? I don't think it is unimportant, but model selection should not be based on this field alone.

Other things I would look at:

  • since I would have split the data into training and testing sets, I would check the candidate models against the held-out data - is test performance similar across them?
  • visualize the data and the residuals to understand how they are distributed
  • is the variance consistent across fitted values? (i.e. homogeneity)
  • plot the predictions against the observed values, looking for anything that clearly points to a more suitable model
  • use MuMIn::model.avg() to compare and combine them (see the sketch after this list)
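A rough sketch of the last two points, assuming the dredge object from the question is available (the delta-AICc cutoff of 2 is just an example choice):

    ## Average over the plausible candidate set rather than picking one model
    avgMod <- MuMIn::model.avg(gsGlob_dredge, subset = delta < 2, fit = TRUE)
    summary(avgMod)

    ## Predicted vs. observed check on the global model (or any candidate)
    pred <- predict(gsGlob, type = "response")
    plot(pred, allVars_node_dat$gs,
         xlab = "Predicted", ylab = "Observed")
    abline(0, 1, lty = 2)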

(The links in the narrative will take you to 'free' locations. The DOI links will take you to the journal.)

Wood, S. N., Goude, Y., & Shaw, S. (2015). Generalized additive models for large data sets. Journal of the Royal Statistical Society. Series C (Applied Statistics), 64(1), 139-155. https://doi.org/10.1111/rssc.12068

Hauenstein, S., Wood, S. N., & Dormann, C. F. (2018). Computing AIC for black-box models using generalized degrees of freedom: A comparison with cross-validation. Communications in Statistics - Simulation and Computation, 47(5), 1382-1396. https://doi.org/10.1080/03610918.2017.1315728

Kat