1

Disclaimer:

  • Firstly, I have tried to create a mock dataset for a reproducible example, but whenever I create a random set or take a portion of the real data, the model does not converge. I can't see a way of providing the necessary material for a reproducible example without the real data. If deemed necessary and someone is willing to help, I can post the real data.

  • I think this post is addressing a similar idea, but the comments do not explain how to go about 'hacking' this particular coding challenge.

Adding covariates to nlme model in R

  • I am aware of the issues/debate/literature relating to data-dredging, but would like to figure this out regardless.

Overview:

Modeling the percent cover of a sessile species (dependent variable) with explanatory abitoic variables (independent variable) from multiple sites.

With the help of others, I have created the following logistic growth model using the nlme package in R. Percent cover is reliant upon four model parameters. Days is just the time in days when measurements were taken. MaxPop = the maximum population (percent cover - between 0 -100%) per a given site. Days50 = Time in days it take for 50 % of the MaxPop. Hill = Controls the lag of growth and slope through the inflection point. Each of MaxPop, Days50 and Hill are made up of independent abiotic variables, temperature, water acceleration, salinity and pH (centred where necessary).

Global_Model <- nlme(PercentCover ~ I(100 *((inv.logit(exp(MaxPop) *Days^Hill/(exp(Days50)^Hill + Days^Hill))-0.5)*2)),
                 data = Data15,
                 fixed = list(MaxPop ~ I(Temp-14) + Accel + I(Sal-30) + I(pH-8), LogDays50 ~ I(Temp-14) + Accel + I(Sal-30) +I(pH-8), Hill ~ I(Temp-14) + Accel + I(Sal-30) + I(pH-8)),
                 random = MaxPop ~1|Site,
                 start = c(11, -1, -1925,  1.6, 0, 7.4, -0.2, -776, 0.6, 0,  3.9, -.004, 2009, -0.3, 0), control = nlmeControl(maxIter=100), verbose = T)

I began with a predefined set of theoretic models, but I proceeded to develop a global model as all of the abiotic variables were preselected and all have a theoretic reason for inclusion in the model. Hopefully that will reduce the risk of overfitting with a global model.

With the MuMIn package I attempted to use dredge(Global_Model) to run all nested models within the global model, but get "Error in FUN(X[[i]], ...) : subscript out of bounds"

I am not sure how to go about attempting this as each model parameter (MaxPop, Day50, and Hill) is composed of the same four independent variables. I would like to run a dredge that tests every permutation of IVs intra and inter-parameter.

Questions:

Is it possible to carry out a 'dredge-like' process in nlme?

How would one go about it?

______

I would greatly appreciate some advise.

Cheers!

Community
  • 1
  • 1
  • I can't find `nlme` in the list of supported models for `dredge`. That doesn't surprise me. You shouldn't dredge such a model. – Roland Jan 22 '16 at 14:56
  • Thanks for the feedback. I am manually testing nested models, but there are close to 14,000 candidates and I don't want to suggest the Global Model is essentially better than it actually is based on AIC comparison with nested models, just because I didn't select a set of candidate models that are comparable to the Global one in terms of being the most likely. Are there alternatives in different packages? – LeinsterLegion Jan 22 '16 at 15:07
  • Good luck fitting 14,000 models with `nlme`. Reconsider you whole approach. Dredging is not really considered good scientific practice. – Roland Jan 22 '16 at 15:13
  • Some would argue it is a valuable data exploration tool, provided the abitoic variables are all included based on knowledge of the system. I am really just wondering if it's possible to automate the model development process. It doesn't have to be these four variables, I may have to reduce that to carry out a test of all nested models, if it's possible. – LeinsterLegion Jan 22 '16 at 15:22

1 Answers1

0

You would need to write a "wrapper" function around nlme that maps a linear formula (such as y ~ x1 + x2 + x3), to the three nlme components: fixed and random (I assume you want to keep the same model for all), and then feed it to dredge. It's doable, but tricky.

Kamil Bartoń
  • 1,482
  • 9
  • 10
  • Thanks for the answer. From the reading I have done, I figured it would be something like this, but the "wrapper" term is new to me, so thanks for the new lead. Yeah, I would want to keep the fixed and random parameters the same for each model. Given the level of difficulty (due to my still limited coding abilities), questionable nature of this approach, and time constraints, I'll probably leave it. If someone was so kind as to attempt it however, that would be absolutely wonderful. – LeinsterLegion Jan 29 '16 at 21:50