0

I have a dataset of the Visitation Rate(VR) of 5 bird species on 3 species of flower, and the measured Pollen Deposition(PD) of each flower. Each flower species is represented by 20 individuals. Not all bird species visited every individual flower. I want to fit a linear mixed model with PD as dependent variable, VR and presence/absence of bird species as fixed variables, and individal plant ID as random effect. Sort of like PD ~ VR + bird.species (random=ID). However, I dont know how to do that, without making a variable for the presence/absence for each bird species. My current dataset looks like this:

enter image description here

So far I have tried the following code for the linear model, but writing it like this doesn't feel right. Also, I am unsure how to read the results when so many fixed variables are included.

# Linear mixed model
nlm1 <- lme(PD.adjusted ~ VR + 
    Purple + Green + Crested + Blue + Quit + Bull, data = data,   
     random = ~ 1 | Plant.ID)
 
# Store all model parameters in an object
m1_total <- summary(nlm1)
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • Can you please edit your question to include the information in your data set in a *textual* form (e.g., cut-and-pasted in a code block delimited by ``` ) ? It is more searchable and friendlier for screen-readers that way ... – Ben Bolker Apr 13 '23 at 18:14
  • Please provide enough code so others can better understand or reproduce the problem. – Community Apr 14 '23 at 10:30

1 Answers1

0

(Followup questions should probably go to CrossValidated rather than Stack Overflow ...)

There are a few different issues here:

  • since (it appears from the data you posted) you have only one observation of PD.adjusted per Plant ID, it won't work (and is unnecessary) to include Plant ID as a random effect. Random effects are needed when you have multiple observations of at least some groups
  • what is your concern about including many species as fixed effects? The reported parameters will describe the expected difference in pollination deposition in the presence/absence of a given species.
  • if you are concerned about the unwieldiness of writing out many species names in the formula, you could do something like
spnames <- setdiff(names(my_data), c("PD.adjusted", "Plant.ID", "VR"))
fixed_form <- reformulate(c("VR",spnames), response = "PD.adjusted"))
lme(fixed_form, ...)
  • you could in principle fit a multi-membership model where the species effects are treated as a random effect: you would need to use something like this package. I'm of two minds about this: it sends you down a rabbit hole of using more advanced statistical methods and a package that's under development, but it could be helpful because your current setup runs afoul of the rule of thumb that you need about/at least 10 observations per estimated parameter in order to expect reasonably precise answers. (With 20 observations per flower species, you have only 20/6 observations per parameter ... you didn't say in your question how you're going to handle the different species ...)
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453