How do you incorporate presence/absence of 5 species into a Linear Mixed Effect Model?

Question

I have a dataset of the Visitation Rate(VR) of 5 bird species on 3 species of flower, and the measured Pollen Deposition(PD) of each flower. Each flower species is represented by 20 individuals. Not all bird species visited every individual flower. I want to fit a linear mixed model with PD as dependent variable, VR and presence/absence of bird species as fixed variables, and individal plant ID as random effect. Sort of like PD ~ VR + bird.species (random=ID). However, I dont know how to do that, without making a variable for the presence/absence for each bird species. My current dataset looks like this:

enter image description here

So far I have tried the following code for the linear model, but writing it like this doesn't feel right. Also, I am unsure how to read the results when so many fixed variables are included.

# Linear mixed model
nlm1 <- lme(PD.adjusted ~ VR + 
    Purple + Green + Crested + Blue + Quit + Bull, data = data,   
     random = ~ 1 | Plant.ID)
 
# Store all model parameters in an object
m1_total <- summary(nlm1)

Can you please edit your question to include the information in your data set in a *textual* form (e.g., cut-and-pasted in a code block delimited by ``` ) ? It is more searchable and friendlier for screen-readers that way ... — Ben Bolker, Apr 13 '23 at 18:14
Please provide enough code so others can better understand or reproduce the problem. — Community, Apr 14 '23 at 10:30

score 0 · Answer 1 · answered Apr 13 '23 at 18:27

(Followup questions should probably go to CrossValidated rather than Stack Overflow ...)

There are a few different issues here:

since (it appears from the data you posted) you have only one observation of PD.adjusted per Plant ID, it won't work (and is unnecessary) to include Plant ID as a random effect. Random effects are needed when you have multiple observations of at least some groups
what is your concern about including many species as fixed effects? The reported parameters will describe the expected difference in pollination deposition in the presence/absence of a given species.
if you are concerned about the unwieldiness of writing out many species names in the formula, you could do something like

spnames <- setdiff(names(my_data), c("PD.adjusted", "Plant.ID", "VR"))
fixed_form <- reformulate(c("VR",spnames), response = "PD.adjusted"))
lme(fixed_form, ...)

you could in principle fit a multi-membership model where the species effects are treated as a random effect: you would need to use something like this package. I'm of two minds about this: it sends you down a rabbit hole of using more advanced statistical methods and a package that's under development, but it could be helpful because your current setup runs afoul of the rule of thumb that you need about/at least 10 observations per estimated parameter in order to expect reasonably precise answers. (With 20 observations per flower species, you have only 20/6 observations per parameter ... you didn't say in your question how you're going to handle the different species ...)

How do you incorporate presence/absence of 5 species into a Linear Mixed Effect Model?

1 Answers1