1

I am trying to analyse a split plot design for a plant growth experiment with these variables:

  • Biomass (dependent variable)
  • Transect (sub plot factor with three levels)
  • Treatment (main plot factor with two levels)
  • Block (2 blocks in total, serving as replicates of the treatment)
  • Location (multiple locations within each transect point)

I know what the random effect structure should look like. However, I can’t work out how to write this in R script. Could someone please help me? It’s probably very easy, but I have been looking for hours and hours and can’t find it.

Random effects should be:

  • Block
  • Interaction Block and Treatment
  • Location nested within Transect
  • (Location nested within Transect), interaction with Treatment

So perhaps something like:

(1|block) + (1|block*treatment) + (1|location:transect) +
  (1|(location:transect)*treatment)
Ian Campbell
  • 23,484
  • 14
  • 36
  • 57
Ecologist
  • 21
  • 5
  • 1
    A couple of clarifying questions: 1) Biomass is your dependent variable and not independent variable - Correct? 2) Is Location an independent variable or just repeated measurements within each Treatment-Transect combination. 3) If Location is an independent variable wouldn't this now be a split-split-plot design. 4) This questions is better asked on Cross validate as this is more of a stats question and not a programming question. – Dave2e Dec 27 '20 at 14:02
  • Thanks for your reply. 1) Oops, Biomass is indeed the dependent variable. 2) Seeds were collected from plants along three independent transects and within each transects from different sublocations. 3) I know that the design is correct, since my professor told me so ;). 4) Thank you, I asked the question there – Ecologist Dec 27 '20 at 14:47
  • Please ave a look at https://stat.ethz.ch/~meier/teaching/anova/split-plot-designs.html and `emmeans` package https://cran.r-project.org/web/packages/emmeans/index.html – Abdur Rohman Dec 27 '20 at 14:48

1 Answers1

3

OK, I'll take a shot at this.

First: in 'modern' mixed model approaches it is not practical to treat a two-level categorical variable as random. In 'classical' method-of-moment/SSQ ratio approaches it works, although the power is terrible; in modern methods you will end up with 'singular models' (do a web search for "GLMM FAQ" or search here and on CrossValidated for more info). (The exception to this statement is if you go full-Bayesian and put regularizing priors on the random-effects parameters ...) Therefore, I'm going to take block as a fixed effect.

This would be (I think) your maximal model:

~ treatment*block + (treatment|transect/location)
  • treatment*block (expands to 1 + block + treatment + block:treatment: the baseline biomass (intercept) could differ between blocks, the treatments could differ, the treatment effect could differ between blocks
  • (treatment|transect/location) (expands to (1+treatment|transect) + (1+treatment|transect:location)); the intercept and treatment effect vary among transects and among locations within transects. (This assumes that transects are uniquely coded between blocks, i.e. you don't have a transect 001 in both blocks, rather they are labeled something like A001 and B001. If not, you need something like (1+treatment|block:(transect/location)) ...

This also assumes you have multiple observations per transect/location/treatment combination. If not (if each treatment is observed only once per location), then the full interaction will be confounded with the residual variation and you instead need something like (1+treatment|transect) + (1|transect:location).

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453