I'm working on a dataset with several different types of proteins as columns. It kinds of looks like this This is simplified, the original dataset contains over 100 types of proteins. I wanted to see if the concentration of a protein differs by treatments when taking random effect (=id) into consideration. I managed to run multiple repeated ANOVA at once. But I would also like to do pairwise comparisons for all proteins based on the treatment. The first thing came to my mind was using emmeans package, but I had trouble coding this.
#install packages
library(tidyverse)
library(emmeans)
#Create a data set
set.seed(1)
id <- rep(c("1","2","3","4","5","6"),3)
Treatment <- c(rep(c("A"), 6), rep(c("B"), 6),rep(c("C"), 6))
Protein1 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
Protein2 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
Protein3 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
DF <- data.frame(id, Treatment, Protein1, Protein2, Protein3) %>%
mutate(id = factor(id),
Treatment = factor(Treatment, levels = c("A","B","C")))
#First, I tried to run multiple anova, by using lapply
responseList <- names(DF)[c(3:5)]
modelList <- lapply(responseList, function(resp) {
mF <- formula(paste(resp, " ~ Treatment + Error(id/Treatment)"))
aov(mF, data = DF)
})
lapply(modelList, summary)
#Pairwise comparison using emmeans. This did not work
wt_emm <- emmeans(modelList, "Treatment")
> wt_emm <- emmeans(modelList, "Treatment")
Error in ref_grid(object, ...) : Can't handle an object of class “list”
Use help("models", package = "emmeans") for information on supported models.
So I tried a different approach
anova2 <- aov(cbind(Protein1,Protein2,Protein3)~ Treatment +Error(id/Treatment), data = DF)
summary(anova2)
#Pairwise comparison using emmeans.
#I got only result for the whole dataset, instead of by different types of protein.
wt_emm2 <- emmeans(anova2, "Treatment")
pairs(wt_emm2)
> pairs(wt_emm2)
contrast estimate SE df t.ratio p.value
A - B -1.704 1.05 10 -1.630 0.2782
A - C 0.865 1.05 10 0.827 0.6955
B - C 2.569 1.05 10 2.458 0.0793
I don't understand why even if I used "cbind(Protein1, Protein2, Protein3)" in the anova model. R still only gives me one result instead of something like the following
this is what I was hoping to get
> Protein1
contrast
A - B
A - C
B - C
> Protein2
contrast
A - B
A - C
B - C
> Protein3
contrast
A - B
A - C
B - C
How do I code this or should I try a different package/function?
I don't have trouble running one protein at a time. However, since I have over 100 proteins to run, it would be really time-consuming to code them one by one.
Any suggestion is appreciated. Thank you!