For a research project I need to run an ANOVA to test whether the differences between some treatments are statistically significant. The experiment consisted of inoculating bacteria into tubes containing different treatments at different concentrations. My dependent variable is the optical density at 660 nm (OD660) measured on a spectrophotometer; I measured the OD at 13 time points.

Here is the full dataset; it is not very big:

od34_stat1 <- data.frame(
          OD = c(0.032667,0.09,0.157,0.184,0.345667,
                 0.4445,0.47725,0.53925,0.74,0.750667,0.859167,0.880333,
                 0.8275,0.034667,0.0935,0.146,0.1725,0.522167,0.5865,0.71075,
                 0.69875,0.927,0.929667,1.063167,1.037333,0.973,0.031167,
                 0.1045,0.139,0.1665,0.425667,0.523,0.69875,0.80575,
                 1.0435,0.994667,1.085667,1.215333,1.1145,0.034667,0.1085,
                 0.1285,0.1645,0.349667,0.474,0.74075,0.78125,1.0815,
                 0.937167,1.045667,1.104333,0.9555,0.028167,0.065,0.13,0.1715,
                 0.331667,0.4015,0.45775,0.54425,0.811,0.739167,0.797167,
                 0.773333,0.6905,0.021167,0.0835,0.131,0.1585,0.279167,
                 0.384,0.40225,0.46975,0.646,0.625667,0.684667,0.701333,
                 0.5885,0.015667,0.0655,0.086,0.12,0.191667,0.261,0.29875,
                 0.35825,0.446,0.411167,0.364667,0.369333,0.31),
   Treatment = as.factor(c("0_CNTRL","0_CNTRL",
                           "0_CNTRL","0_CNTRL","0_CNTRL","0_CNTRL","0_CNTRL",
                           "0_CNTRL","0_CNTRL","0_CNTRL","0_CNTRL",
                           "0_CNTRL","0_CNTRL","10_TOX","10_TOX","10_TOX","10_TOX",
                           "10_TOX","10_TOX","10_TOX","10_TOX","10_TOX",
                           "10_TOX","10_TOX","10_TOX","10_TOX","25_TOX",
                           "25_TOX","25_TOX","25_TOX","25_TOX","25_TOX","25_TOX",
                           "25_TOX","25_TOX","25_TOX","25_TOX","25_TOX",
                           "25_TOX","50_TOX","50_TOX","50_TOX","50_TOX",
                           "50_TOX","50_TOX","50_TOX","50_TOX","50_TOX",
                           "50_TOX","50_TOX","50_TOX","50_TOX","10_CNTRL",
                           "10_CNTRL","10_CNTRL","10_CNTRL","10_CNTRL","10_CNTRL",
                           "10_CNTRL","10_CNTRL","10_CNTRL","10_CNTRL",
                           "10_CNTRL","10_CNTRL","10_CNTRL","25_CNTRL","25_CNTRL",
                           "25_CNTRL","25_CNTRL","25_CNTRL","25_CNTRL",
                           "25_CNTRL","25_CNTRL","25_CNTRL","25_CNTRL",
                           "25_CNTRL","25_CNTRL","25_CNTRL","50_CNTRL","50_CNTRL",
                           "50_CNTRL","50_CNTRL","50_CNTRL","50_CNTRL",
                           "50_CNTRL","50_CNTRL","50_CNTRL","50_CNTRL","50_CNTRL",
                           "50_CNTRL","50_CNTRL")),
        Time = as.factor(c("0","2","4","6",
                           "70","94","478","496","568","616","736","784",
                           "808","0","2","4","6","70","94","478","496",
                           "568","616","736","784","808","0","2","4","6",
                           "70","94","478","496","568","616","736","784",
                           "808","0","2","4","6","70","94","478","496",
                           "568","616","736","784","808","0","2","4",
                           "6","70","94","478","496","568","616","736",
                           "784","808","0","2","4","6","70","94","478",
                           "496","568","616","736","784","808","0","2","4",
                           "6","70","94","478","496","568","616","736",
                           "784","808"))
)
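
A side note on the Time column above, as a minimal sketch: `as.factor()` applied to character strings orders the levels as text, so "478" sorts before "6", and anything that follows level order (such as a boxplot x axis) would not be chronological. The names `time_chr`, `levels_default`, `time_fct` and `levels_chrono` are only illustrative and are not part of the original code.

```r
# The 13 sampling times, in the order they appear in od34_stat1$Time
time_chr <- c("0", "2", "4", "6", "70", "94", "478", "496",
              "568", "616", "736", "784", "808")

# as.factor() on character strings sorts the levels as text, not as numbers
levels_default <- levels(as.factor(time_chr))

# Passing the levels explicitly keeps the chronological order
time_fct <- factor(time_chr, levels = time_chr)
levels_chrono <- levels(time_fct)

print(levels_default)
print(levels_chrono)
```

If the chronological order matters for the plots, the same `levels =` argument could presumably be used when building `od34_stat1$Time`.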

So, what I tried to do is a repeated measures ANOVA: since I measured the OD over time, time is my repeated-measures factor (?).
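
To make the layout explicit, here is a small sketch that rebuilds only the two design columns of the dataset (the OD values are left out) and cross-tabulates them. It assumes the same 7 treatment labels and 13 time points as in `od34_stat1`; the names `treatments`, `time_points`, `design` and `cell_counts` are illustrative only.

```r
# Treatment labels and sampling times, in the order used in od34_stat1
treatments <- c("0_CNTRL", "10_TOX", "25_TOX", "50_TOX",
                "10_CNTRL", "25_CNTRL", "50_CNTRL")
time_points <- c(0, 2, 4, 6, 70, 94, 478, 496, 568, 616, 736, 784, 808)

# One row per Treatment x Time combination, as in the dataset
design <- data.frame(
  Treatment = factor(rep(treatments, each = length(time_points)),
                     levels = treatments),
  Time = factor(rep(time_points, times = length(treatments)),
                levels = time_points)
)

# Number of OD values available in each Treatment x Time cell
cell_counts <- table(design$Treatment, design$Time)
print(cell_counts)
```

Every cell holds exactly one value, so the dataset has 91 rows with no replicate measurements inside a Treatment and Time combination.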

I need to see whether there are statistically significant differences between the treatment groups (e.g., is there a significant difference between 0_CNTRL and 25_TOX?). Initially I found code that runs a repeated measures ANOVA without errors, but it shows me the differences between time points: it tells me whether there is a difference between Time 4 and Time 6, and so on. That is not the question I need answered, and above all the result is too scattered to interpret.

This is the original code (I followed this guide: https://www.datanovia.com/en/lessons/repeated-measures-anova-in-r/#one-way-repeated-measures-anova):

library(tidyverse)
library(ggpubr)
library(rstatix)
library(ggplot2)

##Factors
od34_stat1$Treatment <- as.factor(od34_stat1$Treatment)
od34_stat1$Time <- as.factor(od34_stat1$Time)

#Interactionplot - Boxplot
bxp34 <- ggboxplot(od34_stat1, x = "Time", y = "OD", add = "point")
bxp34

##Check assumptions: Outliers
od34_stat1 %>%
  group_by(Time) %>%
  identify_outliers(OD)

##Check assumptions: Normality
od34_stat1 %>%
  group_by(Time) %>%
  shapiro_test(OD)
#OR
ggqqplot(od34_stat1, "OD", facet.by = "Time")

#Computing One-Way repeated measure ANOVA
od34.aov <- anova_test(data = od34_stat1, dv = OD, wid = Treatment, within = Time)
get_anova_table(od34.aov)

# Pairwise comparisons
od34.pwc <- od34_stat1 %>%
  pairwise_t_test(
    OD ~ Time, paired = TRUE,
    p.adjust.method = "bonferroni"
    )
od34.pwc

##Creating Report
od34.pwc <- od34.pwc %>% add_xy_position(x = "Time")
bxp34 + 
  stat_pvalue_manual(od34.pwc) +
  labs(
    subtitle = get_test_label(od34.aov, detailed = TRUE),
    caption = get_pwc_label(od34.pwc)
  )

Okay. Here is my problem: the output is organized by the "Time" factor. However, the guide uses a dataset with only 3 measurement times for the dependent variable, while I measured at 13. Moreover, I think the guide's intent is precisely to see differences over time, while mine is to see the differences between the Treatments whose OD was measured over Time.
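
To put a number on why the Time-based output is so hard to read, here is a small arithmetic sketch, assuming the pairwise step compares every pair of factor levels. The variable names are illustrative only.

```r
# Number of levels of each factor in od34_stat1
n_time_points <- 13
n_treatments <- 7

# Number of distinct pairs of levels that a pairwise comparison would test
pairs_by_time <- choose(n_time_points, 2)
pairs_by_treatment <- choose(n_treatments, 2)

print(pairs_by_time)       # 78
print(pairs_by_treatment)  # 21
```

So comparing time points yields 78 pairwise tests, against 21 when comparing treatments.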

So what I thought, as an RStudio noob, was to change the code by swapping "Time" for "Treatment". This way the output is just what I need. My concern is that after swapping these factors the result is clear but may not make logical sense.

Revised code:

#Interactionplot - Boxplot
bxp34_1 <- ggboxplot(od34_stat1, x = "Treatment", y = "OD", add = "point")
bxp34_1

##Check assumptions: Outliers
od34_stat1 %>%
  group_by(Time) %>%
  identify_outliers(OD)

##Check assumptions: Normality
od34_stat1 %>%
  group_by(Treatment) %>%
  shapiro_test(OD)
#OR
ggqqplot(od34_stat1, "OD", facet.by = "Treatment")

#Computing One-Way repeated measure ANOVA
od34.aov_1 <- anova_test(data = od34_stat1, dv = OD, wid = Time, within = Treatment)
get_anova_table(od34.aov_1)

# Pairwise comparisons
od34.pwc_1 <- od34_stat1 %>%
  pairwise_t_test(
    OD ~ Treatment, paired = TRUE,
    p.adjust.method = "bonferroni"
    )
od34.pwc_1

##Creating Report
od34.pwc_1 <- od34.pwc_1 %>% add_xy_position(x = "Treatment")
bxp34_1 + 
  stat_pvalue_manual(od34.pwc_1) +
  labs(
    subtitle = get_test_label(od34.aov_1, detailed = TRUE),
    caption = get_pwc_label(od34.pwc_1)
  )

This way my output (the plot built from od34.pwc_1) lets me explain the statistical significance of the differences between treatments.

I hope I have summarized my doubts clearly. What do you think? Is it right to do this? And if it is not correct, what would you recommend to analyze and visualize the differences between these treatments?

Phil
  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. – Community Feb 19 '23 at 18:48
