There are several Stack Overflow posts about situations where t.test() in R fails with the error "data are essentially constant". As far as I understand, this happens when the values being compared have essentially no variation (the standard error is effectively zero), so the test statistic cannot be computed. (Correct me if there is another cause.)
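To show what I mean, here is a minimal sketch that reproduces the error with two tiny made-up groups (the vectors x and y are hypothetical, not my real data). It only uses base R's t.test() and tryCatch():

```r
# Two groups whose values are all identical: no variation at all.
x <- rep(0.301029995663981, 2)
y <- rep(0.301029995663981, 2)

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    t.test(x, y)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

With identical values in both groups the standard error is exactly zero, which is the condition t.test() refuses to work with.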
I'm in this situation, and I would like to fix it by altering my data in a way that doesn't drastically change its statistical properties, so that the t-test remains valid. Would it work to add a very small amount of variation to the data (e.g. change 0.301029995663981 to 0.301029995663990), or is there something else I can do?
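To make the idea concrete, this is roughly the kind of change I have in mind, sketched on a toy pair of two-value groups (x and y are hypothetical stand-ins for one date's values of the two species, not my full data):

```r
a <- 0.301029995663981

# Perturb a single value in one group by about 9e-15, as described above.
x <- c(a, 0.301029995663990)
y <- c(a, a)

# Run the test, but keep the script alive if it still errors.
res <- tryCatch(t.test(x, y), error = function(e) e)

if (inherits(res, "error")) {
  print(conditionMessage(res))
} else {
  print(res$p.value)
}
```

My worry is that any p-value obtained this way is determined by the perturbation I chose rather than by the measurements themselves, which is why I'm not sure this is legitimate.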
For example, this is my data:
# Create the data frame
data <- data.frame(Date = c("2021.08","2021.08","2021.09","2021.09","2021.09","2021.10","2021.10","2021.10","2021.11","2021.11","2021.11","2021.11","2021.11","2021.12","2021.12","2022.01","2022.01","2022.01","2022.01","2022.08","2022.08","2022.08","2022.08","2022.08","2022.09","2022.09","2022.10","2022.10","2022.10","2022.11","2022.11","2022.11","2022.11","2022.11","2022.12","2022.12","2022.12","2022.12","2023.01","2023.01","2023.01","2023.01","2021.08","2021.08","2021.09","2021.09","2021.09","2021.10","2021.10","2021.10","2021.11","2021.11","2021.11","2021.11","2021.11","2021.12","2021.12","2022.01","2022.01","2022.01","2022.01","2022.08","2022.08","2022.08","2022.08","2022.08","2022.09","2022.09","2022.09","2022.09","2022.10","2022.10","2022.10","2022.10","2022.11","2022.11","2022.11","2022.11","2022.11","2022.12","2022.12","2022.12","2022.12","2023.01","2023.01","2023.01","2023.01"),
Species = c("A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A",
"A","A","A","A","A","A","A","A","A","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B",
"B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B","B"),
Site = c("Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something",
"Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something",
"Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something",
"Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something",
"Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something",
"Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something",
"Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something","Something",
"Something","Something","Something","Something"),
Mean = c("0.301029995663981","1.07918124604762","0.698970004336019","1.23044892137827","1.53147891704226","1.41497334797082","1.7160033436348",
"0.698970004336019","1.39794000867204","1","0.301029995663981","0.301029995663981","0.477121254719662","0.301029995663981","0.301029995663981",
"0.301029995663981","0.477121254719662","0.301029995663981","0.301029995663981","0.845098040014257","0.301029995663981","0.301029995663981",
"0.477121254719662","0.698970004336019","1.23044892137827","1.41497334797082","1.95904139232109","1.5910646070265","1.53147891704226",
"1.14612803567824","1.57978359661681","1.34242268082221","0.778151250383644","0.301029995663981","0.301029995663981","0.477121254719662",
"0.301029995663981","1.20411998265592","0.845098040014257","1.17609125905568","1.20411998265592","0.698970004336019","0.301029995663981",
"0.698970004336019","0.698970004336019","0.903089986991944","1.14612803567824","0.301029995663981","0.602059991327962","0.301029995663981",
"0.845098040014257","0.698970004336019","0.698970004336019","0.301029995663981","0.698970004336019","0.301029995663981","0.301029995663981",
"0.301029995663981","0.477121254719662","0.301029995663981","0.301029995663981","0.301029995663981","0.301029995663981","0.301029995663981",
"0.602059991327962","0.301029995663981","0.845098040014257","1.92941892571429","1.27875360095283","0.698970004336019","1.38021124171161",
"1.20411998265592","1.38021124171161","1.14612803567824","1","1.07918124604762","1.17609125905568","0.845098040014257","0.698970004336019",
"0.778151250383644","0.301029995663981","0.845098040014257","1.64345267648619","1.46239799789896","1.34242268082221","1.34242268082221",
"0.778151250383644"))
Next, I convert the grouping columns to factors and Mean to numeric:
# Set factors and convert Mean from character to numeric
str(data)
data$Date <- as.factor(data$Date)
data$Site <- as.factor(data$Site)
data$Species <- as.factor(data$Species)
data$Mean <- as.numeric(data$Mean)
str(data)
When I run a t-test per date with compare_means() from ggpubr:
library(ggpubr)
compare_means(Mean ~ Species, data = data, group.by = "Date", method = "t.test")
This is the error:
Error in `mutate()`:
ℹ In argument: `p = purrr::map(...)`.
Caused by error in `purrr::map()`:
ℹ In index: 5.
ℹ With name: Date.2021.12.
Caused by error in `t.test.default()`:
! data are essentially constant
Run `rlang::last_trace()` to see where the error occurred.
Similarly, when I use this in ggplot:
ggplot(data, aes(x = Date, y = Mean, fill=Species)) +
geom_boxplot()+
stat_compare_means(data=data,method="t.test", label = "p.signif") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
Warning message:
Computation failed in `stat_compare_means()`
Caused by error in `mutate()`:
ℹ In argument: `p = purrr::map(...)`.
Caused by error in `purrr::map()`:
ℹ In index: 5.
ℹ With name: x.5.
Caused by error in `t.test.default()`:
! data are essentially constant
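Both errors point at a single date group (index 5). A minimal sketch of how one could check which Date and Species combinations have no spread, using base R's aggregate() and sd(); the toy data frame below uses the same column names as my data, but its rows are illustrative only:

```r
# Toy data with the same columns as the real data frame (illustrative values).
toy <- data.frame(
  Date    = c("2021.11", "2021.11", "2021.11", "2021.11",
              "2021.12", "2021.12", "2021.12", "2021.12"),
  Species = c("A", "A", "B", "B", "A", "A", "B", "B"),
  Mean    = c(1, 0.301029995663981, 0.698970004336019, 0.301029995663981,
              0.301029995663981, 0.301029995663981,
              0.301029995663981, 0.301029995663981)
)

# Standard deviation of Mean within each Date x Species cell.
spread <- aggregate(Mean ~ Date + Species, data = toy, FUN = sd)
names(spread)[3] <- "SD"

# Cells where every value is identical.
constant <- spread[spread$SD == 0, ]
print(constant)
```

Running the same aggregate() call on the real data frame (after Mean has been converted to numeric) should list the cells where the values are all identical.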
What is the best solution that keeps the data usable for a t-test?