R doesn't perform a t.test when there are too few observations; it stops with an error instead. However, I need to compare two surveys, where one survey has data on all items, whereas the other is missing some variables entirely. This leads to a t.test comparison of, e.g., q1 where group 1 contains only NA and group 2 contains values.
Basically, I need to find out how to run the t.test anyway and have it report the error (rather than abort) when the requirements are not met. I need to perform multiple t.tests at the same time (q1-q4) with the grouping variable group and write the p-values to an output file.
Thanks for your help!
#create data
surveydata <- data.frame(q1 = sample(1:5, 1000, replace = TRUE))
surveydata$q2 <- sample(6, size = nrow(surveydata), replace = TRUE)
surveydata$q3 <- sample(6, size = nrow(surveydata), replace = TRUE)
surveydata$q4 <- sample(6, size = nrow(surveydata), replace = TRUE)
surveydata$group <- c(1,2)
#replace all values of 6 with NA
surveydata[surveydata == 6] <- NA
#set q1 to NA for everyone in group 1 (q1 only takes the values 1-5)
surveydata$q1[surveydata$group == 1] <- NA
#perform t.test
library(dplyr) #provides %>%
svy_sel <- c("q1", "q2", "q3", "q4", "group") #vector for selection
temp <- surveydata %>%
  dplyr::select(dplyr::all_of(svy_sel)) %>%
  tidyr::gather(key = variable, value = value, -group) %>%
  dplyr::mutate(value = as.numeric(value)) %>%
  dplyr::group_by(group, variable) %>%
  dplyr::summarise(value = list(value)) %>%
  dplyr::ungroup() %>%
  tidyr::spread(group, value) %>% #convert from "long" to "wide" format; the new columns are named `1` and `2`
  dplyr::group_by(variable) %>% #t-test will be applied to each member of this group (i.e., each variable)
  #this step stops with an error for q1, because group 1 contains only NAs
  dplyr::mutate(p_value = t.test(unlist(`1`), unlist(`2`))$p.value)
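
To make the desired behaviour concrete, here is a rough base R sketch of what I am after, assuming it is acceptable to wrap t.test in tryCatch and record NA whenever the test cannot be run. The helper safe_p, the small demo data frame, and the output file name p_values.csv are made up for illustration and are not part of my actual code.

```r
# Hypothetical helper (an assumption, not from the code above): returns the
# p-value, or NA when t.test() stops with an error such as too few observations.
safe_p <- function(x, y) {
  tryCatch(t.test(x, y)$p.value, error = function(e) NA_real_)
}

# Small stand-in data set: q1 is entirely NA in group 1, q2 is complete.
set.seed(1)
demo <- data.frame(
  q1 = sample(1:5, 100, replace = TRUE),
  q2 = sample(1:5, 100, replace = TRUE),
  group = rep(c(1, 2), 50)
)
demo$q1[demo$group == 1] <- NA

# Run one test per item; items that cannot be tested get NA instead of aborting.
items <- c("q1", "q2")
p_values <- sapply(items, function(v) {
  safe_p(demo[[v]][demo$group == 1], demo[[v]][demo$group == 2])
})

# Collect the results and write them to a file (path chosen for illustration).
result <- data.frame(variable = items, p_value = unname(p_values))
out_file <- file.path(tempdir(), "p_values.csv")
write.csv(result, out_file, row.names = FALSE)
```

With this approach q1 would end up with NA in the p_value column while q2 gets a regular p-value, so the loop over items never aborts. Returning conditionMessage(e) from the error handler instead of NA would keep the error text, at the cost of a character column.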