I am attempting to generalise some repeated measures ANOVA code so that the outcome variable and grouping variable are supplied as strings, as shown below:

outcome_var<-"outcome_varnam1"
grouping_var <- "grouping_varnam1"

Therefore, in dplyr I can refer to the appropriate data frame columns using

!!as.name(outcome_var)

This works for most functions, but grouped pairwise t-test functions throw an error:

Error: Strings must match column names. Unknown columns: !!as.name(grouping_var)

I am wondering how to refer to the variables as column names in this function without hard-coding the column names, as the full ANOVA analysis code is long and I wish to repeat it for dozens of different outcome variables and grouping factors. Using get() or sym() does not work for me. Below is the full sample code. I hope the data snippet gives enough information.

temp<-data.frame(Subj.ID = c("a", "a", "a", "a", "a", "a"), 
                  timepoint = c("101", "102", "103", "104", "105", "106"),
                  grouping_varnam1 = c("Placebo", "Placebo", "Placebo", "Placebo", "Placebo", "Placebo"),
                  outcome_varnam1 = c(12.6, 9.6, 16.4, NA, 43.1, NA))
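library(dplyr)    # provides %>% and group_by()
library(rstatix)  # provides get_summary_stats(), identify_outliers(), shapiro_test(), pairwise_t_test()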
outcome_var<-"outcome_varnam1"
grouping_var<-"grouping_varnam1"

#these lines for testing assumptions of anova work fine using the format:
temp %>% group_by(!!as.name(grouping_var)) %>% get_summary_stats(!!as.name(outcome_var), type = "mean_sd")
temp %>% group_by(!!as.name(grouping_var)) %>% identify_outliers(!!as.name(outcome_var))
temp %>% group_by(!!as.name(grouping_var)) %>% shapiro_test(!!as.name(outcome_var))


#this line for a pairwise t-test grouped by timepoint does not work:
temp %>%
      group_by(timepoint) %>%
      pairwise_t_test(!!as.name(outcome_var) ~ !!as.name(grouping_var), paired = FALSE, p.adjust.method = "holm")

Error: Strings must match column names. Unknown columns: !!as.name(grouping_var)
A.TJE

1 Answer

If you want to use strings in formulas, you can use reformulate().
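To see what reformulate() produces before handing it to a test function, here is a minimal sketch using only base R. The variable names are the ones from the question; nothing here depends on dplyr or rstatix.

```r
# reformulate() lives in the stats package, which base R attaches by default.
outcome_var  <- "outcome_varnam1"
grouping_var <- "grouping_varnam1"

# First argument: the right-hand-side term label(s).
# Second argument (response): the left-hand side.
f <- reformulate(grouping_var, response = outcome_var)

class(f)     # "formula"
all.vars(f)  # "outcome_varnam1" "grouping_varnam1"
```

The result is an ordinary formula object (outcome_varnam1 ~ grouping_varnam1), so it can be passed anywhere a formula typed by hand would be accepted. The !!as.name() approach fails inside a formula because ~ quotes its operands, so the !! is never evaluated.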

Your example data has too few observations to work with a double grouped pairwise_t_test, but the general notation would be as follows:

library(dplyr)
library(rstatix)

outcome_var <- "Sepal.Length"
grouping_var <- "Species"

iris %>% 
  pairwise_t_test(., reformulate(grouping_var, outcome_var), paired = FALSE, p.adjust.method = "holm")
#> # A tibble: 3 x 9
#>   .y.       group1   group2     n1    n2        p p.signif    p.adj p.adj.signif
#> * <chr>     <chr>    <chr>   <int> <int>    <dbl> <chr>       <dbl> <chr>       
#> 1 Sepal.Le… setosa   versic…    50    50 8.77e-16 ****     1.75e-15 ****        
#> 2 Sepal.Le… setosa   virgin…    50    50 2.21e-32 ****     6.64e-32 ****        
#> 3 Sepal.Le… versico… virgin…    50    50 2.77e- 9 ****     2.77e- 9 ****

Created on 2020-07-08 by the reprex package (v0.3.0)
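Since the question groups by a third column and wants to reuse the code for many variables, the same idea could be wrapped in a small helper. This is only a sketch under assumptions: grouped_pairwise_t is a hypothetical name, across(all_of()) needs dplyr 1.0 or later, and the built-in ToothGrowth data (with dose standing in for timepoint) is used because the question's sample has too few rows.

```r
library(dplyr)
library(rstatix)

# Hypothetical helper: every column is given as a string.
grouped_pairwise_t <- function(data, outcome_var, grouping_var, by_var) {
  data %>%
    group_by(across(all_of(by_var))) %>%          # group by a column named in a string
    pairwise_t_test(
      reformulate(grouping_var, response = outcome_var),  # outcome ~ grouping
      paired = FALSE,
      p.adjust.method = "holm"
    )
}

# Stand-in data: len ~ supp, tested separately within each dose.
res <- grouped_pairwise_t(ToothGrowth, "len", "supp", "dose")
res
```

To cover dozens of outcome variables, the helper could be called from lapply() over a character vector of column names.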

TimTeaFan
    Thank you so much, this works perfectly and is such a simple answer! I always spend so long on a problem only to find there's one simple function as the answer, agh! – A.TJE Jul 08 '20 at 22:34