
I am trying to write a script that automates the creation of a set of plots (faceted and grouped), with p-values calculated and plotted using the ggpubr and rstatix packages.

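library(dplyr)    # %>%, group_by(), mutate()
library(rstatix)  # t_test(), adjust_pvalue(), add_significance(), add_xy_position()
library(ggpubr)   # ggboxplot(), stat_pvalue_manual()

set.seed(1234)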

Create the dataset:
data_set <- 
  data.frame(
    var1 = rep(c("N", "N", "Y", "Y"),4),
    var2 = c(rep("type1",8), rep("type2", 8)),
    var3 = c(rep("type1",4),rep("type2",8),rep("type1",4)),
    x = rnorm(16),
    y = rnorm(16),
    z = rnorm(16)
    )
Perform a t-test for variable x vs. var2, grouped by var3 and faceted by var1 (see below), and store the results as a data frame using rstatix functions:
stat.test <- data_set %>%
 group_by(var2, var1) %>%
 t_test( x ~ var3) %>%
 adjust_pvalue(method = "bonferroni") %>%
 add_significance("p.adj") %>%
 add_xy_position(x = "var2", dodge = 0.8)

Perform another t-test on variable x vs. var3, this time using data grouped by var2 and again faceted by var1, then use mutate to shift the bracket positions so they align correctly when plotted with the function below:
stat.test.1 <- data_set %>%
  group_by(var3, var1) %>%
  t_test( x ~ var2) %>%
  adjust_pvalue(method = "bonferroni") %>%
  add_significance("p.adj") %>%
  add_xy_position(x = "var3", dodge = 0.8) %>%
  mutate(
    xmin = xmin + c(0, 0, -0.6, -0.6),
    xmax = xmax + c(0.6, 0.6, 0, 0),
    y.position = y.position + c(1, 1, 2, 2)
  )
Plot using ggboxplot:
ggboxplot(
  data_set,
  x = "var2",
  add = "mean_sd",
  y = "x",
  color = "var3",
  facet.by = "var1"
) +
  stat_pvalue_manual(stat.test,
                     label = "p.adj",
                     tip.length = 0.01,
                     hide.ns = FALSE) +
  stat_pvalue_manual(
    stat.test.1,
    label = "p.adj",
    tip.length = 0.01,
    hide.ns = FALSE
  ) +
  scale_y_continuous(expand = expansion(mult = c(0.01, 0.1)))

All of this works as expected, and I get the plot I want with the significance values plotted (though not perfectly; the y positions of the significance bars still need some adjustment).


What I want to do is write a function or script, using a tidy approach, that creates a similar set of boxplots for all numeric variables (x, y and z), grouped and faceted in the same manner as this plot. I am able to get the plots themselves, but I am having difficulty generating the stats data frames and using them to add the p-values and significance bars to the plots. Thanks.
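A minimal sketch of one way the workflow above could be wrapped so that it runs once per numeric variable. The helper names `run_tests` and `plot_one` are hypothetical, not part of ggpubr or rstatix. The sketch assumes that rstatix accepts a formula built with `reformulate()`, and that the hard-coded bracket offsets from the second test only suit this 2 x 2 layout and would need adjusting for other data.

```r
library(dplyr)
library(purrr)
library(ggplot2)
library(rstatix)
library(ggpubr)

set.seed(1234)
data_set <- data.frame(
  var1 = rep(c("N", "N", "Y", "Y"), 4),
  var2 = c(rep("type1", 8), rep("type2", 8)),
  var3 = c(rep("type1", 4), rep("type2", 8), rep("type1", 4)),
  x = rnorm(16), y = rnorm(16), z = rnorm(16)
)

# Hypothetical helper: t-test of `response` between the levels of
# `compare_var`, within each combination of `group_var` and var1.
run_tests <- function(data, response, compare_var, group_var) {
  data %>%
    group_by(across(all_of(c(group_var, "var1")))) %>%
    t_test(reformulate(compare_var, response = response)) %>%
    adjust_pvalue(method = "bonferroni") %>%
    add_significance("p.adj") %>%
    add_xy_position(x = group_var, dodge = 0.8)
}

# Hypothetical helper: build both stats tables and the annotated plot
# for a single response variable given as a string.
plot_one <- function(data, response) {
  stat_a <- run_tests(data, response, compare_var = "var3", group_var = "var2")
  stat_b <- run_tests(data, response, compare_var = "var2", group_var = "var3") %>%
    mutate(
      # Same manual offsets as in the question; they assume four rows
      # ordered as (type1, N), (type1, Y), (type2, N), (type2, Y).
      xmin = xmin + c(0, 0, -0.6, -0.6),
      xmax = xmax + c(0.6, 0.6, 0, 0),
      y.position = y.position + c(1, 1, 2, 2)
    )

  ggboxplot(data, x = "var2", y = response, add = "mean_sd",
            color = "var3", facet.by = "var1") +
    stat_pvalue_manual(stat_a, label = "p.adj",
                       tip.length = 0.01, hide.ns = FALSE) +
    stat_pvalue_manual(stat_b, label = "p.adj",
                       tip.length = 0.01, hide.ns = FALSE) +
    scale_y_continuous(expand = expansion(mult = c(0.01, 0.1)))
}

# One plot per numeric column, returned as a named list.
numeric_vars <- names(data_set)[sapply(data_set, is.numeric)]
plots <- map(set_names(numeric_vars), ~ plot_one(data_set, .x))
```

Individual plots can then be printed with `plots$y`, or combined with `ggarrange(plotlist = plots)`. Passing the response as a string keeps the same code path for every variable, at the cost of building the formula programmatically.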

jaydoc
  • Do you want to include a dataframe in the plot? What values should the data contain? – Ronak Shah Mar 29 '21 at 10:39
  • I don’t want to include the dataframes themselves in the plots. The stats dataframes are to provide the p values and the guides for where the significance bars should start and end in the plots (edited question). Thanks. – jaydoc Mar 29 '21 at 12:42
