I have the following issue with labeling the sample size in my dataset:

    library(ggplot2)
    library(EnvStats)
    library(tidyverse)

    cars_pre  <- mtcars %>% mutate(time = "Pre")
    cars_post <- mtcars %>% mutate(time = "Post")

    df <- rbind(cars_pre, cars_post)
    p <- ggplot(df, aes(x = factor(cyl), y = mpg, fill = time)) +
      theme(legend.position = "none")

    ggplot(df, aes(factor(cyl), mpg)) +
      geom_boxplot(aes(fill = time)) +
      stat_n_text()

The issue is that I have pre/post groups, but they are the same subjects. When I add stat_n_text(), it doubles my sample size because it adds the pre and post samples together (treating them as separate subjects). Is there a way to update n so that it is half of its current value? In the picture I want n = 11, 7, and 14 for cyl 4, 6, and 8, respectively.


Ronak Shah
yuliaUU

2 Answers


I found a way to change it by modifying the data of the built plot. First, create the plot as before:

    library(splitstackshape)
    p <- ggplot(df, aes(factor(cyl), mpg)) +
      geom_boxplot(aes(fill = time)) +
      stat_n_text()


Then extract data from the ggplot object:

    q <- ggplot_build(p)

Modify the labels in the sample-size layer:

    q$data[[2]] <- q$data[[2]] %>%
      cSplit("label", "=") %>%
      mutate(label = paste0(label_1, "=", label_2 / 2))

Plot the object back:

    q <- ggplot_gtable(q)
    plot(q)
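The label edit above can also be sketched in base R, without splitstackshape. This is only an illustration: `halve_n_label` is a hypothetical helper (not part of EnvStats or ggplot2), and it assumes the labels look like "n=22", which is what splitting on "=" in the step above implies.

```r
# Hypothetical helper: halve the count in labels of the assumed form "n=22".
halve_n_label <- function(label) {
  # Drop the leading "n=" (tolerating spaces around "=") and read the count
  n <- as.numeric(sub("^n[[:space:]]*=[[:space:]]*", "", label))
  paste0("n=", n / 2)
}

halved <- halve_n_label(c("n=22", "n=14", "n=28"))
print(halved)

# With the built plot from above, the same idea would be applied as:
# q$data[[2]]$label <- halve_n_label(q$data[[2]]$label)
```

Either way, the fix depends on the label layer being the second element of q$data, so it needs adjusting if layers are added or reordered.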


yuliaUU

According to the package developers, direct manipulation of n is not implemented. A bit of a hack is to pass the data for only one time point to stat_n_text():

    ggplot(df, aes(factor(cyl), mpg)) +
      geom_boxplot(aes(fill = time)) +
      stat_n_text(data = cars_pre)
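To see why counting a single time point gives the wanted numbers, here is a quick check in base R. It rebuilds the same data frames as in the question, using `transform` instead of `mutate` only so the snippet runs without extra packages.

```r
# Rebuild the question's data with base R only
cars_pre  <- transform(mtcars, time = "Pre")
cars_post <- transform(mtcars, time = "Post")
df <- rbind(cars_pre, cars_post)

n_pooled <- table(df$cyl)        # pre and post rows counted together
n_pre    <- table(cars_pre$cyl)  # one row per subject

print(n_pooled)
print(n_pre)
```

The pooled counts are exactly twice the single-time-point counts, so labeling from cars_pre alone shows the number of subjects per cyl group.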
yuliaUU