
I have a data frame whose structure is like this:

> str(mydata12)
'data.frame':   228459 obs. of  2 variables:
 $ intron_length: num  0.787 0.799 2.311 2.396 1.77 ...
 $ intron_type  : Factor w/ 3 levels "All_intron","All_retained_intron",..: 1 1 1 1 1 1 1 1 1 1 ...

I have plotted a cumulative density (ECDF) figure based on this data frame:

p <- ggplot(mydata12, aes(x = intron_length, color=intron_type)) + geom_step(aes(y=..y..),stat="ecdf")


Now I want to make the comparison by adding p values among 3 groups:

> compare_means(intron_length~intron_type, data = mydata12)
> my_comparisons <- list(c("All_intron", "All_retained_intron"), c("All_intron", "dynamic_intron"), c("All_retained_intron", "dynamic_intron"))
> p + stat_compare_means(comparisons = my_comparisons)
Error in f(...) : 
  Can only handle data with groups that are plotted on the x-axis

I guess I need to set a value on the x axis to make the comparison. My question is: how do I set the x-axis value and add the p values?

Thanks,

Kyle

1 Answer


You can't superimpose what you want on top of what you have, because the two plots use very different scales and axes. But you can do this (I made up data since you didn't provide...

  library(ggpubr)
#> Loading required package: ggplot2
  library(ggplot2)

  # Made-up data, created first so the plotting call below can find it
  mydata12 <- data.frame(intron_length = runif(1000, min = 0, max = 1),
                         intron_type = sample(c("All_intron", "All_retained_intron" , "dynamic_intron","All_retained_intron"), size = 1000, replace = TRUE))

  # Pairwise comparisons: Specify the comparisons you want
  my_comparisons <- list(c("All_intron", "All_retained_intron"), c("All_intron", "dynamic_intron"), c("All_retained_intron", "dynamic_intron"))
  ggboxplot(mydata12, x = "intron_type", y = "intron_length",
            color = "intron_type", palette = "npg")+
    # Add pairwise comparisons p-value
    stat_compare_means(comparisons = my_comparisons, label.y = c(1.2, 1.3, 1.4))+
    # Add global p-value (Kruskal-Wallis is the default when there are more than two groups)
    stat_compare_means(label.y = 1.5)


Chuck P
  • Thank you for your reply. I am already aware of this approach. Is there any way to get the slope of the curves and make a comparison? – Kyle Jul 02 '20 at 02:47
  • May I suggest you take your question over to Stats Stack Exchange and look for questions like this one: https://stats.stackexchange.com/questions/115132/compare-distributions-of-two-ecdfs – Chuck P Jul 02 '20 at 11:53
  • Thank you very much for your reply. I have seen a comparison between two groups of cumulative density curves, for example: https://reader.elsevier.com/reader/sd/pii/S0092867415008958?token=487AE85D2B79BEC4CDEBDC56F9747B2F8E10E77803611620C145B893369B252CA27288C4A7F1FBAE8A6552FA531C13E9 (Figure 1E). The method they used is: "For comparison of inter-spine distance distribution, the Kolmogorov-Smirnov two-sample-test was used. Linear regressions were performed in GraphPad Prism 5. Data distribution was assumed to be normal although not formally tested". So I am wondering whether I can do the same with my data. – Kyle Jul 02 '20 at 13:52
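
For the Kolmogorov-Smirnov idea raised in the last comment, here is a minimal sketch of one way it might be done. It runs base R's ks.test() on each pair of groups and writes the adjusted p values onto the ECDF plot with annotate(). The data are made up, the names pairs, ks_results and ks_label are only illustrative, and the Holm correction is my assumption rather than anything stated in the thread or the quoted paper.

```r
library(ggplot2)

# Made-up stand-in for mydata12 (assumption: same column names and factor levels)
set.seed(1)
mydata12 <- data.frame(
  intron_length = c(rnorm(300, mean = 1.0), rnorm(300, mean = 1.2), rnorm(300, mean = 2.0)),
  intron_type   = factor(rep(c("All_intron", "All_retained_intron", "dynamic_intron"),
                             each = 300))
)

# Every pair of groups, as a list of length-2 character vectors
pairs <- combn(levels(mydata12$intron_type), 2, simplify = FALSE)

# Two-sample Kolmogorov-Smirnov test for each pair
ks_p <- sapply(pairs, function(g) {
  x <- mydata12$intron_length[mydata12$intron_type == g[1]]
  y <- mydata12$intron_length[mydata12$intron_type == g[2]]
  ks.test(x, y)$p.value
})

# Collect raw and multiplicity-adjusted p values (Holm is an assumed choice)
ks_results <- data.frame(
  group1 = sapply(pairs, `[`, 1),
  group2 = sapply(pairs, `[`, 2),
  p      = ks_p,
  p.adj  = p.adjust(ks_p, method = "holm")
)

# One text line per comparison
ks_label <- paste(
  sprintf("%s vs %s: adj. p = %.3g", ks_results$group1, ks_results$group2, ks_results$p.adj),
  collapse = "\n"
)

# ECDF plot with the p values placed in the lower-right corner
p <- ggplot(mydata12, aes(x = intron_length, color = intron_type)) +
  stat_ecdf(geom = "step") +
  annotate("text", x = Inf, y = 0, label = ks_label,
           hjust = 1.05, vjust = -0.2, size = 3)
```

Printing p draws the figure, and ks_results holds the numbers if a table is preferred. With roughly 228,000 observations the test can report very small p values even for tiny differences between curves, so it may be worth reporting the KS statistic (ks.test(x, y)$statistic) alongside them.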