
I am trying to label the extreme values (whether or not they are outliers) on geom_boxplot plots. I found this question, which is almost identical to mine [ extreme value labels ggplot2 in geom_boxplot ]. The answer provided by yonicd almost works for me:

library(dplyr)    # group_by, do, left_join
library(ggplot2)

# two groups of 100 labelled values each
df=rbind(data.frame(id=rep("1",100),var=paste0("V",seq(1,100)),
         val=rnorm(100,0,5)),
         data.frame(id=rep("2",100),var=paste0("V",seq(1,100)),
         val=rnorm(100,0,3)))


df_bound=df%.%group_by(id)%>%do(.,data.frame(val=boxplot.stats(.$val)$out))
df_bound=left_join(df_bound, df, by=c("id","val"))

ggplot(df,aes(x=id, y=val, fill=id, label=var)) + geom_boxplot() +
geom_point(aes(group=id), data=df_bound)+
geom_text(aes(group=id), data=df_bound, hjust=-1, size=4)

It seems I would just need to replace [ $out ] in

 df_bound=df%.%group_by(id)%>%do(.,data.frame(val=boxplot.stats(.$val)$out))

to have the extreme values instead of the outliers. If I use

df_bound=df%.%group_by(id)%>%do(.,data.frame(val=boxplot.stats(.$val)$stats))

the labels for the outliers don't appear. How can I fix that?

Hugo
  • I reckon it's because the conf values ("the lower and upper extremes of the ‘notch’") are calculated (if I got that correctly from `?boxplot.stats`) and are not values found in `df`, hence the join cannot work. – erc May 22 '15 at 06:49
  • Thanks. I corrected the post accordingly. If I use [ $stats ] instead, the labels are the extreme values of the whiskers. – Hugo May 23 '15 at 07:35
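
To illustrate the point made in these comments, here is a small sketch of what `boxplot.stats` returns. The seed and the vector `x` are assumptions made up for this illustration and are not part of the original post. The whisker ends (`$stats[c(1, 5)]`) and the outliers (`$out`) are always observed values, so a join on them can find a label. The notch limits (`$conf`) are computed, and for an even number of observations the hinges and median (`$stats[2:4]`) are averages of neighbouring values, so those will usually not match any row of the data.

```r
set.seed(1)                      # arbitrary seed, only for repeatability
x <- rnorm(100, 0, 5)            # hypothetical sample, like one group of df
s <- boxplot.stats(x)

names(s)                         # components: stats, n, conf, out

# Whisker ends and outliers are taken from the data itself
whiskers_in_x <- all(s$stats[c(1, 5)] %in% x)
outliers_in_x <- all(s$out %in% x)

# Hinges/median and notch limits are computed, so they may not be in x
s$stats[2:4] %in% x
s$conf %in% x
```

This would explain why using all of `$stats` gives labels only at the whisker ends: the other three values have no matching row in `df` to take `var` from.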

1 Answer


To make the code work with the latest version of dplyr, change the %.% operator to the magrittr pipe %>%. I also added [c(1,5)], which subsets the lower and upper extremes of the whiskers from boxplot.stats(...)$stats.

df_bound <- df %>%
  group_by(id) %>%
  do(., data.frame(val = boxplot.stats(.$val)$stats[c(1, 5)]))
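
For completeness, a minimal end-to-end sketch that combines this line with the data and plotting code from the question might look like the following. The seed is an arbitrary addition. `df_range` is a hypothetical name for an alternative, assuming that "extreme values (outliers or not)" means the true minimum and maximum of each group rather than the whisker ends.

```r
library(dplyr)
library(ggplot2)

set.seed(42)  # arbitrary; not in the original post
df <- rbind(
  data.frame(id = rep("1", 100), var = paste0("V", seq(1, 100)),
             val = rnorm(100, 0, 5)),
  data.frame(id = rep("2", 100), var = paste0("V", seq(1, 100)),
             val = rnorm(100, 0, 3))
)

# Whisker ends per group, as in the answer above
df_bound <- df %>%
  group_by(id) %>%
  do(., data.frame(val = boxplot.stats(.$val)$stats[c(1, 5)]))

# Join back to df to recover the label column `var`
df_bound <- left_join(df_bound, df, by = c("id", "val"))

# Alternative (assumed intent): true min and max per group,
# whether or not they are drawn as outliers
df_range <- df %>%
  group_by(id) %>%
  filter(val %in% range(val)) %>%
  ungroup()

p <- ggplot(df, aes(x = id, y = val, fill = id, label = var)) +
  geom_boxplot() +
  geom_point(aes(group = id), data = df_bound) +
  geom_text(aes(group = id), data = df_bound, hjust = -1, size = 4)
print(p)
```

Passing `data = df_range` instead of `df_bound` to `geom_point` and `geom_text` should label the most extreme point on each side of each box.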

Peter