I'm trying to fill the area under my empirical cumulative distribution (ECDF) curves, but I can't get it to work.

I've tried the following two approaches. The first appears to change the alpha of the curve rather than the fill:

ggplot( myDataFrame , aes( x=myVariable , fill=myFactor ) )                       +
geom_step      ( stat="ecdf" , aes( colour = myFactor  ) , alpha=1 )              +
coord_cartesian( xlim = c( 0 , quantile( myDataFrame$myVariable , prob=0.99 ) ) ) +
facet_grid     ( myFactor ~ . , scales="free_y" )

The second appears to be equivalent to the above:

ggplot( myDataFrame , aes( x=myVariable , fill=myFactor ) )                       +
stat_ecdf      ( aes( colour = myFactor ) ,alpha=0.2 )                            +
coord_cartesian( xlim = c( 0 , quantile( myDataFrame$myVariable , prob=0.99 ) ) ) +
facet_grid     ( myFactor ~ . , scales="free_y" )

I also wonder whether the fill would extend at 100% across the full xlim for factor levels that saturate early.

[Figure: Empirical cumulative densities across factor-level subsets]

jxramos

1 Answer

Something like this?

library(ggplot2)
set.seed(1)
df <- data.frame(var=c(rnorm(1000),rpois(1000,25),rgamma(1000,25)),
                 fact=rep(c("N","P","G"),each=1000))
ggplot(df,aes(x=var,fill=fact))+
  stat_ecdf(aes(ymin=0,ymax=..y..),geom="ribbon")+
  stat_ecdf(geom="step")+
  facet_grid(fact~.,scales="free_y")

Edit: (Response to comment below).

The notation ..y.. refers to the variable named y that the stat computes, rather than to a column of your data frame.

The stat_*(...) functions create an implicit variable that maps to the y-aesthetic. For stat_ecdf(...) it's the ecdf: the fraction of observations in x that are less than or equal to the given x. ggplot will automatically map this internal variable to the y-aesthetic.
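To make "the fraction of observations less than or equal to the given x" concrete, here is a small base-R sketch on a made-up vector. It compares a by-hand calculation with R's built-in ecdf(); the vector and the variable names are only for illustration.

```r
# Toy data, invented for this illustration
x <- c(3, 1, 2, 2, 5)

# By hand: for each value v, the fraction of observations <= v
by_hand <- sapply(x, function(v) mean(x <= v))

# Built-in: ecdf() returns a function, which we evaluate at the same points
built_in <- ecdf(x)(x)

print(by_hand)   # 0.8 0.2 0.6 0.6 1.0
print(built_in)  # same values
```

These per-observation fractions are the values that end up on the y-axis of the ECDF plot.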

But in this case we are using the ribbon geometry, which requires ymax and ymin aesthetics, not y. So by setting ymax=..y.. we're telling ggplot to map the ymax-aesthetic to that computed y value, while ymin=0 pins the bottom of the ribbon to the axis.
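If the computed-variable notation feels opaque, an alternative is to calculate the ECDF yourself and hand it to the ribbon as an ordinary column. The following is a minimal sketch of that idea, not a replacement for the stat-based version above. The column name cum_frac and the alpha value are my own choices for illustration.

```r
library(ggplot2)

set.seed(1)
# Same simulated data as in the example above
df <- data.frame(var  = c(rnorm(1000), rpois(1000, 25), rgamma(1000, 25)),
                 fact = rep(c("N", "P", "G"), each = 1000))

# ECDF value of every observation, computed within its own factor level.
# cum_frac is a hypothetical column name chosen for this sketch.
df$cum_frac <- ave(df$var, df$fact, FUN = function(v) ecdf(v)(v))

p <- ggplot(df, aes(x = var, fill = fact)) +
  # The ribbon fills from 0 up to the precomputed ECDF value
  geom_ribbon(aes(ymin = 0, ymax = cum_frac), alpha = 0.3) +
  # The step line draws the ECDF curve itself on top of the fill
  geom_step(aes(y = cum_frac, colour = fact)) +
  facet_grid(fact ~ ., scales = "free_y")
p
```

One caveat: the ribbon joins points with straight segments while the line is drawn as steps, so with only a few observations the fill and the curve will not line up exactly. With 1000 points per level the difference is hard to see.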

jlhoward
Any chance you could explain the `..y..` notation in ymax? I haven't seen this in ggplot before and it isn't mentioned in the stat_ecdf documentation. I played around with your example code and it only works with that argument. Thanks. – SubstantiaN Apr 28 '18 at 20:34