Questions tagged [ecdf]

Empirical Cumulative Distribution Function in statistics

For a definition, see its Wikipedia page.

In software such as R, a built-in function ecdf takes a vector of samples and generates its ECDF. It is also easy to produce one ourselves, as shown in this example: How to derive an ecdf function?
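As a rough illustration of producing an ECDF by hand, here is a minimal sketch in Python using only the standard library. The name make_ecdf and the sample values are hypothetical and chosen for illustration; this is not the implementation behind any built-in ecdf function, only the same idea under the usual definition F(x) = (number of samples <= x) / n.

```python
from bisect import bisect_right


def make_ecdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x.

    make_ecdf is a hypothetical helper name, not a library function.
    """
    data = sorted(samples)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one sample")

    def ecdf(x):
        # bisect_right gives the count of sorted values that are <= x
        return bisect_right(data, x) / n

    return ecdf


# Illustrative data only
F = make_ecdf([3.0, 1.0, 2.0, 2.0])
print(F(0.5), F(2.0), F(3.0))  # prints: 0.0 0.75 1.0
```

Returning a closure keeps the sorted data around, so the ECDF can be evaluated at any point, including points that are not in the sample.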

162 questions
1 vote
1 answer

Adding points on ECDF without external calculations

I would like to highlight some points on a ggplot with multiple ECDFs by specifying an aesthetic attribute. I tried the following: iris$dot <- ifelse(iris$Sepal.Length < 6, "<", ">") ggplot(iris, aes(x = Sepal.Length, col = Species)) + …
Ndr
1 vote
2 answers

ecdf plot error in ggplot2: unknown color name

I get an error with a very simple ecdf plot in ggplot. Error in color name. library(ggplot2) ggplot(iris) + stat_ecdf(aes(x = Sepal.Length, col = Species), geom = "point") Error: Unknown colour name: setosa
Forge
1 vote
1 answer

How to calculate the largest distance between two cumulative sample distributions in Python?

Assume there are two 1D NumPy array samples with the same length, X1 and X2. After converting each of the two samples separately into a cumulative distribution, how can I calculate the largest distance between the two cumulative sample…
1 vote
1 answer

How can I fix the runtime error in ecdf function in R?

When I run this code: a<- read.delim(file.choose("data.txt")) d<-sort(a$d) plot(d, sort(ecdf(d)(d)),type="s", lty=2,col="red", ylab= "P(X<=x)",ylim= 0:1) I get this error: Error in ecdf(d) : 'x' must have 1 or more non-missing…
John John
1 vote
1 answer

Making ECDF plot in GGplot using double aes

I am kind of stuck on how to obtain an ECDF plot (line and point combined) using more than one aes (could be color, linetype for geom_line, or shape for geom_point). So, I have this code, for example: data<-mtcars data$cyl<-as.factor(data$cyl) …
1 vote
1 answer

R function ecdf

xs <- seq(floor(min(fheight)),ceiling(max(fheight)),0.01) plot(xs, ecdf(fheight)(xs),type = "l", xlab = "height in inches",ylab = "F(x)") What is the purpose of (xs) in ecdf(fheight)(xs)?
1 vote
1 answer

Is there a way I could plot t = 300, 350, 450, and 500 lines in one graph?

I wanted to plot multiple lines in one graph but I couldn't figure out which code to use. Also, is there a way I could assign colors to each of the lines? Just new to RStudio and was assigned to pick up someone's work so…
trix
1 vote
2 answers

Create a table with values from ecdf graph

I am trying to create a table using values from an ecdf plot. I've recreated an example below. #Data data(mtcars) #Sort by mpg mtcars <- mtcars[order(mtcars$mpg),] #Make arbitrary ranking variable based on mpg mtcars <- mtcars %>% mutate(Rank =…
4redwood
1 vote
1 answer

Can you customize the plot generated from the MATLAB function 'ecdf'?

I find that I can easily customize when I use the 'plot' function, but I cannot seem to do the same things with the 'ecdf' function. Are you able to customize the way ecdf is displayed? Mainly, I would like the linewidth to be thicker so people…
1 vote
2 answers

Set weights for ewcdf {spatstat} [R]

I want to compare a reference distribution d_1 with a sample d_2 drawn proportionally to size w_1 using the Kolmogorov–Smirnov distance. Given that d_2 is weighted, I was considering accounting for this using the Weighted Empirical Cumulative…
Gion Mors
1 vote
1 answer

How to find the multivariate empirical cumulative distribution function (CDF) in R?

I have two correlated variables x and y, and I wonder how to find their empirical joint CDF in R? Also, how can we find probabilities like: P(X<=2 and Y<=3), P(X>=2 and Y>=3), P(X>=3 and Y<=2), P(X<=3 and Y>=2); P(X<=2 or Y<=3), P(X>=3 or Y>=2),…
Yang Yang
1 vote
0 answers

R memory puzzle on ECDF environments

I have a massive list of ECDF objects. Similar to: vals <- rnorm(10000) x <- ecdf(vals) ecdfList <- lapply(1:10000, function(i) ecdf(vals)) save(ecdfList, file='mylist.rda') class(ecdfList[[1]]) [1] "ecdf" "stepfun" "function" Let's quit the…
dave gibbs
1 vote
0 answers

Even display of unevenly spaced numbers on x/y coordinates

Would you advise on how I could make an even display of unevenly spaced numbers on a graph? For example, considering the code below: BREAKS = c(0, 0.1, 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500) a <- seq(0,100,0.1) b <-…
Bogdan
1 vote
1 answer

R - calculate probability and flip x/y axis of cumulative curve (ECDF)

In R, I plot a cumulative curve using the ecdf function to show area vs. elevation. By default the elevation is plotted on the x axis and the area on the y axis, where elevation is given in total values (e.g. 1000-3000 m) and the area in probability…
the_chimp
1 vote
0 answers

Matlab `quantile` doesn't interpolate between sample values on ECDF?

According to Matlab's help, quantile interpolates linearly between points on the empirical cumulative distribution function (ECDF). Importantly, the points interpolated between are the mid-points of the risers at each step. I'm finding the actual…
user36800