Questions tagged [ecdf]

Empirical Cumulative Distribution Function in statistics

For a definition, please see its Wikipedia page.

In R, the built-in function ecdf takes a vector of samples and returns its ECDF. It is also easy to produce one ourselves, as shown in this example: How to derive an ecdf function?
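As a rough illustration of how little is needed to build an ECDF by hand, here is a minimal sketch in Python using only the standard library. The helper name make_ecdf is hypothetical and is not taken from the linked example; it returns a function F where F(x) is the fraction of samples less than or equal to x.

```python
from bisect import bisect_right


def make_ecdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x."""
    data = sorted(samples)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one sample")

    def ecdf(x):
        # bisect_right gives the number of sorted values that are <= x
        return bisect_right(data, x) / n

    return ecdf


F = make_ecdf([3, 1, 2, 2])
print(F(0), F(2), F(3))  # prints: 0.0 0.75 1.0
```

Sorting once up front keeps each later evaluation to a binary search, which matters when the same ECDF is evaluated at many points.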

162 questions
2
votes
0 answers

Plot ECDF without loading all data in memory

I need to plot the ECDF of some data. I found out I could do it with ecdf = sm.distributions.ECDF(sample) x = np.linspace(min(sample), max(sample)) y = ecdf(x) plt.step(x, y) using the matplotlib and statsmodels Python packages. My problem is…
2
votes
1 answer

Plot ecdf and density in the same plot and zoom in to specific part

I want to plot the density and ecdf in the same plot using ggplot2. Here is my code: library(ggplot2) library(reshape) set.seed(101) var1 = rnorm(1000, 0.5) var2 = rnorm(100000,0.5) combine = melt(data.frame("var1" = var1,"var2"=…
user3978632
2
votes
1 answer

CDF beyond range of values in R ggplot2

I am trying to plot the CDF using ggplot2 in R and I get the following plot. But the min and max values of the data are 1947 and 2017, and I do not want the line to be plotted beyond the range [1947, 2017]. ggplot(df, aes(x=year)) + stat_ecdf(geom="line")…
Dinesh
2
votes
1 answer

How do I scale the axes to the larger vector when plotting two ecdfs for comparison in R?

Initially I start out with 2 vectors (subsets of my data). I run ecdf on both, plot them in the same plot for ease of comparison. All of that is fine but what I need to know is how to make the function work universally for any pair of vectors, so I…
m9000
2
votes
1 answer

data.table + ecdf - undefined column

I am working with data.table. It is easy to select a column from a data.table object: > head(data.table(mtcars)[,2]) cyl 1: 6 2: 6 3: 4 4: 6 5: 8 6: 6 But trying to select a column using this syntax within an ecdf call yields an…
hartmut
2
votes
1 answer

Create a ggplot2 stat_ecdf plot with standard error shading

I have data from three doses of a treatment, with three replicates per dose: df <-…
dan
2
votes
2 answers

Plot ECDF data with ggplot2

I have normalized data to plot as an ecdf, but I couldn't change the line shape, color, or legend info. My data is: EDCF.df <- structure(list(Length = c(11431L, 138250L, 109935L, 7615L, 5221L, 8741L, 9460L, 3102L, 2662L, 12286L, 5097L,…
eabanoz
2
votes
1 answer

Generating a stacked cumulative smooth frequency distribution plot

I have data in which two types of occurrences are registered: type_a and type_b and their year of occurrence. This is one way to generate an example of my data: set.seed(1) years <- 1991:2010 type_a_years <- 20 type_b_years <- 10 type_a <-…
dan
2
votes
1 answer

ggplot2 ecdf faceting for subsets + overall ecdf in each panel

I have a data set with a continuous variable and a factor with n levels. I'd like to plot an empirical cumulative distribution function for each level separately plus an overall ecdf in each of the panels. The point is to compare the subsets'…
kategorically
2
votes
2 answers

How to find which quantile bin a number falls in

I know how to find the quantiles of an empirical distribution. set.seed(1) x = rnorm(100) q = quantile(x, prob=seq(0,1,.01)) Is there a function that would give me the quantile bin that a number from the training set belongs to? In this example R) x[1] [1]…
statquant
2
votes
3 answers

How to plot CCDF graph on a logarithmic scale?

I want to plot a CCDF graph for some of my simulated power-law tail data on log-log axes. Below is my R code for plotting a CCDF graph on normal axes; I used the code from this link: (How to plot a CCDF gragh?) > load("fakedata500.Rda") >…
user3579282
2
votes
1 answer

Plot density and cumulative density function in one combined plot using ggplot2

I would like to get a plot that combines the density of observations and the cdf. The usual problem with that is that the scales of the two are way off. How can this be remedied, i.e., two scales be used or, alternatively, one of the data series be…
Peter Lustig
2
votes
1 answer

R apply ecdf to every column

I am trying to get the ecdf() of each column c in a data frame and then feed the rows r of a column into that column's ecdf(c) to return the corresponding value ecdf(c)(r) per cell. The function below works. getecdf <- function(data){ # initialise…
Zhubarb
2
votes
1 answer

Empirical Quantile Comparison Effect Size

I'm trying to recreate the following integral with empirical data: where F, G are cdfs and their inverses are quantile functions. Here's my code: def eqces(u,v): import numpy as np import statsmodels.api as sm from scipy.stats.mstats…
2
votes
1 answer

How to plot estimated CDF with empirical CDF

I am fitting a distribution to some given data, and I have estimated the parameters of the distribution. I have also plotted the empirical cdf using the ecdf() command in R. Now I have to plot the cdf of the estimated distribution along with the empirical cdf. How can I do…
user2881894