
Assuming we have a random sample from some distribution, we can calculate and plot the associated ECDF as follows:

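set.seed(1)
t1 <- rnorm(10000, mean = 20)
t1 <- sort(t1)
# Rescale three blocks of the sorted sample to create artificial jumps
t1[1:1000] <- t1[1:1000] * (-100)
t1[1001:7499] <- t1[1001:7499] * 50
t1[7500:10000] <- t1[7500:10000] * 100
# Empirical CDF of the modified sample
cdft1 <- ecdf(t1)
plot(cdft1)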

In this case there are jumps (created intentionally) in the empirical distribution. By a jump I mean a point where the value increases sharply, say by more than 100% of the preceding value. In the example this happens at position 7,500. My question is: how can I find these 'jump' indices most effectively?

user3032689

1 Answer


You can get close to what you want by applying diff to the sorted t1 values: an index is flagged when the step to the next value is larger than the absolute value of the current one.

St1 = sort(t1)
which(diff(St1) > abs(St1[-length(St1)]))
[1] 1000 7499

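At index 1000, St1 switches from -1632.8700 to 934.6916, which technically meets your criterion of "more than 100% change". It is not clear to me what you want when there is a sign change like this. The jump you describe at position 7,500 is reported as 7499 because diff compares each value with its successor, so the flagged index is the last value before the jump.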
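If sign changes should not count as jumps, the same comparison can be wrapped in a small function with a switch for that case. This is only a sketch: find_jumps, rel and skip_sign_change are hypothetical names introduced here, and the rule "ignore steps where the sign changes" is an assumption about what is wanted, not something stated in the question.

```r
# Hypothetical helper (not part of the original answer): flag index i when
# the step from s[i] to s[i + 1] exceeds rel * |s[i]| in the sorted sample.
find_jumps <- function(x, rel = 1, skip_sign_change = FALSE) {
  s <- sort(x)
  n <- length(s)
  if (n < 2) return(integer(0))
  prev <- s[-n]   # s[1], ..., s[n - 1]
  nxt  <- s[-1]   # s[2], ..., s[n]
  jump <- (nxt - prev) > rel * abs(prev)
  if (skip_sign_change) {
    # Assumption: a step between values of different sign is not a jump.
    jump <- jump & (sign(prev) == sign(nxt))
  }
  which(jump)
}

# Small made-up vector so the result can be checked by hand.
x <- c(-4, -3, 2, 3, 10)
find_jumps(x)                           # 2 and 4
find_jumps(x, skip_sign_change = TRUE)  # 4 only
```

With the default rel = 1, find_jumps(t1) performs the same comparison as the one-liner above; raising rel makes the criterion stricter.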

G5W