0

Before you ask, yes I did a thorough search, including in StackExchange. I checked out this question Find local minimum in bimodal distribution with r; it has useful stuff, but doesn't go all the way for me.

Here is my data: https://www.dropbox.com/s/ulsw5cfwjnh6tdu/output?dl=0

Actually, I have thousands of files like this - the data is organized into column vectors that I need to plot. Then, for every plot I have to figure out if the distribution is bimodal, and if so, find the x value for that minimum. In the file above, the data in column 2 seems to have a bimodal distribution.

This code seems to work, but I have to manually specify where the minimum is located:

datafile <- read.table("output")
data <- datafile$V2
d <- density(data) # returns the density data with defaults
hist(data,prob=TRUE)
lines(d) 
optimize(approxfun(d$x,d$y),interval=c(5,10))$minimum

The last statement (I got it from Find local minimum in bimodal distribution with r) returns

[1] 6.841577

which seems to be correct. However, I have to hardcode the interval interval=c(5,10), i.e. I have to look at the graph and tell it where to look for the minimum. If I do

optimize(approxfun(d$x,d$y),interval=range(x$V2))$minimum

I get

[1] 25.30096

which is not what I'm looking for.

I have thousands of plots, so this is not an option. Is there any way to automate the process? Please help, I'm absolutely desperate.

P.S. I don't have enough reputation to post the plot. Sorry.

Community
  • 1
  • 1
Stefan
  • 221
  • 1
  • 4
  • 13
  • what if you change interval to `interval=c(0,mean(data))`? I get 6.8-whatever using that – rawr Nov 03 '14 at 04:03
  • Thanks for the input rawr. I'm not sure if this is a universal solution, though. What if the small peak is to the right of the large one? What if they're about equal? – Stefan Nov 03 '14 at 04:17
  • 1
    okay how about `min(kmeans(data, centers = 2)$centers)` – rawr Nov 03 '14 at 04:48
  • Thanks rawr, this looks really helpful, but when I try it on column 4 in this file, I get 34. something, which doesn't seem quite right. https://www.dropbox.com/s/ulsw5cfwjnh6tdu/output?dl=0 – Stefan Nov 03 '14 at 05:15
  • that one doesn't look bimodal. trying to make it give you two centers from a nice distribution will return some weird results – rawr Nov 03 '14 at 05:27
  • A couple other thoughts: I like `pastecs::turnpoints` for finding all peaks/valleys in a sequence. In general, for doing a deeper search for `R` functions, grab the `sos` package. – Carl Witthoft Nov 03 '14 at 12:23

0 Answers0