I'm currently writing a script that is supposed to remove redundant data points from my graph. My data includes overlaps from adjacent data sets, and in the overlap regions I only want the data that is generally higher. (Imagine two Gaussians with an x offset that overlap slightly: I'm only interested in the higher values in the overlap region, so that my final graph doesn't get all noisy when I combine the data into a single spectrum.)
Here are my problems:
1) The x values aren't the same between the two data sets, so I can't just say "at x, take the max y value". They're close together, but not equal.
2) The spacing between consecutive x values isn't uniform, either.
3) The data is noisy, so there can be multiple points where the data sets intersect. And while Gaussian A is generally higher than Gaussian B after the intersection, the noise means Gaussian B might still have SOME values that are higher. So I can't just say "always take the highest values in this x area", because then I'd be wildly mixing the noise of both data sets.
4) I have n overlaps of this type, so I need an efficient algorithm. All I can come up with is somewhere around O(n^3): for each overlap, store the data sets in two arrays, and for every combination of data points (x0, y0) and (x1, y1), cycle through until you find the combination with the lowest abs(x1 - x0) AND abs(y1 - y0) (roughly like the sketch below).
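In IDL, that brute-force search would look roughly like the following. The names xa, ya, xb, yb are just placeholders for the points of the two data sets inside one overlap region; this is only meant to illustrate the pairwise search I have in mind, not working code:

```
; Brute-force pairwise search over one overlap region.
; xa, ya and xb, yb hold the points of data sets A and B
; that fall inside the overlap (placeholder names).
best = !VALUES.F_INFINITY
i0 = -1L
j0 = -1L
FOR i = 0L, N_ELEMENTS(xa) - 1 DO BEGIN
  FOR j = 0L, N_ELEMENTS(xb) - 1 DO BEGIN
    ; combined closeness in x and y
    d = ABS(xb[j] - xa[i]) + ABS(yb[j] - ya[i])
    IF d LT best THEN BEGIN
      best = d
      i0 = i   ; closest pair found so far
      j0 = j
    ENDIF
  ENDFOR
ENDFOR
PRINT, 'Closest pair: A[', i0, '], B[', j0, ']'
END
```

That's already quadratic in the number of points for a single overlap, which is why I'm worried about doing it for all n overlaps.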
As I'm not a programmer, I'm completely lost. I also wasn't able to find an algorithm for this problem anywhere; most algorithms assume that the entries in the arrays being compared are exactly equal integers, but I'm working with almost-equal floats.
I'm using IDL, but I'd also be grateful for a general algorithm or at least a tip on what I could try. Thanks!