I am trying to find a solution to the following problem. I have a set of points which should model a sum of two Gaussian functions centered at different points, and I need to find these two centers. Up to now my approach has been to find the centroid of the whole set and cut the set of data below and above it. I then calculate the centroid of each piece, and those are my centers. This approach, however, cuts off the information of, say, the left Gaussian that leaks into the right half of the data, which makes the procedure fail when the Gaussians are close together. Is there a way to do this more intelligently? Due to the computational difficulty, I would prefer a solution that does not involve curve fitting.
Viewed 894 times

Severin Pappadeux

Ivan Burbano
- I have a Python example of fitting two Lorentzian peaks to Raman spectroscopy of carbon nanotubes, using a genetic algorithm to automatically supply initial parameter estimates, here: https://bitbucket.org/zunzuncode/RamanSpectroscopyFit Replace the data and equations with your own and it might work rather easily. – James Phillips Oct 16 '18 at 11:03
- For frequency finding in FFTs there is a method using Gaussian window functions. In logarithmic scale you can calculate the centre from two consecutive bins. This, however, works for a single peak only, and works best close to the maximum; you could try points further off the peak. But if two Gaussians mix strongly, it is hard to see how to avoid fitting. I don't see a computational difficulty, though. – mikuszefski Oct 16 '18 at 14:40
- This is a mixture of Gaussians model. The conventional way to find the parameters of this kind of model is the so-called expectation-maximization algorithm, which is just a generalization of the cutting-and-fitting procedure you already have. A web search for "mixture of Gaussians expectation-maximization" should find a lot of resources. There are packages in many programming languages which implement this algorithm. – Robert Dodier Oct 16 '18 at 16:26
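For reference, that expectation-maximization procedure can be sketched in a few lines of NumPy for a two-component 1-D mixture. Note the assumption: the data points are samples drawn from the mixture density, not (x, y) samples of a curve. The function name and the crude quantile initialization are made up for illustration:

```python
import numpy as np

def em_two_gaussians(x, iters=200):
    # crude initialization: quantiles for the means, overall spread for s
    m = np.array([np.quantile(x, 0.25), np.quantile(x, 0.75)])
    s = np.array([np.std(x), np.std(x)])
    w = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        # (the common 1/sqrt(2 pi) factor cancels in the normalization)
        d = w * np.exp(-(x[:, None] - m)**2 / (2 * s**2)) / s
        r = d / d.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, standard deviations
        n = r.sum(axis=0)
        w = n / len(x)
        m = (r * x[:, None]).sum(axis=0) / n
        s = np.sqrt((r * (x[:, None] - m)**2).sum(axis=0) / n)
    return m, s, w

# synthetic samples from two overlapping Gaussians
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-2.1, 1.2, 4000),
                    rng.normal(0.7, 1.4, 3000)])
m, s, w = em_two_gaussians(x)
print(np.sort(m))   # should land near the true centres -2.1 and 0.7
```

Libraries such as scikit-learn implement the same idea (with better initialization and convergence checks), so in practice one would rarely hand-roll this loop.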
1 Answer
As the OP does not show any data, it is not clear how noisy the data is, nor how "close together" is defined here. The following is a simple approximation that works for low noise, under the assumption that the left-hand side of the data is dominated by the left Gaussian and the right-hand side by the right Gaussian. This imposes some restrictions on position, height, and especially standard deviation.
It works exactly for a single peak, and is quite OK for mixed double peaks (within the above-mentioned restrictions).
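Why three points determine the centre: taking logarithms, ln y(x) = const - (x - x0)^2 / (2 s^2), so l12 = ln(y1/y2) = (x2 - x1)(x2 + x1 - 2 x0) / (2 s^2), and similarly for l13. In the ratio (l12/l13) * (x3 - x1)/(x2 - x1) = (x1 + x2 - 2 x0)/(x1 + x3 - 2 x0) both s and the amplitude drop out, and solving this linear equation for x0 gives exactly the expression in get_x0. A quick standalone check of that closed form (noiseless, single peak):

```python
import numpy as np

# same closed form as get_x0 in the answer, written with the
# intermediate ratio r made explicit
def get_x0( x1, x2, x3, y1, y2, y3 ):
    l12 = np.log( y1 / y2 )
    l13 = np.log( y1 / y3 )
    r = l12 / l13 * ( x3 - x1 ) / ( x2 - x1 )
    return ( ( x2 + x1 ) / 2. - r * ( x3 + x1 ) / 2. ) / ( 1 - r )

# three noiseless samples on the flank of a single Gaussian;
# the normalisation constant cancels in the log-ratios
x0_true, s, a = -2.1, 1.2, 8.0
xs = x0_true + s * np.array( [ -2.0, -1.2, -0.5 ] )
ys = a * np.exp( -( xs - x0_true )**2 / ( 2 * s**2 ) )
print( get_x0( *xs, *ys ) )   # recovers -2.1 up to rounding
```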
#!/usr/bin/env python3
import matplotlib.pyplot as plt
import numpy as np

def gaussian( x, x0, s, a ):
    return a * np.exp( -( x - x0 )**2 / ( 2 * s**2 ) ) / np.sqrt( 2 * np.pi * s**2 )

def get_x0( x1, x2, x3, y1, y2, y3 ):
    # centre of a single Gaussian from three samples on it
    l12 = np.log( y1 / y2 )
    l13 = np.log( y1 / y3 )
    return ( ( x2 + x1 ) / 2. - ( x3 + x1 ) / 2. * l12 / l13 * ( x3 - x1 ) / ( x2 - x1 ) ) / ( 1 - l12 / l13 * ( x3 - x1 ) / ( x2 - x1 ) )

fig = plt.figure()
ax = fig.add_subplot( 2, 1, 1 )

# single peak
xL = np.linspace( -8, 8, 150 )
yL = np.fromiter( ( gaussian( x, -2.1, 1.2, 8 ) for x in xL ), float )

marker = [ 10, 15, 20 ]            # three points on the left flank
x1, x2, x3 = xL[ marker ]
y1, y2, y3 = yL[ marker ]
print( get_x0( x1, x2, x3, y1, y2, y3 ) )
ax.plot( xL, yL )
ax.scatter( [ x1, x2, x3 ], [ y1, y2, y3 ] )

# mixed double peak
bx = fig.add_subplot( 2, 1, 2 )
yL = np.fromiter( ( gaussian( x, -2.1, 1.2, 8 ) + gaussian( x, 0.7, 1.4, 6 ) for x in xL ), float )

marker = [ 10, 15, 20 ]            # left flank -> left centre
x1, x2, x3 = xL[ marker ]
y1, y2, y3 = yL[ marker ]
bx.scatter( [ x1, x2, x3 ], [ y1, y2, y3 ] )
print( get_x0( x1, x2, x3, y1, y2, y3 ) )

marker = [ -20, -25, -30 ]         # right flank -> right centre
x1, x2, x3 = xL[ marker ]
y1, y2, y3 = yL[ marker ]
bx.scatter( [ x1, x2, x3 ], [ y1, y2, y3 ] )
print( get_x0( x1, x2, x3, y1, y2, y3 ) )

bx.plot( xL, yL )
plt.show()
Shows:
#Single
-2.0999999999999455
#Double
-2.0951188129317813
0.6998760921436634
which is pretty close to -2.1 and 0.7.
In case of noise, some averaging might be required.
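One way such averaging might look (a sketch; the noise level, index range, and triplet spacing are made up for illustration): estimate the centre from many overlapping triplets on the same flank and take the median:

```python
import numpy as np

def gaussian( x, x0, s, a ):
    return a * np.exp( -( x - x0 )**2 / ( 2 * s**2 ) ) / np.sqrt( 2 * np.pi * s**2 )

def get_x0( x1, x2, x3, y1, y2, y3 ):
    l12 = np.log( y1 / y2 )
    l13 = np.log( y1 / y3 )
    r = l12 / l13 * ( x3 - x1 ) / ( x2 - x1 )
    return ( ( x2 + x1 ) / 2. - r * ( x3 + x1 ) / 2. ) / ( 1 - r )

rng = np.random.default_rng( 42 )
xL = np.linspace( -8, 8, 150 )
yL = gaussian( xL, -2.1, 1.2, 8 ) + gaussian( xL, 0.7, 1.4, 6 )
yL = yL + rng.normal( 0, 0.001, xL.shape )   # small additive noise

# overlapping triplets on the left flank, where y is still large
# enough that the noise does not dominate the logarithms
lo, hi, step = 28, 42, 5   # roughly x in [ -5, -3.5 ]
ests = [ get_x0( xL[ i ], xL[ i + step ], xL[ i + 2 * step ],
                 yL[ i ], yL[ i + step ], yL[ i + 2 * step ] )
         for i in range( lo, hi ) ]
print( np.median( ests ) )   # should land close to the left centre -2.1
```

The median is preferable to the mean here, since a triplet whose log-ratios nearly cancel can produce a wild outlier.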

mikuszefski