Matching peaks in similar spectra in Python

Question

I have a series of many thousands of (1D) spectra corresponding to different repetitions of an experiment. For each repetition, the same data has been recorded by two different instruments, so I have two very similar spectra, each consisting of a few hundred individual peaks/events. The instruments have different resolutions, precisions and likely detection efficiencies, so the two spectra in each pair are similar but not identical; looking at them closely by eye, one can confidently match many of the peaks. I want to automatically and reliably match the two spectra in each pair, i.e. confidently say which peak corresponds to which. This will likely involve 'throwing away' some data which can't be confidently matched (e.g. when only one of the two instruments detects an event).

I've attached images of what the data look like over an entire spectrum and zoomed into a relatively sparse region. The red spectrum has essentially already been peak-found, such that it is 0 everywhere apart from where a real event is. I have used scipy.signal.find_peaks() on the blue trace and plotted the found peaks, which seems to work well.

Now I just need to find a reliable method to match peaks between the spectra. I have tried matching peaks by simply pairing those which are closest to each other, but this runs into significant issues because some peaks are not present in both spectra. I could add constraints on how close peaks must be to be matched, but I think there are probably better ways out there. There are also issues arising from the red trace having a lower resolution than the blue. I expect there are pattern-finding algorithms or Python packages that would be better suited to this, but this is far from my area of expertise, so I don't really know where to start. Thanks in advance.
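For concreteness, the closest-peak pairing with a distance constraint described above could be sketched roughly as follows. This is only a minimal illustration, not a recommended solution: the function name `match_peaks`, the `max_dist` tolerance, the mutual-nearest-neighbour check and the example positions are all assumptions made for the sketch, and both sets of peak positions are assumed to be on the same axis.

```python
import numpy as np

def match_peaks(pos_a, pos_b, max_dist):
    """Pair peaks that are mutual nearest neighbours and at most max_dist apart.

    Hypothetical helper: returns a list of (index_in_a, index_in_b) tuples.
    Peaks without a confident partner are simply left unmatched.
    """
    pos_a = np.asarray(pos_a, dtype=float)
    pos_b = np.asarray(pos_b, dtype=float)
    if pos_a.size == 0 or pos_b.size == 0:
        return []
    # Pairwise absolute distances, shape (len(pos_a), len(pos_b)).
    dist = np.abs(pos_a[:, None] - pos_b[None, :])
    nearest_b = dist.argmin(axis=1)  # for each peak in A, its nearest peak in B
    nearest_a = dist.argmin(axis=0)  # for each peak in B, its nearest peak in A
    pairs = []
    for i, j in enumerate(nearest_b):
        # Keep the pair only if the choice is mutual and within tolerance.
        if nearest_a[j] == i and dist[i, j] <= max_dist:
            pairs.append((i, int(j)))
    return pairs

# Made-up peak positions, purely for illustration.
peaks_a = [10.0, 25.0, 40.0, 90.0]
peaks_b = [10.5, 41.0, 60.0]
pairs = match_peaks(peaks_a, peaks_b, max_dist=2.0)
print(pairs)
```

The mutual check drops a peak whose nearest neighbour prefers a different partner, which is one simple way to discard events seen by only one instrument. It builds a full distance matrix, so for a few hundred peaks per spectrum it stays small.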

Zoomed-in view of a relatively sparse region of an example pair of spectra:

An entire example pair of spectra, showing some very dense regions:

Example code to plot the spectra:

import matplotlib.pyplot as plt
from scipy.signal import find_peaks

# spectra1_list, spectra2_list and spectra2_axis are assumed to be loaded already
for i in range(10):
    spectra1 = spectra1_list[i]  # blue trace (higher resolution)
    spectra2 = spectra2_list[i]  # red trace (already peak-found)
    fig, ax1 = plt.subplots(1, 1, figsize=(12, 8))
    # Find peaks in the blue trace
    peaks, properties = find_peaks(spectra1, height=(6, None), threshold=(None, None), distance=2, prominence=(5, None))
    ax1.plot(spectra1)
    ax1.plot(spectra2_axis, spectra2, color='red')
    ax1.plot(peaks, spectra1[peaks], "x")
    plt.show()

[This paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4529286/pdf/nihms428459.pdf) may be of interest. — DrBwts, May 17 '19 at 14:40

score 2 · Answer 1 · answered May 17 '19 at 14:13

Deep learning perspective: you could train a pair of neural networks using a cycle-consistency loss, where mapping from signal A to signal B and back again should return you to the point you started from on your signal.

A good start would be to read about CycleGAN, which uses this idea to change the style of images.

Admittedly, this would be a bit of a research project and might take some time before it works robustly.
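To make the cycle-loss idea a little more concrete, here is a minimal NumPy sketch of just the loss term. It is not a working model: the two "networks" are hypothetical stand-ins (plain linear maps `w_ab` and `w_ba`), the spectra are random placeholders of an assumed common length `n`, and the adversarial part of CycleGAN and all training are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16  # assumed length of each (resampled) spectrum

# Hypothetical stand-ins for the two networks: simple linear maps.
w_ab = rng.normal(scale=0.1, size=(n, n))  # instrument A -> instrument B
w_ba = rng.normal(scale=0.1, size=(n, n))  # instrument B -> instrument A

def cycle_loss(a, b, w_ab, w_ba):
    """L1 cycle-consistency loss: A -> B -> A and B -> A -> B
    should each land back on the spectrum they started from."""
    a_cycled = w_ba @ (w_ab @ a)  # A -> B -> A
    b_cycled = w_ab @ (w_ba @ b)  # B -> A -> B
    return np.mean(np.abs(a_cycled - a)) + np.mean(np.abs(b_cycled - b))

# Random placeholder spectra, purely for illustration.
spec_a = rng.random(n)
spec_b = rng.random(n)
loss = cycle_loss(spec_a, spec_b, w_ab, w_ba)
print(loss)
```

In an actual setup the linear maps would be replaced by trainable networks and this term would be minimised alongside the other losses; with perfectly inverse maps the loss is zero.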
