2

I am using the scipy peakfinder scipy.signal.find_peaks_cwt to find peaks in a signal. All peaks are found reliably, but I always get additional results (so far all of them were at the end of the signal) that are not peaks. I am wondering why this happens...

Here is a complete example with synthetic data:

from scipy.signal import find_peaks_cwt
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline # for jupyter notebooks

x = np.arange(0, 15, 0.1)
y = np.sin(x)
plt.plot(y)

peakinds = find_peaks_cwt(y, np.arange(1, 5))
plt.plot(peakinds, y[peakinds], 'o')

(To run in a normal python shell, umcomment the %matplotlib inline and add a plt.show() to the end)

Plot with peaks marked as dots:

enter image description here

(The last three dots should not be there)

With my real data the same thing happens:

enter image description here

(Here the last dot is wrong)

Why is this happening?

Johannes
  • 3,300
  • 2
  • 20
  • 35

1 Answers1

4

Your widths parameter in find_peaks_cwt is the problem.

from scipy.signal import find_peaks_cwt
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 15, 0.1)
y = np.sin(x)
fig0 = plt.figure()
ax0 = fig0.add_subplot(111)
ax0.plot(y)

peakinds = find_peaks_cwt(y, np.arange(1, 10))  # Here changed 5 to 10
ax0.plot(peakinds, y[peakinds], 'o')
plt.axis([0, 160, -1.1, 1.1])

enter image description here

From the documentation:

widths : sequence 1-D array of widths to use for calculating the CWT matrix. In general, this range should cover the expected width of peaks of interest.

Edit:

The default wavelet used is the Ricker wavelet. Basically, a convolution is performed between the signal and the wavelet at all specified widths (by calling ricker(width[i]). Therefore, the range you give has to go from small (to precisely localize the peak) to big enough (for detecting the peaks of interest) but not too big (to avoid aliasing - let's remember here that wavelet work in the frequency domain).

Excerpt of the documentation: The algorithm is as follows: 1 - Perform a continuous wavelet transform on vector, for the supplied widths. This is a convolution of vector with wavelet(width) for each width in widths.

If you change widths by np.arange(10, 20), you will notice that the peaks are detected but their maximum is not well-localized (we are missing the fine scales). If you try again with np.arange(1, 20), the peaks are better localized.

Also, if you want to visualize the ricker wavelet:

from scipy.signal import ricker
vec = ricker(100, 10)  # (nb_of_points, frequency)
fig0 = plt.figure()
ax0 = fig0.add_subplot(111)
ax0.plot(vec)

Edit 2:

As for the extra peak erroneously detected at the end of the signal, this is most probably due to the border effect. Basically, the window for the convolution goes beyond the last sample of the signal. Usually, a padding (zero-padding, signal wrapping,...) is done on the signal but depending on how it is done (or not done at all) this kind of error can occur. Discarding the first few and last points is generally appropriate when working with these types of methods.

Eskapp
  • 3,419
  • 2
  • 22
  • 39
  • I noticed that playing around with the width can help, however I can't figure out how to get reliable values. Try `np.arange(1, 50)` for instance, then you get the second peak twice. And 50 *should* be ok, since the peaks are wider than 50. – Johannes Aug 25 '17 at 16:09
  • I specified a bit more my understanding of this function. Hope that helps. – Eskapp Aug 25 '17 at 17:09
  • That helped me understand the function a lot, thank you! Still I am not sure how to explain the last peak in my real-data plot. I was using `width=np.arange(5, 20)` – Johannes Aug 26 '17 at 18:26
  • 1
    @Johannes It's hard to tell without the data in hand... It's possible that there are some border effects at the ends of the signal. I would automatically discard results for the first and last few points. – Eskapp Aug 28 '17 at 13:28