Counting number of spikes in a graph in python

Question

With dataset df I plotted a graph looking like the following:

df

Time    Temperature
8:23:04     18.5
8:23:04     19
9:12:57     19
9:12:57     20
9:12:58     20
9:12:58     21
9:12:59     21
9:12:59     23
9:13:00     23
9:13:00     25
9:13:01     25
9:13:01     27
9:13:02     27
9:13:02     28
9:13:03     28

Graph(Overall)

Zooming in on the data shows more detail:

I would like to count the number of activations of this temperature measurement device, each of which causes the temperature to rise sharply. I have defined an activation as follows:

Let T0, T1, T2, T3 be the temperatures at times t=0, t=1, t=2, t=3, and let d0 = T1-T0, d1 = T2-T1, d2 = T3-T2, ... be the differences between adjacent values.

If

1) d0 ≥ 0 and d1 ≥ 0 and d2 ≥ 0, and

2) T2- T0 > max(d0, d1, d2), and

3) T2-T0 < 30 seconds

then this is considered an activation. I want to count how many activations there are in total. What's a good way to do this?
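For reference, a literal reading of the three conditions can be sketched as below. This is only a sketch under stated assumptions: condition 3 is taken to mean the elapsed time between the readings for T0 and T2 (the unit is seconds), overlapping windows that all qualify are counted as a single activation, and the function name `count_activations`, the `max_span` parameter, and the sample readings are hypothetical, not taken from the dataset above.

```python
import numpy as np

def count_activations(temps, seconds, max_span=30.0):
    """Count activations under one literal reading of conditions 1-3."""
    temps = np.asarray(temps, dtype=float)
    seconds = np.asarray(seconds, dtype=float)
    if temps.size < 4:
        return 0  # a window needs four readings: T0, T1, T2, T3

    d = np.diff(temps)
    d0, d1, d2 = d[:-2], d[1:-1], d[2:]  # one triple per sliding window

    # Condition 1: three consecutive non-negative differences
    cond1 = (d0 >= 0) & (d1 >= 0) & (d2 >= 0)

    # Condition 2: T2 - T0 exceeds the largest single step in the window
    rise = temps[2:-1] - temps[:-3]
    cond2 = rise > np.maximum(np.maximum(d0, d1), d2)

    # Condition 3 (ASSUMPTION): elapsed time from T0 to T2 is under max_span
    span = seconds[2:-1] - seconds[:-3]
    cond3 = span < max_span

    hits = cond1 & cond2 & cond3

    # ASSUMPTION: a run of consecutive qualifying windows is one activation,
    # so only the first window of each run is counted.
    previous = np.concatenate(([False], hits[:-1]))
    return int((hits & ~previous).sum())

# Hypothetical readings, one per second
temps = [20, 21, 23, 24, 24, 24, 23, 25, 27, 28, 28]
seconds = list(range(len(temps)))
print(count_activations(temps, seconds))  # 2
```

Note that with this reading, condition 2 can only hold when both d0 and d1 are strictly positive, so a series where every other difference is zero (as in the sample rows above) would yield no activations.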

Thanks.

Please mark the spikes on the graph. I could easily count 7 or 8 well-defined spikes, or up to 13 smaller ones. — lenik, Feb 05 '20 at 01:53
Try this: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html — run-out, Feb 05 '20 at 02:10
What is the issue, exactly? Have you tried anything, done any research? — AMC, Feb 05 '20 at 02:11
@chris I tried doing this by comparing 3 differences of adjacent values and making sure they are all > 0. But I am not sure how accurate it is, and I wonder if there is a better way. — nilsinelabore, Feb 05 '20 at 02:48
@MadPhysicist thanks for the suggestion. I will have a look at it. — nilsinelabore, Feb 05 '20 at 05:07

score 1 · Accepted Answer · answered Feb 05 '20 at 02:47

There could be a number of different, valid answers depending on how a spike is defined.

Assuming you just want the indices where the temperature increases significantly, one simple method is to look for very large jumps in value, above some threshold. The threshold can be calculated from the mean absolute difference of the data, which gives a rough approximation of where the significant variations occur. Here's a basic implementation:

import numpy as np

# Data
x = np.array([0, 1, 2, 50, 51, 52, 53, 100, 99, 98, 97, 96, 10, 9, 8, 80])

# Differences between adjacent values
xdiff = np.diff(x)

# Mean absolute change
xdiff_mean = np.abs(xdiff).mean()

# Flag upward jumps larger than the mean absolute change plus a margin of 1
spikes = xdiff > xdiff_mean + 1
print(x[1:][spikes])  # prints [ 50 100  80]
print(np.where(spikes)[0] + 1)  # prints [ 3  7 15]
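Since the question asks for a count rather than indices, the same threshold can be wrapped in a small function. This is a minimal sketch: the name `count_spikes` and the `margin` parameter are hypothetical, and it assumes that several flagged jumps in a row belong to one spike, which may or may not match how you define a spike.

```python
import numpy as np

def count_spikes(values, margin=1.0):
    """Count upward jumps larger than the mean absolute change plus a margin."""
    values = np.asarray(values, dtype=float)
    diffs = np.diff(values)
    if diffs.size == 0:
        return 0  # fewer than two samples, nothing to compare

    threshold = np.abs(diffs).mean() + margin
    flagged = diffs > threshold

    # ASSUMPTION: consecutive flagged jumps form a single spike,
    # so only the first jump of each run is counted.
    previous = np.concatenate(([False], flagged[:-1]))
    return int((flagged & ~previous).sum())

x = [0, 1, 2, 50, 51, 52, 53, 100, 99, 98, 97, 96, 10, 9, 8, 80]
print(count_spikes(x))  # 3
```

If every flagged jump should count separately, `np.count_nonzero(flagged)` would do instead of the run grouping.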

You could also use outlier rejection, which would be much more clever than this basic comparison to the mean difference. There are lots of answers on how to do that, for example: Can scipy.stats identify and mask obvious outliers?
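As one illustration of the outlier idea, the sketch below flags upward jumps that lie above Tukey's interquartile-range fence (Q3 + 1.5 × IQR) of the differences. This is just one common outlier rule swapped in for illustration, not necessarily what the linked answers use; the function name `upward_outlier_jumps` and the parameter `k` are hypothetical.

```python
import numpy as np

def upward_outlier_jumps(values, k=1.5):
    """Return indices in `values` that follow an unusually large upward jump."""
    values = np.asarray(values, dtype=float)
    diffs = np.diff(values)
    if diffs.size == 0:
        return np.array([], dtype=int)

    # Tukey fence: anything above Q3 + k * IQR is treated as an outlier
    q1, q3 = np.percentile(diffs, [25, 75])
    upper_fence = q3 + k * (q3 - q1)

    # +1 converts an index into diffs to the index of the later sample
    return np.flatnonzero(diffs > upper_fence) + 1

x = [0, 1, 2, 50, 51, 52, 53, 100, 99, 98, 97, 96, 10, 9, 8, 80]
print(upward_outlier_jumps(x))  # indices 3, 7, 15
```

Unlike the mean-based threshold, the quartiles are barely affected by a few very large jumps or drops, so the cutoff should be less sensitive to the spikes themselves.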

Hi Michael, thanks for the answer. I just realised that I had a bit of misunderstanding about the dataset and oversimplified the problem. As such, I've edited the question and added some pseudo code. But it'd be great if you could keep your original code as I feel it provides some very good insight too. Thanks a lot. Much appreciate your contribution. — nilsinelabore, Feb 05 '20 at 05:00
