I scraped some of your data to show that you can find the following points on the whole dataset without a sliding window (although you could use one, in theory):
- Local extrema (find peaks in raw data)
- Max Steepness (find peaks in 1st derivative)
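- Strongest bends, where the slope changes fastest (find peaks in 2nd derivative); strictly speaking, inflexion points are where the 2nd derivative changes sign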
First, let's look at calculating the derivatives:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("Default Dataset.csv",
                 sep=';',
                 decimal=",",
                 header=None)
### Interpolate linearly onto an evenly spaced grid ###
x_new = np.linspace(0, df[0].iloc[-1], 2000)
y_new = np.interp(x_new, df[0], df[1])
### First and second derivative (finite differences, padded with a leading 0) ###
diff1 = np.insert(np.diff(y_new), 0, 0)
diff2 = np.insert(np.diff(diff1), 0, 0)
### Plot everything ###
plt.figure(figsize=(12,3))
plt.subplot(131)
plt.plot(x_new, y_new)
plt.subplot(132)
plt.plot(x_new, diff1)
plt.subplot(133)
plt.plot(x_new, diff2)
plt.tight_layout()
Here, I also interpolate to get equal spacing between the data points. With equal spacing, np.diff is proportional to the derivative, which is all the peak search needs. After differentiating, I use the np.insert function to insert a 0 at position 0, so that each derivative has the same shape as the raw data.
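
As a minimal sketch of the shape handling described above, the snippet below compares the padded np.diff approach with np.gradient, which keeps the input length and divides by the spacing. The sine data and the names x, y, dx and grad1 are assumptions made for illustration; they are not part of your dataset.

```python
import numpy as np

# Hypothetical, evenly spaced sample data (not the original dataset)
x = np.linspace(0.0, 2.0 * np.pi, 200)
y = np.sin(x)
dx = x[1] - x[0]

# Approach from the answer: np.diff drops one sample, so pad with a leading 0
diff1 = np.insert(np.diff(y), 0, 0)

# Alternative: np.gradient returns an array of the same length as y
# and already accounts for the spacing of x
grad1 = np.gradient(y, x)

# Dividing the raw differences by dx turns them into a derivative estimate
diff1_scaled = diff1 / dx
```

Either variant has its peaks in the same places, so the choice only matters if you need the derivative in real units.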

Next, we will find the peaks:
import peakutils as pu
ix_abs = pu.indexes(y_new, thres=0.5, min_dist=15)
ix_diff1 = pu.indexes(diff1, thres=0.5, min_dist=15)
ix_diff2 = pu.indexes(diff2, thres=0.5, min_dist=15)
plt.scatter(x_new[ix_abs], y_new[ix_abs], color='g', label='abs')
plt.scatter(x_new[ix_diff1], y_new[ix_diff1], color='r', label='first deriv')
plt.scatter(x_new[ix_diff2], y_new[ix_diff2], color='purple', label='second deriv')
plt.plot(x_new, y_new)
plt.legend(loc='best')

I am using the peakutils package because it works nicely in almost all cases. As you can see, not all of the points indicated in your example were found. You can play around with the threshold (thres) and minimum distance (min_dist) parameters to find a better solution, but this should be a good starting point for further research. Indeed, the minimum distance parameter would give you the desired sliding window.
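
To get a feeling for how the two parameters interact, here is a simplified, numpy-only sketch of a peak finder with a normalized threshold and a minimum distance. The function simple_peaks and the array y_demo are hypothetical names made up for this illustration. It is not the peakutils source, only an approximation of the idea, so results on real data may differ.

```python
import numpy as np

def simple_peaks(y, thres=0.5, min_dist=1):
    """Illustrative peak finder (hypothetical helper, not the peakutils code)."""
    y = np.asarray(y, dtype=float)
    # Normalized threshold: a fraction of the data range above the minimum
    level = y.min() + thres * (y.max() - y.min())
    # Candidates: higher than the left neighbour, not lower than the right
    # neighbour, and above the threshold level
    inner = y[1:-1]
    cand = np.where((inner > y[:-2]) & (inner >= y[2:]) & (inner > level))[0] + 1
    # Visit candidates from highest to lowest and keep one only if every
    # already kept peak is more than min_dist samples away
    kept = []
    for i in cand[np.argsort(y[cand])[::-1]]:
        if all(abs(int(i) - j) > min_dist for j in kept):
            kept.append(int(i))
    return np.array(sorted(kept), dtype=int)

# Made-up signal with peaks at indices 1, 3 and 8
y_demo = np.array([0, 1, 0, 0.9, 0, 0, 0, 0, 1.2, 0], dtype=float)

close_ok = simple_peaks(y_demo, thres=0.5, min_dist=1)    # all three peaks
spaced = simple_peaks(y_demo, thres=0.5, min_dist=3)      # index 3 is suppressed by index 1
strict = simple_peaks(y_demo, thres=0.9, min_dist=1)      # only the tallest peak
```

A larger min_dist acts like a wider window in which only the tallest peak survives, while a larger thres discards low peaks regardless of their position.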