0

I have a dataframe with different id and possible overlapping time with the time step of 0.4 second. I would like to resample the average speed for each id with the time step of 0.8 second.

    time  id  speed
0    0.0   1      0
1    0.4   1      3
2    0.8   1      6
3    1.2   1      9
4    0.8   2     12
5    1.2   2     15
6    1.6   2     18

An example can be created by the following code

x = np.hstack((np.array([1] * 10), np.array([3] * 15)))
a = np.arange(10)*0.4
b = np.arange(15)*0.4 + 2
t = np.hstack((a, b))

df = pd.DataFrame({"time": t, "id": x})
df["speed"] = pd.DataFrame(np.arange(25) * 3)

The time column is transferred to datetime type by

df["re_time"] = pd.to_datetime(df["time"], unit='s')

WZhao
  • 303
  • 4
  • 12

1 Answers1

0

Try with groupby:

block_size = int(0.8//0.4)

blocks = df.groupby('id').cumcount() // block_size

df.groupby(['id',blocks]).agg({'time':'first', 'speed':'mean'})

Output:

      time  speed
id               
1  0   0.0    1.5
   1   0.8    7.5
   2   1.6   13.5
   3   2.4   19.5
   4   3.2   25.5
3  0   2.0   31.5
   1   2.8   37.5
   2   3.6   43.5
   3   4.4   49.5
   4   5.2   55.5
   5   6.0   61.5
   6   6.8   67.5
   7   7.6   72.0
Quang Hoang
  • 146,074
  • 10
  • 56
  • 74
  • Thanks. I tried many ways to use `resample` and `groupby`, but cannot get the result that I expect. – WZhao Feb 18 '21 at 16:55