Possible approach to the problem - take a single row from every 10-minute interval (freq='10T'
means 10 minutes).
import pandas as pd
# sample data
import numpy as np
df = pd.DataFrame({
'shipcode': [1] * 5 + [2] * 5 + [1] * 5 + [2] * 5,
'time': np.array([1.01e18 + 8.3e10 * val for val in range(20)]).astype('datetime64')})
# print(df)
# floor 'time' values with a step of 10 minutes
df['floored_time'] = df['time'].dt.floor(freq='10T')
# groupby to remove duplicates of 'shipcode', 'floored_time' pairs
dfg = df.groupby(['shipcode', 'floored_time']).agg({'time': 'min'})
# construct resulting table
dfg['shipcode'] = dfg.index.get_level_values(0)
ans = dfg[['shipcode', 'time']].sort_values(by='time').reset_index(drop=True)
# print(ans)
Sample data:
shipcode time
0 1 2002-01-02 19:33:20
1 1 2002-01-02 19:34:43
2 1 2002-01-02 19:36:06
3 1 2002-01-02 19:37:29
4 1 2002-01-02 19:38:52
5 2 2002-01-02 19:40:15
6 2 2002-01-02 19:41:38
7 2 2002-01-02 19:43:01
8 2 2002-01-02 19:44:24
9 2 2002-01-02 19:45:47
10 1 2002-01-02 19:47:10
11 1 2002-01-02 19:48:33
12 1 2002-01-02 19:49:56
13 1 2002-01-02 19:51:19
14 1 2002-01-02 19:52:42
15 2 2002-01-02 19:54:05
16 2 2002-01-02 19:55:28
17 2 2002-01-02 19:56:51
18 2 2002-01-02 19:58:14
19 2 2002-01-02 19:59:37
Result:
shipcode time
0 1 2002-01-02 19:33:20
1 2 2002-01-02 19:40:15
2 1 2002-01-02 19:47:10
3 1 2002-01-02 19:51:19
4 2 2002-01-02 19:54:05