I have data in a pandas DataFrame with entries at the per-second level over the course of a few hours. Entries are indexed by a datetime column named TIMESTAMP. I would like to group all data within each minute and do some calculations and manipulations. That is, I would like to take all data from 09:00:00 to 09:00:59 and report some things about what happened in that minute. I would then like to do the same calculations and manipulations from 09:01:00 to 09:01:59, and so on through to the end of my dataset.

I've been fiddling around with groupby() and .resample() but have had no success so far. I can think of a very inelegant way to do it with a series of for loops and if statements, but I was wondering if there is an easier way.

havingaball

1 Answer


You didn't provide any data or code, so I'll just make some up. You also don't specify what calculations you want to do, so I'm just taking the mean:

>>> import numpy as np
>>> import pandas as pd
>>> # Four hours of made-up per-second data
>>> dates = pd.date_range("1/1/2020 00:00:00", "1/1/2020 04:00:00", freq="S")
>>> values = np.random.random(len(dates))
>>> df = pd.DataFrame({"dates": dates, "values": values})
>>> df.resample("1Min", on="dates").mean().reset_index()
                  dates    values
0   2020-01-01 00:00:00  0.486985
1   2020-01-01 00:01:00  0.454880
2   2020-01-01 00:02:00  0.467397
3   2020-01-01 00:03:00  0.543838
4   2020-01-01 00:04:00  0.502764
..                  ...       ...
236 2020-01-01 03:56:00  0.478224
237 2020-01-01 03:57:00  0.460435
238 2020-01-01 03:58:00  0.508211
239 2020-01-01 03:59:00  0.415030
240 2020-01-01 04:00:00  0.050993

[241 rows x 2 columns]
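If you want more than one statistic per minute, the resampler also accepts `.agg()` with a list of functions. A minimal sketch on the same kind of made-up data (the `per_minute` name and the choice of statistics are assumptions, since you didn't say which calculations you need):

```python
import numpy as np
import pandas as pd

# Same shape of made-up data: one value per second over four hours
dates = pd.date_range("1/1/2020 00:00:00", "1/1/2020 04:00:00", freq="S")
values = np.random.random(len(dates))
df = pd.DataFrame({"dates": dates, "values": values})

# One row per minute, several statistics computed at once
per_minute = (
    df.resample("1Min", on="dates")["values"]
      .agg(["mean", "min", "max", "count"])
      .reset_index()
)
print(per_minute.head())
```

The `count` column is a quick sanity check that each bucket really holds 60 seconds of data (the final 04:00:00 bucket holds only the single endpoint row). If your real data uses a DatetimeIndex rather than a column, drop the `on="dates"` argument.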
ddejohn