I have time series data that looks like this:

datetime            generation 
2022-01-31 00:00      1234
2022-01-31 00:15      4930
2022-01-31 00:30      2092
2022-01-31 00:45      20302
2022-01-31 01:00      483
2022-01-31 01:15      4924
2022-01-31 01:30      5970
2022-01-31 01:45      3983

I would like to downsample my data from a 15-minute frequency to a 1-hour frequency, so that the first 4 rows above are summed under the 00:00 timestamp and the next 4 rows are summed under 01:00:

datetime         generation
2022-01-31 00:00 28558
2022-01-31 01:00 15360

Is there an efficient way to make this happen?

Pixel
    If you search Google for the title of your question, you find: https://stackoverflow.com/questions/52885878/how-to-downsampling-time-series-data-in-pandas and https://pandas.pydata.org/docs/reference/api/pandas.Series.resample.html – Jacob Aug 03 '22 at 12:11

1 Answer

Look at pandas.DataFrame.resample, which groups rows into fixed time bins that you can then aggregate:

import pandas as pd

# Sample data at 15-minute intervals
df = pd.DataFrame({
        'datetime':
           ["2022-01-31 00:00:00","2022-01-31 00:15:00","2022-01-31 00:30:00",
            "2022-01-31 00:45:00","2022-01-31 01:00:00","2022-01-31 01:15:00",
            "2022-01-31 01:30:00","2022-01-31 01:45:00"],
        'generation':
           [1234,4930,2092,20302,483,4924,5970,3983]})
# Parse the strings into timestamps and use them as the index
df['datetime'] = pd.to_datetime(df['datetime'])
df = df.set_index('datetime')
# Sum each 1-hour bin (newer pandas releases deprecate 'H' in favor of lowercase 'h')
df.resample('1H').sum()

would result in

                     generation
datetime
2022-01-31 00:00:00       28558
2022-01-31 01:00:00       15360

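All you need is a dataframe with a datetime index: resample bins rows by their timestamps, so the datetime column must be parsed with pd.to_datetime and set as the index first.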
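
If you would rather keep datetime as an ordinary column, resample also accepts an `on` argument naming the column to bin by. The following is a minimal sketch rather than the only way to do it. It assumes the column names from the question, builds the timestamps with `pd.date_range` purely for brevity, and uses the `'60min'` alias because, as far as I know, it is accepted by both older and newer pandas versions. The name `hourly` is just illustrative.

```python
import pandas as pd

# Same eight readings as in the question, spaced 15 minutes apart
df = pd.DataFrame({
    "datetime": pd.date_range("2022-01-31 00:00", periods=8, freq="15min"),
    "generation": [1234, 4930, 2092, 20302, 483, 4924, 5970, 3983],
})

# Bin by the 'datetime' column directly instead of setting it as the index,
# sum each 60-minute bin, then turn the result back into a regular dataframe
hourly = (
    df.resample("60min", on="datetime")["generation"]
      .sum()
      .reset_index()
)
print(hourly)
```

Each bin is labelled with its left edge by default, which is why the four readings from 00:00 to 00:45 end up under 00:00.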

Dima Chubarov