
How to make something like groupby_dynamic that supports a user-defined index

groupby_dynamic works on a time index and can be used like a resample operation.

However, as far as I can tell, it only produces non-overlapping windows. For example, given

time
day1   9:00
day1 15:00
day2  9:00
day2  15:00
day3  9:00
day3 15:00

a dynamic groupby with a 1-day window gives:


day1  9:00
day1 15:00
--------------
day2  9:00
day2  15:00
-------------
day3  9:00
day3 15:00

The feature I am asking for is a more user-defined dynamic groupby, where the windows may overlap so that the same rows can appear in more than one group:

day1  9:00
day1 15:00

day2  9:00
day2  15:00
-------------
day2  9:00
day2  15:00
day3  9:00
day3 15:00
--------------

I can use rolling on a Series, but rolling_apply wastes a lot of time because it rolls over every row:

day1  9:00
day1 15:00

day2  9:00
day2  15:00
-------------
day1 15:00
day2  9:00
day2  15:00
day3  9:00      
--------------  -------> this window is useless
day2  9:00
day2  15:00
day3  9:00
day3  15:00
-------------

day2  15:00
day3  9:00
day3  15:00
day4  9:00   
------------  -------> this window is useless

example pic

yutiansut

1 Answer


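The solution is to pass different values for the every and period arguments.

  • every sets how often a new window starts, and so determines the index of the output.

  • period sets the length of each window. When period is larger than every, consecutive windows overlap.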
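To make the interplay of the two parameters concrete, here is a minimal plain-Python sketch of the windowing idea. It is not how Polars implements groupby_dynamic; the helper name overlapping_windows and the midnight alignment of the first window are assumptions made purely for illustration, and the windows here are left-closed.

```python
from datetime import datetime, timedelta


def overlapping_windows(times, every, period):
    """Hypothetical helper: group sorted timestamps into windows that
    start every `every` and span `period` (left-closed, right-open)."""
    if not times:
        return []
    # Assumption: align the first window to midnight of the first timestamp.
    start = times[0].replace(hour=0, minute=0, second=0, microsecond=0)
    windows = []
    while start <= times[-1]:
        end = start + period
        members = [t for t in times if start <= t < end]
        if members:
            windows.append((start, members))
        start += every  # step by `every`, not by one row
    return windows


# Two rows per day (9:00 and 15:00) over three days, as in the question.
times = [datetime(2021, 12, d, h) for d in (16, 17, 18) for h in (9, 15)]
windows = overlapping_windows(
    times, every=timedelta(days=1), period=timedelta(days=2)
)
for start, members in windows:
    print(start.date(), [t.strftime("%d %H:%M") for t in members])
```

With every set to one day and period set to two days, this yields one window per day rather than one per row, so the half-day-shifted windows that a row-wise rolling would produce never appear. The last window is only partially filled because no later data exists.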

Examples

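import datetime

import polars as pl

df = pl.DataFrame(
    {
        "time": pl.date_range(
            low=datetime.datetime(2021, 12, 16),
            high=datetime.datetime(2021, 12, 22),
            interval="12h",
        ),
        # 13 timestamps at 12h steps over 6 days, one count per row
        "n": [1] * 13,
    }
)

# every="1d" starts a new window each day; period="2d" makes each window span two days
df.groupby_dynamic(
    "time",
    every="1d",
    period="2d",
    include_boundaries=True,
    truncate=False,
    closed="right",
).agg(pl.col("n").sum())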
RiveN
yutiansut