I need to represent a sequence of events. These events are a little unusual in that they are:
- non-contiguous
- non-overlapping
- of irregular duration
For example:
- 1200 - 1203
- 1210 - 1225
- 1304 - 1502
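To make the three properties concrete, here is a minimal sketch using only the standard library. It assumes the example values are HHMM clock times on a single day; the date and the helper names `parse_event` and `is_valid_sequence` are hypothetical, chosen only for illustration.

```python
from datetime import datetime

# Hypothetical helper: the day is an arbitrary placeholder, and the
# example values are assumed to be HHMM clock times.
def parse_event(start, end, day="2013-01-01"):
    """Turn a pair of HHMM strings into a (start, end) tuple of datetimes."""
    fmt = "%Y-%m-%d %H%M"
    return (datetime.strptime(f"{day} {start}", fmt),
            datetime.strptime(f"{day} {end}", fmt))

# Hypothetical helper: checks the properties listed above.
def is_valid_sequence(events):
    """True if every event has positive duration and each event starts
    strictly after the previous one ends (non-overlapping, non-contiguous)."""
    if any(end <= start for start, end in events):
        return False
    return all(prev_end < next_start
               for (_, prev_end), (next_start, _) in zip(events, events[1:]))

events = [parse_event("1200", "1203"),
          parse_event("1210", "1225"),
          parse_event("1304", "1502")]

# Each event has its own duration; there is no single shared frequency.
durations = [end - start for start, end in events]
```

Plain tuples like these capture the data, but they offer none of the indexing and alignment features that a pandas index would provide, which is the motivation for the question.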
I would like to represent these events using Pandas.PeriodIndex, but I can't figure out how to create Period objects with irregular durations.
I have two questions:
- Is there a way to create Period objects with irregular durations using existing Pandas functionality?
- If not, could you suggest how to modify Pandas in order to provide irregular duration Period objects? (This comment suggests that it might be possible "using custom DateOffset classes with appropriately crafted onOffset, rollforward, rollback, and apply methods".)
Notes
- The docstring for Period suggests that it is possible to specify arbitrary durations like 5T for "5 minutes". I believe this docstring is incorrect. Running pd.Period('2013-01-01', freq='5T') produces the exception ValueError: Only mult == 1 supported. I have reported this issue.
- The "time stamps vs time spans" section in the Pandas documentation states "For regular time spans, pandas uses Period objects for scalar values and PeriodIndex for sequences of spans. Better support for irregular intervals with arbitrary start and end points are forth-coming in future releases." (my emphasis)
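For comparison, later pandas releases added a `pd.IntervalIndex` type for intervals with arbitrary start and end points. The sketch below assumes such a release is installed (it does not apply to the pandas version discussed in this question) and again assumes the example values are HH:MM times on an arbitrary day.

```python
import pandas as pd

# Assumption: the example values are clock times on an arbitrary day.
starts = pd.to_datetime(["2013-01-01 12:00", "2013-01-01 12:10", "2013-01-01 13:04"])
ends = pd.to_datetime(["2013-01-01 12:03", "2013-01-01 12:25", "2013-01-01 15:02"])

# closed="left" means each interval includes its start but not its end.
events = pd.IntervalIndex.from_arrays(starts, ends, closed="left")

# One duration per event; the durations need not be equal.
durations = events.length

# Whether any two intervals in the index share a point.
overlapping = events.is_overlapping
```

This is not a PeriodIndex and carries no freq, so it sidesteps the question of irregular Period objects rather than answering it.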
Update 1
Building a Period with a custom duration looks pretty straightforward. BUT I think the main stumbling block will be persuading PeriodIndex to accept Periods with different freqs, e.g.:
In [93]: pd.PeriodIndex([pd.Period('2000', freq='D'),
                         pd.Period('2001', freq='T')])
ValueError: 2001-01-01 00:00 is wrong freq
It looks like a central assumption in PeriodIndex is that every Period has the same freq.