Pandas datetimeindex intersection with more than two indexes

Question

Is it possible to use a groupby operation with Pandas datetimeindex? When grouping the index below, I would like to return the date intersection for each of the groupby objects. I know there is pandas.DatetimeIndex.intersection - however this seems best suited for comparisons of only two datetimeindexes. The data I am working with may have zero to many rows with various date ranges I am examining for a date intersection.

# example data
df = pd.DataFrame([['2020-01-01', '2020-12-31'], 
                   ['2020-01-01', '2020-02-29'],
                   ['2020-01-01', '2020-12-31'], 
                   ['2020-06-01', '2020-12-31'],
                   ['2020-08-01', '2020-10-31']], columns=['start', 'end'], index=list('AABBB'))

# apply period range for each row
df['period_range'] = df.apply(lambda x: pd.date_range(x['start'], x['end']), axis=1)

# example intersection I would like to see for index A
pd.date_range('2020-01-01', '2020-12-31').intersection(pd.date_range('2020-01-01', '2020-2-28'))

expected result

A - datetime index representing the months of Jan - Feb

B - datetime index representing the months of Aug - Oct

`df.groupby(level=0)` ? Then what do you expect to have happen per group? What would your expected output look like? — Henry Ecker, Jun 03 '21 at 01:50
@HenryEcker, I have added an example of the sort of intersection I would like to see between the datetimeindex. — mrk.jhsn, Jun 03 '21 at 02:04

score 2 · Accepted Answer · answered Jun 03 '21 at 02:07

2

In your case do with Month after date_range , then we find the intersection of list

df['period_range'] = df.apply(lambda x: pd.date_range(x['start'], x['end'],freq='m').strftime('%B'), axis=1)
out = df.groupby(level=0)['period_range'].apply(lambda x : set.intersection(*map(set,x)))
A             {January, February}
B    {September, October, August}
Name: period_range, dtype: object

answered Jun 03 '21 at 02:07

BENY

317,841
20
164
234

exactly what I was looking for @BENY! I played around with set.intersection(), however I couldn't figure out how to unpack an unknown number of lists. – mrk.jhsn Jun 03 '21 at 02:20
@mrk.jhsn happy coding ~ – BENY Jun 03 '21 at 02:20
1

This is impressive. +1 – Scott Boston Jun 03 '21 at 02:22
I think I'd been down the networkx library to solve this. Nevermind, using set math and map does it. – Scott Boston Jun 03 '21 at 02:23
1

@ScottBoston ah, set is more easy to achieve this , however, still a time costing solution for large data set. – BENY Jun 03 '21 at 02:32

Pandas datetimeindex intersection with more than two indexes

expected result

1 Answers1