
Is there a way to group a time series into 30-minute (or x-minute) intervals across all dates? My question is very similar to this one. If I wanted to group by the hour of the day, I would just need:

import pandas as pd

data=pd.DataFrame({'Values':  1,'Date':pd.date_range('01-01-2017',periods=3600,freq='T')})
data.set_index(['Date'],inplace=True)
grouped=data.groupby(data.index.hour)

Running a for-loop and printing the last group gives me:

for time,group in grouped:
    print(group)   
                     Values
Date                       
2017-01-01 23:00:00       1
2017-01-01 23:01:00       1
2017-01-01 23:02:00       1
2017-01-01 23:03:00       1
2017-01-01 23:04:00       1
2017-01-01 23:05:00       1
2017-01-01 23:06:00       1
2017-01-01 23:07:00       1
2017-01-01 23:08:00       1
2017-01-01 23:09:00       1
2017-01-01 23:10:00       1
2017-01-01 23:11:00       1
2017-01-01 23:12:00       1
2017-01-01 23:13:00       1
2017-01-01 23:14:00       1
2017-01-01 23:15:00       1
2017-01-01 23:16:00       1
2017-01-01 23:17:00       1
2017-01-01 23:18:00       1
2017-01-01 23:19:00       1
2017-01-01 23:20:00       1
2017-01-01 23:21:00       1
2017-01-01 23:22:00       1
2017-01-01 23:23:00       1
2017-01-01 23:24:00       1
2017-01-01 23:25:00       1
2017-01-01 23:26:00       1
2017-01-01 23:27:00       1
2017-01-01 23:28:00       1
2017-01-01 23:29:00       1
                    ...
2017-01-02 23:30:00       1
2017-01-02 23:31:00       1
2017-01-02 23:32:00       1
2017-01-02 23:33:00       1
2017-01-02 23:34:00       1
2017-01-02 23:35:00       1
2017-01-02 23:36:00       1
2017-01-02 23:37:00       1
2017-01-02 23:38:00       1
2017-01-02 23:39:00       1
2017-01-02 23:40:00       1
2017-01-02 23:41:00       1
2017-01-02 23:42:00       1
2017-01-02 23:43:00       1
2017-01-02 23:44:00       1
2017-01-02 23:45:00       1
2017-01-02 23:46:00       1
2017-01-02 23:47:00       1
2017-01-02 23:48:00       1
2017-01-02 23:49:00       1
2017-01-02 23:50:00       1
2017-01-02 23:51:00       1
2017-01-02 23:52:00       1
2017-01-02 23:53:00       1
2017-01-02 23:54:00       1
2017-01-02 23:55:00       1
2017-01-02 23:56:00       1
2017-01-02 23:57:00       1
2017-01-02 23:58:00       1
2017-01-02 23:59:00       1

But there is no times.30min command.

EDIT: I am trying to group every thirty minutes across all the dates. I want output exactly like the command above, but for an interval of x minutes. The default grouped = df.groupby(pd.TimeGrouper('30T')) does not work, because it groups each date separately. So if I have minute timestamps for, say, 2017-01-01 and 2017-01-02, pd.TimeGrouper('30T') puts 0:00-0:30 on 2017-01-01 and 0:00-0:30 on 2017-01-02 into separate groups. I would like them combined.

Bobe Kryant

2 Answers


Use pd.TimeGrouper

grouped = df.groupby(pd.TimeGrouper('30T'))

Demo

import numpy as np
import pandas as pd

tidx = pd.date_range('2017-03-01', periods=240, freq='T')

df = pd.DataFrame(np.random.randint(10, size=(len(tidx), 3)), tidx, list('ABC'))

df.groupby(pd.TimeGrouper('30T')).sum()

                       A    B    C
2017-03-01 00:00:00  134  115  145
2017-03-01 00:30:00  139  113  130
2017-03-01 01:00:00  159  123  147
2017-03-01 01:30:00  115  143  156
2017-03-01 02:00:00  144  145  134
2017-03-01 02:30:00  142  143  166
2017-03-01 03:00:00  152  154  130
2017-03-01 03:30:00  121  121  170
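
On recent pandas releases pd.TimeGrouper is, to my knowledge, no longer available (it was deprecated in favor of pd.Grouper). A minimal sketch of the same demo with pd.Grouper follows. The seeded generator and the 'min' frequency alias are my own choices for reproducibility and compatibility, not part of the original answer, so the sums will differ from the table above.

```python
import numpy as np
import pandas as pd

# Same shape as the demo above: 240 one-minute rows, three integer columns.
tidx = pd.date_range('2017-03-01', periods=240, freq='min')
rng = np.random.default_rng(0)  # seed is an assumption, used only for reproducibility
df = pd.DataFrame(rng.integers(10, size=(len(tidx), 3)),
                  index=tidx, columns=list('ABC'))

# pd.Grouper(freq=...) bins the DatetimeIndex into consecutive 30-minute windows.
result = df.groupby(pd.Grouper(freq='30min')).sum()
print(result)
```

Like the original, this produces one row per 30-minute window of each date, so it does not combine the same time of day across different dates.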
piRSquared

Okay, I think you want to do this instead:

data=pd.DataFrame({'Values':  1,'Date':pd.date_range('01-01-2017',periods=3600,freq='T')})
data.set_index(['Date'],inplace=True)

grouped = data.groupby([data.index.hour, data.index.minute // 15])
# This groups each day into 15-minute intervals, combined across all dates.
for name, group in grouped:
    print(group)

Output (start of the first group):

                     Values
Date                       
2017-01-01 00:00:00       1
2017-01-01 00:01:00       1
2017-01-01 00:02:00       1
2017-01-01 00:03:00       1
2017-01-01 00:04:00       1
2017-01-01 00:05:00       1
2017-01-01 00:06:00       1
2017-01-01 00:07:00       1
2017-01-01 00:08:00       1
2017-01-01 00:09:00       1
2017-01-01 00:10:00       1
2017-01-01 00:11:00       1
2017-01-01 00:12:00       1
2017-01-01 00:13:00       1
2017-01-01 00:14:00       1
2017-01-02 00:00:00       1
2017-01-02 00:01:00       1
2017-01-02 00:02:00       1
2017-01-02 00:03:00       1
2017-01-02 00:04:00       1
2017-01-02 00:05:00       1
2017-01-02 00:06:00       1
2017-01-02 00:07:00       1
2017-01-02 00:08:00       1
2017-01-02 00:09:00       1
2017-01-02 00:10:00       1
2017-01-02 00:11:00       1
2017-01-02 00:12:00       1
2017-01-02 00:13:00       1
2017-01-02 00:14:00       1
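
The hour and minute // 15 key above can be generalized to an arbitrary interval of x minutes by bucketing on minutes since midnight. The following is a rough sketch under that assumption; the names x, minute_of_day, and bucket are illustrative and not from the original answer.

```python
import pandas as pd

data = pd.DataFrame({'Values': 1,
                     'Date': pd.date_range('2017-01-01', periods=3600, freq='min')})
data = data.set_index('Date')

x = 30  # bucket width in minutes (hypothetical parameter)

# Minutes elapsed since midnight, integer-divided by the bucket width.
# The date is ignored, so 00:00-00:29 of every day falls into bucket 0.
minute_of_day = data.index.hour * 60 + data.index.minute
bucket = minute_of_day // x

grouped = data.groupby(bucket)
sizes = grouped.size()
print(sizes.head())
```

With x = 30 this gives 48 groups, one per half hour of the day. If x does not divide 1440 evenly, the last bucket of the day simply covers fewer minutes than the others.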

You can use pd.Grouper or resample. (Note: pd.TimeGrouper has no advantages over pd.Grouper.)

Method 1: use pd.Grouper:

df = pd.DataFrame({'Values':  1,'Date':pd.date_range('01-01-2017',periods=3600,freq='T')})

df.groupby(pd.Grouper(freq='15T',key='Date')).sum()

Output:

                     Values
Date                       
2017-01-01 00:00:00      15
2017-01-01 00:15:00      15
2017-01-01 00:30:00      15
2017-01-01 00:45:00      15
2017-01-01 01:00:00      15

Method 2: use resample:

df.resample('15T', on='Date').sum()

Output:

                     Values
Date                       
2017-01-01 00:00:00      15
2017-01-01 00:15:00      15
2017-01-01 00:30:00      15
2017-01-01 00:45:00      15
2017-01-01 01:00:00      15

Alternatively, for both methods: if your dates are in the index with a datetime dtype (that is, you have a DatetimeIndex), omit the key and on parameters in Method 1 and Method 2, respectively.
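
As a small sketch of that DatetimeIndex variant, assuming the same sample frame with the Date column moved into the index (the variable names indexed, by_grouper, and by_resample are illustrative, and '15min' is used here as the frequency alias):

```python
import pandas as pd

df = pd.DataFrame({'Values': 1,
                   'Date': pd.date_range('2017-01-01', periods=3600, freq='min')})
indexed = df.set_index('Date')  # the dates now live in a DatetimeIndex

# With a DatetimeIndex, neither key= nor on= is needed.
by_grouper = indexed.groupby(pd.Grouper(freq='15min')).sum()
by_resample = indexed.resample('15min').sum()
print(by_grouper.head())
```

Both calls should yield the same 15-minute sums per date as Method 1 and Method 2.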

Scott Boston