Pandas Dataframe Pivot and reindex n timeseries

Question

I have a pandas dataframe containing n time series in the same Datetime column, each one associated to a different Id, with a corresponding value associated. I would like to pivot the table and reindex to the nearest timestamp. Notice that there can be cases where a timestamp is missing, as in Id-3, in this case the value would need to become NaN.

Datetime       Id Value
5-26-17 8:00   1  2.3
5-26-17 8:30   1  4.5
5-26-17 9:00   1  7
5-26-17 9:30   1  8.1
5-26-17 10:00  1  7.9
5-26-17 10:30  1  3.4
5-26-17 11:00  1  2.1
5-26-17 11:30  1  1.8
5-26-17 12:00  1  0.4
5-26-17 8:02   2  2.6
5-26-17 8:32   2  4.8
5-26-17 9:02   2  7.3
5-26-17 9:32   2  8.4
5-26-17 10:02  2  8.2
5-26-17 10:32  2  3.7
5-26-17 11:02  2  2.4
5-26-17 11:32  2  2.1
5-26-17 12:02  2  0.7
5-26-17 8:30   3  4.5
5-26-17 9:00   3  7
5-26-17 9:30   3  8.1
5-26-17 10:00  3  7.9
5-26-17 10:30  3  3.4
5-26-17 11:00  3  2.1
5-26-17 11:30  3  1.8
5-26-17 12:00  3  0.4

Expected results:

Datetime      Id-1 Id-2 Id-3
5-26-17 8:00  2.3  2.6  NaN
5-26-17 8:30  4.5  4.8  4.5
5-26-17 9:00  7    7.3  7
5-26-17 9:30  8.1  8.4  8.1
5-26-17 10:00 7.9  8.2  7.9
5-26-17 10:30 3.4  3.7  3.4
5-26-17 11:00 2.1  2.4  2.1
5-26-17 11:30 1.8  2.1  1.8
5-26-17 12:00 0.4  0.7  0.4

How would you do this?

use pd.pivot_table. http://pbpython.com/pandas-pivot-table-explained.html — CK Chen, May 28 '18 at 08:44

jezrael · Answer 1 · 2018-05-28T08:58:12.507

I believe need convert column to datetimes and floor by 30 minutes by floor, last pivot and add_prefix:

df['Datetime'] = pd.to_datetime(df['Datetime']).dt.floor('30T')
df = df.pivot('Datetime','Id','Value').add_prefix('Id-')
print (df)
Id                   Id-1  Id-2  Id-3
Datetime                             
2017-05-26 08:00:00   2.3   2.6   NaN
2017-05-26 08:30:00   4.5   4.8   4.5
2017-05-26 09:00:00   7.0   7.3   7.0
2017-05-26 09:30:00   8.1   8.4   8.1
2017-05-26 10:00:00   7.9   8.2   7.9
2017-05-26 10:30:00   3.4   3.7   3.4
2017-05-26 11:00:00   2.1   2.4   2.1
2017-05-26 11:30:00   1.8   2.1   1.8
2017-05-26 12:00:00   0.4   0.7   0.4

Another solution is use resample with mean:

df['Datetime'] = pd.to_datetime(df['Datetime'])

df = (df.set_index('Datetime')
        .groupby('Id')
        .resample('30T')['Value']
        .mean().unstack(0)
        .add_prefix('Id-'))

print (df)
Id                   Id-1  Id-2  Id-3
Datetime                             
2017-05-26 08:00:00   2.3   2.6   NaN
2017-05-26 08:30:00   4.5   4.8   4.5
2017-05-26 09:00:00   7.0   7.3   7.0
2017-05-26 09:30:00   8.1   8.4   8.1
2017-05-26 10:00:00   7.9   8.2   7.9
2017-05-26 10:30:00   3.4   3.7   3.4
2017-05-26 11:00:00   2.1   2.4   2.1
2017-05-26 11:30:00   1.8   2.1   1.8
2017-05-26 12:00:00   0.4   0.7   0.4

Pandas Dataframe Pivot and reindex n timeseries

1 Answers1