Sample data
import pandas as pd

# Each tuple is one event: (start, end)
data = [('2020-06-23 13:43', '2020-06-24 06:43'),
        ('2020-06-23 18:30', '2020-06-24 11:30'),
        ('2020-06-23 14:57', '2020-06-24 07:07'),
        ('2020-06-23 16:37', '2020-06-24 09:03'),
        ('2020-06-23 21:25', '2020-06-24 12:54')]
df1 = pd.DataFrame(data, columns=['start', 'end'], dtype='datetime64[ns]')
# Hourly periods that the result should be indexed by
df2 = pd.DataFrame(index=pd.period_range(start='2020-06-23 13:00', end='2020-06-24 12:00', freq='H'))
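I want to count parallel events: for each hour in df2, how much of that hour had exactly 0, 1, 2, ... events from df1 running at the same time. My desired output is percentages like this: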
                      p0    p1    p2    p3    p4    p5  m0  m1  m2  m3  m4  m5
2020-06-23 13:00  71.7  28.3   0.0   0.0   0.0   0.0  43  17   0   0   0   0
2020-06-23 14:00   0.0  95.0   5.0   0.0   0.0   0.0   0  57   3   0   0   0
...
2020-06-23 21:00   0.0   0.0   0.0   0.0  41.7  58.3   0   0   0   0  25  35
...
2020-06-24 12:00  10.0  90.0   0.0   0.0   0.0   0.0   6  54   0   0   0   0
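(m0 to m5 are intermediate results: the number of minutes in that hour during which exactly 0 to 5 events were running. What I actually need are p0 to p5, the same values as percentages of the hour.)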
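To make the definition concrete, here is a brute-force check of a single hour in plain Python. It only illustrates what the numbers mean and is not the approach I am after, since it loops over every minute. The names `minutes_by_count` and `max_count` are made up for this sketch, and I am assuming an event counts as running in a minute if start <= minute < end.

```python
from datetime import datetime, timedelta

FMT = '%Y-%m-%d %H:%M'

# Same sample events as above, parsed into (start, end) datetimes.
events = [(datetime.strptime(s, FMT), datetime.strptime(e, FMT)) for s, e in [
    ('2020-06-23 13:43', '2020-06-24 06:43'),
    ('2020-06-23 18:30', '2020-06-24 11:30'),
    ('2020-06-23 14:57', '2020-06-24 07:07'),
    ('2020-06-23 16:37', '2020-06-24 09:03'),
    ('2020-06-23 21:25', '2020-06-24 12:54'),
]]

def minutes_by_count(hour_start, events, max_count=5):
    """Hypothetical helper: return [m0, ..., m5] for the hour starting at hour_start."""
    minutes = [0] * (max_count + 1)
    for i in range(60):
        t = hour_start + timedelta(minutes=i)
        # Assumption: an event is running in minute t if start <= t < end.
        running = sum(start <= t < end for start, end in events)
        minutes[running] += 1
    return minutes

m = minutes_by_count(datetime(2020, 6, 23, 13), events)
p = [round(100 * x / 60, 1) for x in m]
print(m)  # [43, 17, 0, 0, 0, 0]
print(p)  # [71.7, 28.3, 0.0, 0.0, 0.0, 0.0]
```

The first event starts at 13:43, so the 13:00 hour has 43 minutes with no event and 17 minutes with one event, which gives the first row of the table.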
I don't know how to approach this with pandas, but surely it has some nice functions to deal with this?