I've got a pandas dataframe like this. This contains a timestamp
, id
, foo
and bar
.
The timestamp
data is around every 10 minutes.
timestamp id foo bar
2019-04-14 00:00:10 1 0.10 0.05
2019-04-14 00:10:02 1 0.30 0.10
2019-04-14 00:00:00 2 0.10 0.05
2019-04-14 00:10:00 2 0.30 0.10
For each id
, I'd like to create 5
additional rows
with timestamp
split equally between successive rows
, and foo
& bar
values containing random
values between the successive rows
.
The start time should be the earliest timestamp
for each id
and the end time should be the latest timestamp
for each id
So the output would be like this.
timestamp id foo bar
2019-04-14 00:00:10 1 0.10 0.05
2019-04-14 00:02:10 1 0.14 0.06
2019-04-14 00:04:10 1 0.11 0.06
2019-04-14 00:06:10 1 0.29 0.07
2019-04-14 00:08:10 1 0.22 0.09
2019-04-14 00:10:02 1 0.30 0.10
2019-04-14 00:00:00 2 0.80 0.50
2019-04-14 00:02:00 2 0.45 0.48
2019-04-14 00:04:00 2 0.52 0.42
2019-04-14 00:06:00 2 0.74 0.48
2019-04-14 00:08:00 2 0.41 0.45
2019-04-14 00:10:00 2 0.40 0.40
I can reindex the timestamp
column and create additional timestamp
rows (eg. Pandas create new date rows and forward fill column values).
But I can't seem to wrap my head around how to compute the random values for foo
and bar
between the successive rows.
Appreciate if someone can point me in the right direction!