Currently im working on a script to fill in missing values in a timeline. The dataframe is looking thusfar, values are examples:
timestamp id value
2016-01-01 01:00:00 1 10
2016-01-01 02:00:00 1 15
2016-01-01 03:00:00 1 12
2016-01-01 04:00:00 1 NAN
2016-01-01 05:00:00 1 NAN
2016-01-01 06:00:00 1 7
2016-01-01 07:00:00 1 9
2016-01-01 01:00:00 2 10
2016-01-01 02:00:00 2 12
2016-01-01 03:00:00 2 14
2016-01-01 04:00:00 2 16
2016-01-01 05:00:00 2 8
2016-01-01 06:00:00 2 11
2016-01-01 07:00:00 2 14
2016-01-01 01:00:00 3 10
2016-01-01 02:00:00 3 12
2016-01-01 03:00:00 3 14
2016-01-01 04:00:00 3 12
2016-01-01 05:00:00 3 8
2016-01-01 06:00:00 3 9
2016-01-01 07:00:00 3 12
What i would like is to fill in the NAN values by looking at the rise in values by the other object's. So for example if object id 2 and 3 rise 50% on average from 3:00:00 to 4:00:00 i can use the value 12 in this example and multiply it by 1.5. Im working in python at the moment using pandas.
I can run a function through the column 'value' checking if its NAN. But im sorta stuck how to go from there.
Basicly i want a function that when it finds a missing value to take the timestamp, then for that timestamp find the values the other objects. So in this example its missing data at 4:00:00, it will grab 16 of object 2 and 12 of object 3. Once it done that it will compare the found value to the value before this timestamp and divide by this. So 16/14 and 12/14. the average of this is 1, which i want multiplied by the value before the missing value. So 1.00 * 12.00. It has then filled in the timestamp at 4:00:00 and continue at 5:00:00