I want to complete my time serie of % humidity with missing records (or rows). Sensors are designed to record a mean value each 15min, so that is my target frequency. Here an example for one station (not the best in terms of gaps...) but I have 36 stations of measurements, 6 parameters and more than 24 000 records each to homogenize.
I choose columns of datetime and % humidity for example :
humdt = data["la-salade"][["datetime","humidite"]]
datetime humidite
0 2019-07-09 08:30:00 87
1 2019-07-09 11:00:00 87
2 2019-07-09 17:30:00 82
3 2019-07-09 23:30:00 80
4 2019-07-11 06:15:00 79
5 2019-07-19 14:30:00 39
I set datetime as an index : (so far it works)
humdt["datetime"] = pd.to_datetime(humdt["datetime"])
humdt = humdt.set_index("datetime",drop=True)
humidite
datetime
2019-07-09 08:30:00 87
2019-07-09 11:00:00 87
2019-07-09 17:30:00 82
2019-07-09 23:30:00 80
2019-07-11 06:15:00 79
2019-07-19 14:30:00 39
Beside this, I prepare a datetime range matching my expectations (15min frequency) :
date_rng = pd.period_range(start=debut, end=fin, freq='15min').strftime('%Y-%m-%d %H:%M:%S')
date_rng = pd.DataFrame(date_rng)
date_rng.columns = ["datetime"]
Then, I use this range to reindex my humidity values (expecting NaN when missing) :
humdt = humdt.reindex(pd.DatetimeIndex(date_rng["datetime"]))
humidite
datetime
2019-07-09 08:30:00 87.0
2019-07-09 08:45:00 88.0
2019-07-09 09:00:00 88.0
2019-07-09 09:15:00 88.0
2019-07-09 09:30:00 89.0
2019-07-09 09:45:00 89.0
2019-07-09 10:00:00 88.0
2019-07-09 10:15:00 88.0
2019-07-09 10:30:00 88.0
2019-07-09 10:45:00 88.0
2019-07-09 11:00:00 87.0
As a result, I get values of humidity from nowhere... not even a classical linear interpolation (ex : between 87% at 08H30 and 87% at 11H00). Please, help me, I have no clue what is going on... (also tried to merge and resampling, as here behavior is not as expected). Thank you !