I have a dataframe (df, with time as index and one column, 'Pt0') that I want to upsample and interpolate with the "nearest neighbor" method.
I have 2 issues:
- When I compute
df = df.resample('1D')
I get a core.resample.DatetimeIndexResampler object, which keeps me from recovering the values of my column (I can still get the index), whereas I only want a dataframe as output. What I don't understand is that applying this command to other dataframes usually gives me a dataframe, not that "core" object.
- If I apply the upsampling and interpolation directly:
df = df.resample('1D').interpolate(method='nearest')
I obtain only NaNs, whereas before I had both NaNs and values.
I don't understand what I am doing wrong, or why a "core" object is created when this same method (df.resample('1D')) gave me dataframes in other cases.
How can I solve this problem?
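To make the first issue concrete, here is a minimal sketch on a made-up three-row frame (the name demo and its values are illustrative, not the real data). It suggests that resample on its own returns a lazy resampler object, and that a dataframe only appears once a method such as asfreq() or mean() is chained onto it. That is presumably what happened in the other cases where a dataframe came back.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing day (2021-01-02)
idx = pd.to_datetime(["2021-01-01", "2021-01-03", "2021-01-04"])
demo = pd.DataFrame({"Pt0": [1.0, 3.0, 4.0]}, index=idx)

# resample() alone only describes the grouping; it holds no resampled values yet
resampler = demo.resample("1D")
print(type(resampler).__name__)

# Chaining a method turns the resampler into a DataFrame
as_frame = resampler.asfreq()   # values on the daily grid, NaN where no row matched
daily_mean = resampler.mean()   # one mean per daily bin, NaN for empty bins
print(as_frame)
```

Which method to chain depends on the goal: asfreq() keeps only rows whose timestamp falls exactly on the new grid, while mean() (or first(), last(), ...) aggregates every row that falls inside each bin.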
PS: df has no duplicates in its index, because it was built specifically to avoid them (duplicates are dropped and their value is replaced by the nanmean of the duplicates).
Here is the dataframe:
df
Out[174]:
                                  Pt0
1984-06-10 00:00:00.096000064   -42.0
1984-07-20 00:00:00.176000000     NaN
1984-07-28 00:00:00.192000000   -26.0
1984-10-08 00:00:00.336000064   -12.0
1984-10-16 00:00:00.352000000    -5.0
...
2021-04-05 08:48:28.559141120  -248.0
2021-04-05 08:48:29.059141120  -318.0
2021-04-19 20:36:46.060141056  -311.0
2021-05-04 03:02:44.279659008  -254.0
2021-05-29 02:55:17.930625024  -286.0
[529 rows x 1 columns]
Code to reproduce my issues:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Pt0': [np.nan, -42.0, np.nan, np.nan, -26.0, np.nan, np.nan, np.nan, 0.0, -10.0]},
                  index=['1984-06-10 00:00:00.096000064', '1984-06-10 00:00:00.096000064',
                         '1984-07-20 00:00:00.176000000', '1984-07-20 00:00:00.176000000',
                         '1984-07-28 00:00:00.192000000', '1984-07-28 00:00:00.192000000',
                         '1984-09-06 00:00:00.080000000', '1984-09-06 00:00:00.080000000',
                         '1984-09-06 00:00:00.271999936', '1984-09-06 00:00:00.271999936'])
df.index = pd.to_datetime(df.index)
# Replace the values at duplicated timestamps by their nanmean, then drop the duplicates
df = df.groupby(level=0)['Pt0'].transform(np.nanmean).drop_duplicates().to_frame()
df2 = df.resample('1D')  # issue 1: a DatetimeIndexResampler, not a dataframe
df3 = df.resample('1D').interpolate('nearest')  # issue 2: only NaNs
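A hedged note on the second issue: resample('1D').interpolate(...) appears to first place the data on the midnight-aligned daily grid, much like asfreq(), and none of the timestamps above fall exactly on midnight, so there may be no values left to interpolate from. This behaviour may differ between pandas versions. One possible workaround, sketched below on a three-row excerpt of the data, is to drop the NaN rows and reindex onto a daily index with method='nearest'. The names valid, daily_index and nearest are illustrative only.

```python
import numpy as np
import pandas as pd

# Three-row excerpt of the data shown above
idx = pd.to_datetime([
    "1984-06-10 00:00:00.096000064",
    "1984-07-20 00:00:00.176000000",
    "1984-07-28 00:00:00.192000000",
])
df = pd.DataFrame({"Pt0": [-42.0, np.nan, -26.0]}, index=idx)

# Keep only real observations so NaN rows are never picked as "nearest"
valid = df.dropna()

# Midnight-aligned daily grid spanning the observations
daily_index = pd.date_range(valid.index.min().floor("D"),
                            valid.index.max().ceil("D"), freq="D")

# Each day takes the value of the closest observation in time
# (reindex with method= requires a sorted, duplicate-free index)
nearest = valid.reindex(daily_index, method="nearest")
print(nearest.head())
```

Unlike interpolate(method='nearest'), this sketch does not need SciPy. If far-away observations should not be propagated, reindex also accepts a tolerance argument (for example pd.Timedelta('10D')) that leaves days beyond that distance as NaN.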