I have a very large dataframe (~1 million rows) created with dask.dataframe, with the following structure (note that the 'Timestamp' column is actually the index of the dataframe, and that the seconds are truncated as a result of copying and pasting from an Excel sheet):
Dask Index Structure:
npartitions=49
2022-06-30 19:43:30 datetime64[ns]
2022-07-01 01:46:43 ...
...
2022-07-13 04:17:22 ...
2022-07-13 10:50:46 ...
Name: Timestamp, dtype: datetime64[ns]
Dask Name: sort_index, 196 tasks
I want to interpolate the values of col1 through col2 at exact one-second intervals, given that the spacing between adjacent timestamps is unequal; sometimes the index increments by 2 s, sometimes by 3 s, etc. Here is my attempt:
interpolated = pd.DataFrame(columns=df.columns)
for col in df.columns:
    interpolated[col] = df[col].resample('S').interpolate(method='polynomial', order=2)
And here is the error message:
Traceback (most recent call last):
  File "c:\Users\username\Desktop\anomaly_detector.py", line 33, in <module>
    main()
  File "c:\Users\username\Desktop\anomaly_detector.py", line 18, in main
    data_sampler(10)
  File "c:\Users\username\Desktop\utilities.py", line 895, in data_sampler
    interpolated[col] = df[col].resample('S').interpolate(method='polynomial',
AttributeError: 'Resampler' object has no attribute 'interpolate'
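To make the desired result concrete, here is a minimal pandas-only sketch of the kind of output I am after. It is only an illustration under assumptions: `interpolate_to_seconds` and the sample values are hypothetical, and it swaps in time-weighted linear interpolation (`method="time"`) for the order-2 polynomial in my attempt, because the polynomial method needs SciPy. The error above suggests that the Dask `Resampler` does not expose `interpolate` the way the pandas one does, so this sketch works on a plain pandas DataFrame.

```python
import pandas as pd


def interpolate_to_seconds(pdf: pd.DataFrame) -> pd.DataFrame:
    """Interpolate every column of a pandas DataFrame onto a 1-second grid.

    Assumes `pdf` has a sorted, unique DatetimeIndex (hypothetical helper).
    """
    # Regular 1-second grid spanning the data.
    grid = pd.date_range(pdf.index.min().ceil("s"),
                         pdf.index.max().floor("s"),
                         freq=pd.Timedelta(seconds=1))
    # Keep the original timestamps alongside the grid so they can serve
    # as anchors for the interpolation.
    combined = pdf.reindex(pdf.index.union(grid))
    # method="time" weights by the actual elapsed time between samples,
    # which handles the uneven 2 s / 3 s spacing.
    filled = combined.interpolate(method="time")
    # Return only the rows that fall on the 1-second grid.
    return filled.loc[grid].rename_axis(pdf.index.name)


# Made-up sample with uneven spacing (2 s, then 3 s).
idx = pd.to_datetime(["2022-06-30 19:43:30",
                      "2022-06-30 19:43:32",
                      "2022-06-30 19:43:35"])
sample = pd.DataFrame({"col1": [0.0, 2.0, 5.0],
                       "col2": [10.0, 14.0, 20.0]}, index=idx)
sample.index.name = "Timestamp"

result = interpolate_to_seconds(sample)
print(result)
```

With Dask, a function like this could presumably be applied per partition through `map_partitions`, but the gaps between the last row of one partition and the first row of the next would then not be interpolated and would need separate handling.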