I'm working with some time series data and need to filter out some garbage. The goal is to keep the timestamps and interpolate over the data that is junk.

I've tried just filtering it out and reindexing, but Python doesn't seem to treat datetime indices the same way.

So I tried:

ogIndex = df.index
df = df[df[col_to_filter] > threshold]   # drops the filtered rows along with their timestamps
df.reindex(ogIndex)

...but that didn't work.

1 Answer

Assuming the timestamps are the DataFrame's index: instead of dropping the junk rows with a boolean filter, invert the filter (so that it selects the rows you do not want to keep) and set those rows to NaN:

import numpy as np

df[df[col_to_filter] <= threshold] = np.nan  # col_to_filter and threshold are placeholders; sets the matching rows to NaN

This preserves the index and leaves the junk rows as NaN gaps that can be filled by interpolation. Afterwards, you can use an interpolation method, e.g. something like this:

df = df.interpolate(method='linear', limit_direction='forward', axis=0)  # returns a new DataFrame, so assign the result
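
To make the two steps concrete, here is a minimal sketch on made-up data. The column name "value", the daily timestamps, and the cutoff of 0 are assumptions chosen only for illustration; swap in your own column and filter.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: daily readings where -999.0 marks junk values.
index = pd.date_range("2023-01-01", periods=6, freq="D")
df = pd.DataFrame({"value": [1.0, 2.0, -999.0, 4.0, -999.0, 6.0]}, index=index)

threshold = 0.0  # assumed cutoff: anything <= 0 is treated as junk

# Mask the junk rows instead of dropping them, so every timestamp stays in place.
df[df["value"] <= threshold] = np.nan

# interpolate() returns a new DataFrame; the NaN gaps are filled linearly.
clean = df.interpolate(method="linear", limit_direction="forward", axis=0)

print(clean)
```

Note that method='linear' treats the rows as equally spaced. If your timestamps are irregular, method='time' weights the interpolation by the actual time gaps and may be a better fit.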
lukasboh