
I'm a beginner pandas/Python user. I am working with 24-hour data in a pandas DataFrame, but there is often no data for the last few minutes of the day.

I simply need to append rows to each file until the last timestamp reaches 23:59, and forward fill those last few minutes with data. So this:

    19-12-2016 00:00    2   0.003232323
    ...
    19-12-2016 23:53    2   0.002822919
    19-12-2016 23:54    4   0.002822919
    19-12-2016 23:55    1   0.002822919

becomes:

    19-12-2016 00:00    2   0.003232323
    ...
    19-12-2016 23:53    2   0.002822919
    19-12-2016 23:54    4   0.002822919
    19-12-2016 23:55    1   0.002822919
    19-12-2016 23:56    1   0.002822919
    19-12-2016 23:57    1   0.002822919
    19-12-2016 23:58    1   0.002822919
    19-12-2016 23:59    1   0.002822919

Unfortunately, the code I am using for this is really long, and I can't pinpoint exactly where I could amend it.

warrenfitzhenry

2 Answers


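You can reindex your data onto a complete one-minute index covering the whole day (1440 minutes):

    # 'min' is the minute alias ('T' is deprecated in recent pandas)
    idx = pd.date_range('2016-12-19', periods=1440, freq='min')
    df = df.reindex(idx)

Then forward fill the missing values, for example with df['mycol'] = df['mycol'].ffill() for a single column, or df = df.ffill() for the whole frame. Note that ffill() returns a new object, so the result has to be assigned back. This assumes the DataFrame is indexed by a DatetimeIndex.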
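As a minimal sketch of the reindex-then-fill approach, the example below builds a made-up day of minute data that stops at 23:55, as in the question. The column names `count` and `value` and the sample values are assumptions for illustration only, not the asker's real data.

```python
import pandas as pd

# Hypothetical sample: one day of minute data that stops at 23:55.
index = pd.date_range("2016-12-19 00:00", "2016-12-19 23:55", freq="min")
df = pd.DataFrame({"count": 1, "value": 0.002822919}, index=index)

# Complete index for the day: 1440 minutes, 00:00 through 23:59.
full_day = pd.date_range("2016-12-19", periods=1440, freq="min")

# reindex adds the missing minutes as NaN rows;
# ffill then copies the last observed row (23:55) into them.
filled = df.reindex(full_day).ffill()

print(filled.tail(5))
```

One side effect worth knowing: reindexing introduces NaN, so integer columns come back as floats. If that matters, they can be cast back after filling, for instance with `filled["count"].astype(int)`.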

ℕʘʘḆḽḘ

A generic solution for multiple days of data in a single frame might look something like this. Get the start and end dates, reindex the entire frame onto a complete one-minute range between them, and forward fill the missing values.

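    # Midnight of the first day, and midnight after the last day
    start = df.index.min().normalize()
    end = df.index.max().normalize() + pd.Timedelta(days=1)
    # inclusive='left' drops the final midnight (pandas >= 1.4; older versions used closed='left')
    df = df.reindex(pd.date_range(start, end, freq='min', inclusive='left')).ffill()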
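To illustrate the multi-day case, here is a small sketch with two made-up days that each stop a few minutes before midnight. The `value` column and its contents are hypothetical. The range is built with both endpoints and the last element is sliced off, which is one way to avoid depending on whether the installed pandas expects `closed` or `inclusive`.

```python
import pandas as pd

# Hypothetical sample: two days, each ending a few minutes short of midnight.
day1 = pd.date_range("2016-12-19 00:00", "2016-12-19 23:55", freq="min")
day2 = pd.date_range("2016-12-20 00:00", "2016-12-20 23:57", freq="min")
index = day1.append(day2)
df = pd.DataFrame({"value": range(len(index))}, index=index)

# Midnight of the first day, and midnight after the last day.
start = df.index.min().normalize()
end = df.index.max().normalize() + pd.Timedelta(days=1)

# Every minute from start up to, but not including, the final midnight.
full = pd.date_range(start, end, freq="min")[:-1]

# Missing minutes become NaN rows, then take the last observed value.
filled = df.reindex(full).ffill()

print(filled.loc["2016-12-19 23:54":"2016-12-20 00:01"])
```

Keep in mind that the forward fill applies to every gap in the range, not only the end of each day: missing minutes in the middle of a day, or at the start of a later day, are also filled from the previous observation.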
Ted Petrou