
I am new to Python programming. I have a time series data set with one row per second, which starts at 9:15 am and ends at 3:30 pm each day. I am trying to downsample it to a 1-minute timeframe.

Example of original data set:

                             Px_NIFTY 20140130 0.0 FF  Px_NIFTY 20140130 4500.0 CE  \
 Time                                                                         
2014-01-01 09:15:01               6364.329167                          NaN   
2014-01-01 09:15:02               6366.776471                          NaN   
2014-01-01 09:15:03               6367.158824                       1854.0   
2014-01-01 09:15:04               6368.134211                       1854.0   
2014-01-01 09:15:05               6367.355000                          NaN   
...                                       ...                          ...   
2014-01-31 15:29:55                       NaN                          NaN   
2014-01-31 15:29:56                       NaN                          NaN   
2014-01-31 15:29:57                       NaN                          NaN   
2014-01-31 15:29:58                       NaN                          NaN   
2014-01-31 15:29:59                       NaN                          NaN   

When I use resample to downsample to a 1-minute timeframe, it creates new rows beyond 3:30 pm, filled with NaNs, until 9:15 am the next day. I have searched the forums for an answer to this, but with no success.

The code I use to resample:

df.resample('1T',label='right').last()

Incorrect output I'm getting:

                           Px_NIFTY 20140130 0.0 FF  Px_NIFTY 20140130 4500.0 CE  \
Time                                                                         
2014-01-14 05:36:00                       NaN                          NaN   
2014-01-14 05:37:00                       NaN                          NaN   
2014-01-14 05:38:00                       NaN                          NaN   
2014-01-14 05:39:00                       NaN                          NaN   
2014-01-14 05:40:00                       NaN                          NaN   
...                                       ...                          ...   
2014-01-18 17:51:00                       NaN                          NaN   
2014-01-18 17:52:00                       NaN                          NaN   
2014-01-18 17:53:00                       NaN                          NaN   
2014-01-18 17:54:00                       NaN                          NaN   
2014-01-18 17:55:00                       NaN                          NaN   

The data set only has entries from 9:15 am to 3:30 pm for each day.
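
A minimal reproduction of the behaviour described above, as a sketch on synthetic data. The column name `px` and the two sample dates are made up for illustration; only the session times (09:15:01 to 15:29:59) are taken from the question.

```python
from datetime import time

import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: one row per second over two days,
# then restricted to the trading session 09:15:01-15:29:59.
full = pd.date_range("2014-01-01 09:15:01", "2014-01-02 15:29:59",
                     freq="s", name="Time")
df = pd.DataFrame({"px": np.arange(len(full), dtype=float)}, index=full)
df = df.between_time(time(9, 15, 1), time(15, 29, 59))

# Plain resample builds one continuous minute grid from the first
# timestamp to the last, so the overnight gap becomes all-NaN rows.
plain = df.resample("min", label="right").last()

n_rows = len(plain)                      # minutes on the continuous grid
n_empty = int(plain["px"].isna().sum())  # minutes with no source data
print(n_rows, n_empty)
```

With this synthetic input the empty rows are exactly the minutes between 3:30 pm on the first day and 9:15 am on the second.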

Anurag A S
abs

1 Answer


Could you check if the following works for you:

# 'min' is the 1-minute alias; newer pandas versions deprecate 'T'
df = (df.groupby(df.index.date).resample('min', label='right').last()
        .reset_index(level=0, drop=True))

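Grouping by day (df.index.date) restricts each resample to the range of its own group, a single day. Without the grouping, resample builds one continuous minute grid from the first timestamp to the last, so it also creates the minutes between one day's close and the next day's open and fills them with NaN.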
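
To illustrate, here is a small self-contained sketch on synthetic data. The column name `px` and the two sample dates are assumptions made for the example, not taken from the real data set. It also shows a possible alternative, dropping all-NaN rows after a plain resample; note that this alternative would also remove any minute inside the session that happens to have no data at all.

```python
from datetime import time

import numpy as np
import pandas as pd

# Synthetic stand-in: one row per second, two days, session hours only.
full = pd.date_range("2014-01-01 09:15:01", "2014-01-02 15:29:59",
                     freq="s", name="Time")
df = pd.DataFrame({"px": np.arange(len(full), dtype=float)}, index=full)
df = df.between_time(time(9, 15, 1), time(15, 29, 59))

# Resample each calendar day separately, then drop the extra
# date level that groupby adds to the index.
per_day = (
    df.groupby(df.index.date)
      .resample("min", label="right")
      .last()
      .reset_index(level=0, drop=True)
)

# Alternative (assumption: fully empty minutes are not wanted anyway):
# resample everything, then discard rows where every column is NaN.
alt = df.resample("min", label="right").last().dropna(how="all")

print(per_day.index.min(), per_day.index.max(), len(per_day), len(alt))
```

On this input both approaches give one row per session minute, labelled 09:16 through 15:30 on each day, with no rows for the overnight gap.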

Timus