0

My data: Data on a ticker at 3 min intervals. date/open/high/low/close

Goal: Resample to daily

I was able to figure out how to resample to daily using

df_spy_D= df_spy_3min.resample('D').agg({'open':'first',
                                         'high':'max',
                                         'low':'min',
                                         'close':'last'})

This worked by giving me this:

date                                                     
2018-12-21 00:00:00+00:00  241.10  241.10  240.18  240.30
2018-12-22 00:00:00+00:00  240.30  240.70  240.26  240.35
2018-12-23 00:00:00+00:00     NaN     NaN     NaN     NaN
2018-12-24 00:00:00+00:00  241.70  242.45  232.61  233.08
2018-12-25 00:00:00+00:00     NaN     NaN     NaN     NaN
2018-12-26 00:00:00+00:00  234.05  246.90  231.99  245.50
2018-12-27 00:00:00+00:00  245.50  248.97  238.96  247.57
2018-12-28 00:00:00+00:00  247.55  251.40  246.45  247.55
2018-12-29 00:00:00+00:00  247.52  247.66  247.52  247.57
2018-12-30 00:00:00+00:00     NaN     NaN     NaN     NaN
2018-12-31 00:00:00+00:00  249.62  250.48  247.47  250.00
2019-01-01 00:00:00+00:00  250.00  250.10  250.00  250.09
2019-01-02 00:00:00+00:00  248.00  251.21  244.70  246.65
2019-01-03 00:00:00+00:00  246.65  248.57  243.67  244.21
2019-01-04 00:00:00+00:00  244.21  253.11  243.19  252.56

Issues:

  1. The days market did not trade are NaN. I can definitely remove these
  2. There are some days where there is only partial date, I assume it might have to do with the fact that this data was pulled off an exchange where they estimate overnight prices.

The data I have is for all trade times which includes various market hours, like premarket regular US hours, aftermarket, etc

  1. So do I need to define the open and close time for the day to re-sample only for the US Session? If so, how?

  2. For days that don't have enough data, let's say count is

tigertiger
  • 73
  • 1
  • 9
  • You dindn't finish number 4. SPY is one of the most active ETF on the market and trade from 4am - 8pm. But as far as I know, only market hours (9:30am - 4pm) counts toward open/high/low/close. Just filter before you resample – Code Different Jun 12 '20 at 01:32
  • I just got filters working properly on another data-set, so yeah I will be doing that here. Yes, OHLC is Main US Session only which is 7:30am to 2pm UTC-6. It does trade outside it. It would be faster to convert to resample the data first, store it and then work on it? Eventually I will be doing statistical analysis – tigertiger Jun 12 '20 at 03:30

0 Answers0