-2

CSV file. df before resample and after applying:

df["dateandtime"] = (pd.to_datetime(df.pop("DATE").str.cat(df.pop("TIME"), sep=" ")))
df = df.set_index(pd.DatetimeIndex(df.pop("dateandtime")))
dateandtime Open High Low Close VOLUME
2020-03-11 00:00:00-04:00 2812.75 2813.25 2811.25 2811.25 296
2020-03-11 00:01:00-04:00 2811.25 2811.5 2809.25 2809.5 359
2020-03-11 00:02:00-04:00 2809.25 2810 2808.25 2809.5 189
2020-03-11 00:03:00-04:00 2809.5 2809.5 2806.5 2806.75 602
2020-03-11 00:04:00-04:00 2806.5 2809.75 2806.5 2809 299

How do I resample this 1 minute candlestick data into 15 minute data? I tried:

from dateutil.tz import gettz
import pandas as pd
import finplot as fplt


df = pd.read_csv('/home/user/Documents/finance/fin-smart/lab/ES.csv')
df.rename(columns={'OPEN': 'Open'}, inplace=True)
df.rename(columns={'HIGH': 'High'}, inplace=True)
df.rename(columns={'LOW': 'Low'}, inplace=True)
df.rename(columns={'CLOSE': 'Close'}, inplace=True)
df["dateandtime"] = (pd.to_datetime(df.pop("DATE").str.cat(df.pop("TIME"), sep=" ")))
df = df.set_index(pd.DatetimeIndex(df.pop("dateandtime")))

# first day is 2010.01.04
# last day is 2020.03.13
start_date = '2020.03.12'
end_date = '2020.03.13'
df = df.loc[start_date:end_date]

df.resample("15T").agg(
    Open=("Open", "first"),
    High=("High", "max"),
    Low=("Low", "min"),
    Close=("Close", "last"),
)

fplt.display_timezone = gettz('America/Chicago')
fplt.candlestick_ochl(df[['Open', 'Close', 'High', 'Low']])
fplt.show()

It gives the same output as without df.resample() and no error or warning. Is df in an incorrect format?

user4157124
  • 2,809
  • 13
  • 27
  • 42

2 Answers2

1

If you want the most convenient method that creates OHLC only from the Close column:

df.set_index("dateandtime")["Close"].resample("15T").ohlc()

A more detailed method that build OHLC from all 4 existing columns:

df.set_index("dateandtime").resample("15T").agg(
    Open=("Open", "first"),
    High=("High", "max"),
    Low=("Low", "min"),
    Close=("Close", "last"),
)
Code Different
  • 90,614
  • 16
  • 144
  • 163
0

The following code works for me (given your input file).

Not sure what exactly is wrong with your code, but it seems to me that you are doing some unnecessary stuff, and can simplify as follows:

df = pd.read_csv('ES.csv')

df.index = pd.DatetimeIndex(df['DATE']+' '+df['TIME'])
df.drop(['DATE','TIME'],axis=1,inplace=True)

df.rename(columns={'OPEN' : 'Open',
                   'HIGH' : 'High',
                   'LOW'  : 'Low',
                   'CLOSE': 'Close',
                   'VOLUME':'Volume'},inplace=True)

print(df.shape)
print(df.head())
print()
print(df.tail())

output from above prints:

(3535212, 5)
                        Open     High      Low    Close  Volume
2010-01-04 00:01:00  1113.75  1114.75  1113.25  1114.25    2185
2010-01-04 00:02:00  1114.25  1115.25  1114.25  1114.75     437
2010-01-04 00:03:00  1114.75  1114.75  1114.25  1114.50     596
2010-01-04 00:04:00  1114.25  1114.75  1114.25  1114.50     131
2010-01-04 00:05:00  1114.50  1114.50  1114.25  1114.25     166

                        Open     High     Low    Close  Volume
2020-04-07 10:46:00  2716.25  2716.50  2713.0  2714.75     504
2020-04-07 10:47:00  2714.75  2715.75  2713.0  2713.25     650
2020-04-07 10:48:00  2713.50  2713.50  2711.0  2711.25     611
2020-04-07 10:49:00  2711.00  2713.25  2709.5  2712.25     777
2020-04-07 10:50:00  2712.50  2714.50  2712.5  2714.00     297

now resample:

tdf = df.loc['2020.03.12':'2020.03.13']

ndf = tdf.resample('15T').agg(
    {'Open'  :'first',
     'High'  :'max',
     'Low'   :'min',
     'Close' :'last',
     'Volume':'sum'
    })

print(ndf.shape)
print(ndf.head())
print()
print(ndf.tail())


output from the above prints of the resampled dataframe:

(185, 5)
                        Open     High      Low    Close  Volume
2020-03-12 00:00:00  2755.00  2755.50  2739.25  2739.50     952
2020-03-12 00:15:00  2740.00  2742.00  2731.75  2735.75     730
2020-03-12 00:30:00  2735.75  2739.00  2727.50  2730.75     464
2020-03-12 00:45:00  2732.00  2736.75  2715.25  2723.75     935
2020-03-12 01:00:00  2722.75  2731.00  2716.75  2717.00    1166

                        Open     High      Low    Close  Volume
2020-03-13 21:00:00  2670.00  2697.25  2666.25  2687.50  267212
2020-03-13 21:15:00  2687.75  2688.75  2679.50  2685.25   13637
2020-03-13 21:30:00  2678.25  2678.75  2643.00  2665.75   19611
2020-03-13 21:45:00  2665.50  2668.25  2651.75  2655.75   12474
2020-03-13 22:00:00  2656.25  2658.00  2650.00  2652.75    2152
Daniel Goldfarb
  • 6,937
  • 5
  • 29
  • 61