0

I have a CSV file attached below, I need to split the time for 1 day into 4 intervals, i.e: 9:15-10:00, 10:01-12:00, 12:01-14:30 & 14:31-15:30 and get the values of 'Open' as the first value in every interval, 'High' as the max value, 'low' as the lowest, 'close' as the last value of the interval and 'volume' as the sum of all volume values in an interval. This same procedure is to be done for everyday. I have tried using group by function for specific intervals but it gives the values for whole csv file. I have also tried resampling but it doesn't work for me. Here is the code which I tried.

enter image description here

ohlc_dict = {
        'Open':'first',
        'High':'max',
        'Low':'min',
        'Close':'last',
        'Volume':'sum'
        }

df = df.resample('60min').agg(ohlc_dict)

And one more:

df = df.groupby(pd.Grouper(freq='60Min',closed='right',label='right')).agg({
                                          "Open":  "first",
                                          "High":  "max",
                                          "Low":   "min",
                                         "Close": "last",
                                          "Volume": "sum"
})
Henry Ecker
  • 34,399
  • 18
  • 41
  • 57

2 Answers2

1

You need to combine the 'Date' and 'Time' column, then convert them to datetime objects before grouping

df["Date_Time"] = df['Date'].astype(str) + df['Time'].astype(str)

df['Date_Time'] = pd.to_datetime(df['Date_Time'], format='%Y%m%d%H%M')

df = df.groupby(pd.Grouper(key='Date_Time', freq='60Min',closed='right',label='right')).agg({
                                          "Open":  "first",
                                          "High":  "max",
                                          "Low":   "min",
                                         "Close": "last",
                                          "Volume": "sum"
})
darth baba
  • 1,277
  • 5
  • 13
  • Hi, I have already done with the date_time conversion and the groupby method is giving me the data for 60Min duration throughout CSV which I don't require, I want to group the data for a particular date and for a particular time. example: For 01-01-2015 I require the data only from 9.15-10.00 am. – user16269182 Jun 20 '21 at 08:22
  • Use the DataFrame.groupby with a list of pandas.Grouper something like this df.groupby([ pd.Grouper('Date', freq='D'), pd.Grouper('Time', freq='60Min') ]) – darth baba Jun 20 '21 at 15:52
0

Solution code for the problem statement This code worked fine for the problem statement above. Here column 'combined' is the datetime column.Solved Code File