I am working with a huge data frame, which I read in chunks and want to split by hour of the day:

reader = pd.read_csv("D:/...path.../test.csv", names=["id_easy","ordinal", "latitude", "longitude","epoch",'weekday'], 
                 parse_dates=['epoch'], chunksize=n_rows, error_bad_lines=False)

day_names = (('0:00', '1:00'),('1:00', '2:00'),('2:00', '3:00'),('3:00', '4:00'),('4:00', '5:00'),('5:00', '6:00'),
             ('6:00', '7:00'),('7:00', '8:00'),('8:00', '9:00'),('9:00', '10:00'),('10:00', '11:00'),('11:00', '12:00'),
             ('12:00', '13:00'),('13:00', '14:00'),('14:00', '15:00'),('15:00', '16:00'),('16:00', '17:00'),('17:00', '18:00'),
             ('18:00', '19:00'),('19:00', '20:00'),('20:00', '21:00'),('21:00', '22:00'),('22:00', '23:00'),('23:00', '00:00'))

for df in reader: 
    if not df.empty: 
        df['epoch'] = pd.to_datetime(df.epoch,unit = 's')
        df.index = pd.to_datetime(df.epoch)
        for day in day_names: 
            day_df = df.between_time[day] # ERROR IS HERE
            if not day_df.empty:
                day_df.to_csv(f'{day}.csv', index=False, header=False, mode='a')

TypeError: 'method' object is not subscriptable


The desired output is 24 .csv files, one per hour, named like: final1, final2, ..., final24


Sample data:

e35f652a    68  11.9125 3.7432  1465084811  Sunday
e35f652a    69  11.8992 3.7412  1465084870  Sunday
e35f652a    70  11.8866 3.7342  1465084930  Sunday
e35f652a    71  11.8755 3.7321  1465084990  Sunday
e35f652a    72  11.8675 3.7247  1465085050  Sunday

Somehow this question is more or less similar

Mamed

1 Answer

Change the [] to (), because DataFrame.between_time() is a method and has to be called, not indexed. Then pass the first and second values of each tuple by indexing:

for day in day_names: 
    day_df = df.between_time(day[0], day[1])

Or change the loop to unpack the tuples:

for s, e in day_names: 
    day_df = df.between_time(s, e)
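
For completeness, here is a minimal sketch of how the corrected loop could feed the "final1 ... final24" naming asked for in the question. The helper names split_by_hour and append_parts and the "final" prefix are my own assumptions, not part of the original code, and the small frame below just reuses the sample rows from the question (epoch taken as seconds since 1970-01-01 UTC):

```python
import pandas as pd

# Hourly (start, end) pairs, same shape as day_names in the question.
hours = [(f"{h}:00", f"{(h + 1) % 24}:00") for h in range(24)]


def split_by_hour(df, hours):
    """Return {slot_number: rows whose index time falls in that slot}.

    Hypothetical helper; df must have a DatetimeIndex.
    """
    parts = {}
    for i, (s, e) in enumerate(hours, start=1):
        part = df.between_time(s, e)  # called with (), not indexed with []
        if not part.empty:
            parts[i] = part
    return parts


def append_parts(parts, prefix="final"):
    """Append each non-empty slot to <prefix><slot_number>.csv (assumed naming)."""
    for i, part in parts.items():
        part.to_csv(f"{prefix}{i}.csv", index=False, header=False, mode="a")


# Sample rows from the question.
df = pd.DataFrame({
    "id_easy": ["e35f652a"] * 5,
    "ordinal": [68, 69, 70, 71, 72],
    "epoch": [1465084811, 1465084870, 1465084930, 1465084990, 1465085050],
})
df.index = pd.to_datetime(df["epoch"], unit="s")

parts = split_by_hour(df, hours)
```

Inside the chunk loop you would then call append_parts(split_by_hour(df, hours)) for every chunk. One caveat: by default between_time includes both end points, so a row stamped exactly on the hour ends up in two neighbouring files. Newer pandas releases accept an inclusive argument (e.g. inclusive="left") if that matters for your data.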
jezrael