Merge overlapping intervals in CSV

Question

I need to merge overlapping intervals in my CSV file. Here is my sample data:

Tasks	Start	End
Read	10:00	12:00
Read	11:10	13:00
Read	13:50	14:00
Write	14:00	14:30
Write	11:00	15:00

The expected output should look like this:

Read - [10:00,13:00], [13:50,14:00]

Write - [14:00,15:00]

I found a similar post that does the same thing with pandas, but pandas is not allowed here:

Efficient merge overlapping intervals in same pandas dataframe with start and finish columns

I am trying to use either a plain file reader or the csv reader to parse the file, sort the rows, and compare adjacent rows for overlapping intervals, but I am having trouble parsing the file. Here is what I have so far:

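import csv
from collections import defaultdict

# task name -> list of [start, end] pairs (avoids shadowing the built-in dict)
intervals = defaultdict(list)
with open('sample.csv', mode='r', newline='') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        name = row["Tasks"]
        start_time = row["Start"]
        end_time = row["End"]
        # append instead of assigning, so earlier rows of a task are kept
        intervals[name].append([start_time, end_time])

print(dict(intervals))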

Any help will be appreciated.

mozway · Answer 1 · 2022-08-09T05:24:20.360

0

Assuming the times are sorted, you can compute custom groups and aggregate:

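import pandas as pd

# df is assumed to hold the question's data with columns Tasks, Start, End

# get next start per task
next_start = pd.to_timedelta(df['Start']+':00').groupby(df['Tasks']).shift(-1)
# get current end
end = pd.to_timedelta(df['End']+':00')

# a next start ≥ current end (per task) marks a gap; the cumulative sum
# of those flags gives the group ids to aggregate on
out = (df.groupby(['Tasks', next_start.ge(end).cumsum()], as_index=False)
         .agg({'Start': 'first', 'End': 'last'})
       )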

output:

   Tasks  Start    End
0   Read  10:00  12:00
1   Read  11:10  14:00
2  Write  14:00  15:00

NB. I used timedelta objects for the comparison. Since zero-padded HH:MM strings sort chronologically, you could also compare the strings directly:

next_start = df['Start'].groupby(df['Tasks']).shift(-1, fill_value='00:00')
end = df['End']
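
The string variant relies on zero-padded HH:MM strings sorting in the same order as the times they represent. A minimal check of that assumption in plain Python, where `to_timedelta` and the `times` list are hypothetical names made up for this illustration:

```python
from datetime import timedelta


def to_timedelta(hhmm):
    """Convert a zero-padded 'HH:MM' string to a timedelta."""
    hours, minutes = map(int, hhmm.split(":"))
    return timedelta(hours=hours, minutes=minutes)


# Illustrative values only, not taken from a real file.
times = ["13:50", "10:00", "14:00", "11:10", "09:05"]

# Sorting the raw strings gives the same order as sorting by duration.
same_order = sorted(times) == sorted(times, key=to_timedelta)
print(same_order)

# Without zero padding the string comparison no longer matches the time order.
print("9:05" < "10:00")
```

If the file may contain unpadded values such as `9:05`, converting to timedelta first is the safer choice.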

edited Aug 09 '22 at 05:24

answered Aug 09 '22 at 05:18

But I can't use pandas here. I need a native Python approach – Rishabh Aug 09 '22 at 05:23
Then why the pandas tag? Is this an assignment? – mozway Aug 09 '22 at 05:24
My bad, removed that – Rishabh Aug 09 '22 at 05:25
It's tied to an environment where pandas is not allowed; I mentioned that in the question – Rishabh Aug 09 '22 at 05:33
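
Since pandas is ruled out, here is a rough sketch of one way to do this with only the standard library. It is not the approach from the answer above. It assumes the file is comma-separated with the headers Tasks, Start, and End, that times are zero-padded HH:MM strings, and that intervals which merely touch should also be merged. The `SAMPLE` string stands in for sample.csv, and the function names are made up for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for sample.csv; assumes comma-separated columns.
SAMPLE = """Tasks,Start,End
Read,10:00,12:00
Read,11:10,13:00
Read,13:50,14:00
Write,14:00,14:30
Write,11:00,15:00
"""


def read_intervals(csv_file):
    """Group (start, end) pairs by task name."""
    by_task = defaultdict(list)
    for row in csv.DictReader(csv_file):
        by_task[row["Tasks"]].append((row["Start"], row["End"]))
    return by_task


def merge_intervals(intervals):
    """Merge overlapping (start, end) pairs of zero-padded HH:MM strings."""
    merged = []
    # Sorting by start means each interval only needs to be compared
    # with the last merged one.
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend its end.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def merge_by_task(csv_file):
    """Return {task: merged intervals} for an open CSV file object."""
    return {task: merge_intervals(pairs)
            for task, pairs in read_intervals(csv_file).items()}


result = merge_by_task(io.StringIO(SAMPLE))
for task, merged in result.items():
    print(task, "-", ", ".join(f"[{s},{e}]" for s, e in merged))
```

For a file on disk, pass `open('sample.csv', newline='')` instead of the `io.StringIO` object. If the columns are tab-separated, pass `delimiter='\t'` to `csv.DictReader`. Because the rows are sorted by start time per task, Write comes out as [11:00,15:00], which differs from the [14:00,15:00] shown in the question's sample output.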
