1

Introduction

(DB-fiddle here: https://dbfiddle.uk/o5gsnxpt)

Say you manage busses and that you have two tables, bus_cancellations and bus_timetables:

The table bus_cancellations contains all cancellations of bus journeys (only one shown here) where each row represents the part of the timetable that has been cancelled (e.g., the whole timetable = bus is completely cancelled that day | partial cancellation = the bus will drive between certain stops only). Furthermore, the time of when the bus journey was cancelled is captured in the column cancellation_time.

bus_id driving_day cancellation_type stop_from stop_to cancellation_time
1 2023-05-01 Partial Start Stop 1 Stop 3 2023-05-01T12:21:54

bus_timetables: A table containing all the timetable versions that have ever been present for all busses on all days. A bus can have multiple timetables through time as planning might change timetables well before the day that the bus will drive and changes can happen to the timetable of a bus after the bus should have departed (had it been cancelled or not). The time of when the timetable was in effect is captured in the column timetable_creation_time.

bus_id driving_day timetable_version stop timetable_stop_entry_position arrive_time depart_time timetable_creation_time
1 2023-05-01 1 Stop 1 0 NULL 13:42 2023-03-14T13:43:21
1 2023-05-01 1 Stop 2 1 13:54 13:56 2023-03-14T13:43:21
1 2023-05-01 1 Stop 3 2 14:04 14:05 2023-03-14T13:43:21
1 2023-05-01 1 Stop 4 3 14:13 NULL 2023-03-14T13:43:21
1 2023-05-01 2 Stop 1 0 NULL 13:56 2023-05-01T04:42:32
1 2023-05-01 2 Stop 2 1 14:08 14:10 2023-05-01T04:42:32
1 2023-05-01 2 Stop 3 2 14:18 14:19 2023-05-01T04:42:32
1 2023-05-01 2 Stop 4 3 14:27 NULL 2023-05-01T04:42:32
1 2023-05-01 3 Stop 1 0 NULL 14:01 2023-05-02T12:43:27
1 2023-05-01 3 Stop 2 1 14:13 14:15 2023-05-02T12:43:27
1 2023-05-01 3 Stop 3 2 14:23 14:24 2023-05-02T12:43:27
1 2023-05-01 3 Stop 4 3 14:34 NULL 2023-05-02T12:43:27

Objective

  1. In the bus_timetables table, identify the timetable that was in effect at the time of the bus getting cancelled (bus_cancellations.cancellation_time and bus_timetables.timetable_creation_time). As far as I can reason, this would be the most recent timetable among the set of timetables that have a creation time on or before the bus cancellation time.

  2. Expand the cancelled part of the timetable (the part of the timetable between stop_from and stop_to in bus_cancellations) into a row per relevant entry in the bus_timetables table (so as to get the correct stops information on a stop-per-row level.

Using above example tables, this would result in a table like the following:

bus_id driving_day cancellation_type stop_from stop_to cancellation_time stop arrive_time depart_time
1 2023-05-01 Partial Start Stop 1 Stop 3 2023-05-01T12:21:54 Stop 1 NULL 13:56
1 2023-05-01 Partial Start Stop 1 Stop 3 2023-05-01T12:21:54 Stop 2 14:08 14:10
1 2023-05-01 Partial Start Stop 1 Stop 3 2023-05-01T12:21:54 Stop 3 14:18 14:19

Progress so far

For objective 1, in order to retrieve the correct rows (those that are from the timetable that was in effect at the time of the cancellation), I've tried working with lateral join, but so far I haven't been able to get any query working.

For objective 2, I've understood that one way of expanding the stops between the stop_from and stop_to values in bus_cancellations would be to use a correlated subquery (unsure if this is also a suboptimal solution to this part of the query):

SELECT
  c.bus_id,
  c.driving_day,
  stop_from,
  stop_to,
  stop,
  timetable_stop_entry_position,
  t.timetable_version,
  t.timetable_creation_time
FROM bus_cancellations AS c
LEFT JOIN bus_timetables AS t
ON
  c.bus_id = t.bus_id
  AND c.driving_day = t.driving_day
WHERE
  timetable_stop_entry_position BETWEEN (SELECT timetable_stop_entry_position FROM bus_timetables WHERE stop = c.stop_from LIMIT 1)
  AND (SELECT timetable_stop_entry_position FROM bus_timetables WHERE stop = c.stop_to LIMIT 1);

So far, I've been unable to figure out a complete (and performant) solution that solves both objectives.

  • 1
    It sounds like you need a scalar lookup (subquery, as you described to find the relevant version number. After that I think it's just a simple inner join on the appropriate argument including a check to verify that the stop falls within the range. Be careful with an alphabetic comparison here. – shawnt00 Jul 29 '23 at 23:10

0 Answers0