1

I have the following tables:

  • ROUTE (id, train_number, station_id, arrival_time, departure_time)
  • SERVICE (train_number, train_name)
  • STATION (id, name, city)

A row in ROUTE table will contain the train number, it's arrival and departure time (or one of these two if the station is the first/last one) and a station_id.

I would like to find out for each train_number, which paths it goes. My difficulty is in figuring out the arrival and departure station name for each route.

For example train 1337 might have the following path:

  • stationA -> stationB:
    • route(1, 1337, stationA, null, 1800)
    • route(2, 1337, stationB, 1900, 2000)
  • stationB -> stationC:
    • route(3, 1337, stationC, 2020, 2100)
  • stationC -> stationD:
    • route(4, 1337, stationD, 2120, 2200)

I want to figure out these paths, i.e stationA -> stationB -> stationC etc.

How is it achievable (I'm using Oracle SQL if that matters)?

for stationA (getting the start of route) I could do the following:

SELECT name FROM STATION
WHERE station.id = ROUTE.station_id AND ROUTE.arrival IS NULL
Politank-Z
  • 3,653
  • 3
  • 24
  • 28
Nima
  • 6,383
  • 7
  • 46
  • 68
  • You have a similar problem with the one in the featured section. http://stackoverflow.com/questions/29583040/how-to-write-a-select-query-or-server-side-function-that-will-generate-a-neat-ti – Luigi Apr 17 '15 at 20:24
  • That can't be good :p – Nima Apr 17 '15 at 20:27

3 Answers3

5

Setup:

CREATE TABLE STATION (
  id      NUMBER PRIMARY KEY,
  name    VARCHAR2(20),
  city    VARCHAR2(20)
);

CREATE TABLE SERVICE (
  train_number  NUMBER PRIMARY KEY,
  train_name    VARCHAR2(20)
);

CREATE TABLE ROUTE (
  id              NUMBER PRIMARY KEY,
  train_number    NUMBER REFERENCES SERVICE( train_number ),
  station_id      NUMBER REFERENCES STATION( id ),
  arrival_time    NUMBER,
  departure_time  NUMBER
);

INSERT INTO STATION VALUES ( 1, 'stationA', 'city1' );
INSERT INTO STATION VALUES ( 2, 'stationB', 'city1' );
INSERT INTO STATION VALUES ( 3, 'stationC', 'city2' );
INSERT INTO STATION VALUES ( 4, 'stationD', 'city3' );

INSERT INTO SERVICE VALUES ( 1337, 'train1' );
INSERT INTO SERVICE VALUES ( 1338, 'train2' );
INSERT INTO SERVICE VALUES ( 1339, 'train3' );

INSERT INTO ROUTE VALUES (1, 1337, 1, null, 1800);
INSERT INTO ROUTE VALUES (2, 1337, 2, 1900, 2000);
INSERT INTO ROUTE VALUES (3, 1337, 3, 2020, 2100);
INSERT INTO ROUTE VALUES (4, 1337, 4, 2120, 2200);
INSERT INTO ROUTE VALUES (5, 1338, 1, null, 1800);
INSERT INTO ROUTE VALUES (6, 1338, 4, 1900, 2000);
INSERT INTO ROUTE VALUES (7, 1338, 3, 2020, 2100);
INSERT INTO ROUTE VALUES (8, 1338, 2, 2120, 2200);

Query:

WITH INDEXED_ROUTES AS (
  SELECT train_number,
         s.name,
         ROW_NUMBER() OVER( PARTITION BY train_number ORDER BY COALESCE( arrival_time, 0 ), COALESCE( departure_time, 2400 ) ) AS IDX
  FROM   ROUTE r
          INNER JOIN
          STATION s
          ON ( r.station_id = s.id )
)
SELECT  train_number,
        SUBSTR( SYS_CONNECT_BY_PATH( NAME, ' -> ' ), 5 ) AS route
FROM    indexed_routes
WHERE CONNECT_BY_ISLEAF = 1
START WITH IDX = 1
CONNECT BY PRIOR IDX + 1 = IDX
AND        PRIOR train_number = train_number;

or alternatively:

SELECT train_number,
       LISTAGG( s.name, ' -> ' ) WITHIN GROUP ( ORDER BY arrival_time ASC NULLS FIRST ) AS route
FROM   ROUTE r
        INNER JOIN
        STATION s
        ON ( r.station_id = s.id )
GROUP BY train_number;

Either will output:

TRAIN_NUMBER ROUTE
------------ --------------------------------------------
        1337 stationA -> stationB -> stationC -> stationD  
        1338 stationA -> stationD -> stationC -> stationB

If you just want pairs of stations then use the LAG or LEAD analytic functions:

SELECT train_number,
       LAG( s.name ) OVER( PARTITION BY train_number ORDER BY COALESCE( arrival_time, 0 ), COALESCE( departure_time, 2400 ) ) AS prev_station_name,
       s.name AS station_name
FROM   ROUTE r
        INNER JOIN
        STATION s
        ON ( r.station_id = s.id );

Output:

TRAIN_NUMBER PREV_STATION_NAME    STATION_NAME       
------------ -------------------- --------------------
        1337 (null)               stationA             
        1337 stationA             stationB             
        1337 stationB             stationC             
        1337 stationC             stationD             
        1338 (null)               stationA             
        1338 stationA             stationD             
        1338 stationD             stationC             
        1338 stationC             stationB      
MT0
  • 143,790
  • 11
  • 59
  • 117
1

If there are no branches in the paths then the problem is simply to order the stations by arrival or departure time:

SELECT s.station_id, name FROM ROUTE r INNER JOIN STATION s r.station_id = s.station_id ON WHERE train_number = 1 ORDER BY departure_time
user15741
  • 1,392
  • 16
  • 27
  • `NVL ( departure_time, /* maximum time */ )` if you want to cover the last station? – Politank-Z Apr 17 '15 at 20:36
  • This might work, but how would I put them in pairs? given `stationA`, how to find only `stationB` if `stationA->stationB->StationC->...`? – Nima Apr 17 '15 at 20:49
1

In MySQL you could use group_concat ordered by the arrival time, to get the path for each train in a comma separated list.

E.g.:

select train_number, group_concat(name order by arrival_time asc)
from route join station
  on route.station_id=station.id
group by train_number
order by train_number;

The output would look like:

1337    stationA,stationB,stationC,stationD

In Oracle (from 11g release 2) the following query would produce the same output:

select train_number,  LISTAGG(name, ',') WITHIN GROUP (ORDER BY arrival_time ASC NULLS FIRST)
from route join station
  on route.station_id=station.id
group by train_number
order by train_number;
lp_
  • 1,158
  • 1
  • 14
  • 21