I'm massaging some traffic data. Most files are small, but some run over 100Mb.
I want just to get the stops for each route. To do that (grind teeth), I have to compare all the major files as bits of data are spread throughout.
The query works but by the time the processing is finished, the customer has missed most busses.
This is the query, and any help is appreciated:
SET SQL_BIG_SELECTS=1;
SELECT DISTINCT
stops.stop_name,
trips.route_id,
stops.stop_id,
trips.direction_id,
stop_times.stop_sequence,
routes.route_long_name
FROM stop_times
LEFT JOIN stops USING (stop_id)
LEFT JOIN trips USING (trip_id)
LEFT JOIN routes USING (route_id)
WHERE routes.route_id = 'xxx'
ORDER BY trips.direction_id, stop_times.stop_sequence ASC
As requested, these are the relevant table definitions:
routes
route_id routename
stops
stop_id stopname
trips
route_id trip_id
times
stop_id trip_id
These are the fields that link to each other... There are text fields in each table of course.