I am working with GTFS data on Android (SQlite). And I would like to improve performance when I do select queries in my database filled with GTFS data.
The query below select the stop times associated to a route at a stop:
The first sub query gets the daily stop times on thursday. The second gets all the exception stop times which are not valid for TODAY (2013-07-25). The third one gets all the exception stop time which are only valid for TODAY (2013-07-25). Then I remove the non-valid one and add the valid one to the first sub query.
select distinct stop_times_arrival_time
from stop_times, trips, calendar
where stop_times_trip_id=trip_id
and calendar_service_id=trip_service_id
and trip_route_id='11821949021891616'
and stop_times_stop_id='3377699721872252'
and calendar_start_date<='20130725'
and calendar_end_date>='20130725'
and calendar_thursday=1
and stop_times_arrival_time>='07:40'
except
select stop_times_arrival_time
from stop_times, trips, calendar, calendar_dates
where stop_times_trip_id=trip_id
and calendar_service_id=trip_service_id
and calendar_dates_service_id = trip_service_id
and trip_route_id='11821949021891694'
and stop_times_stop_id='3377699720880977'
and calendar_thursday=1
and calendar_dates_exception_type=2
and stop_times_arrival_time > '07:40'
and calendar_dates_date = 20130725
union
select stop_times_arrival_time
from stop_times, trips, calendar, calendar_dates
where stop_times_trip_id=trip_id
and calendar_service_id=trip_service_id
and calendar_dates_service_id = trip_service_id
and trip_route_id='11821949021891694'
and stop_times_stop_id='3377699720880977'
and calendar_thursday=1
and calendar_dates_exception_type=1
and stop_times_arrival_time > '07:40'
and calendar_dates_date = 20130725;
It took about 15 seconds to compute (which is very long). I am sure there is a better to do this query since I do 3 different queries (almost the same by the way) which take time.
Any idea how to improve it?
EDIT: Here is the schema:
table|calendar|calendar|2|CREATE TABLE calendar (
calendar_service_id TEXT PRIMARY KEY,
calendar_monday INTEGER,
calendar_tuesday INTEGER,
calendar_wednesday INTEGER,
calendar_thursday INTEGER,
calendar_friday INTEGER,
calendar_saturday INTEGER,
calendar_sunday INTEGER,
calendar_start_date TEXT,
calendar_end_date TEXT
)
index|sqlite_autoindex_calendar_1|calendar|3|
table|calendar_dates|calendar_dates|4|CREATE TABLE calendar_dates (
calendar_dates_service_id TEXT,
calendar_dates_date TEXT,
calendar_dates_exception_type INTEGER
)
table|routes|routes|8|CREATE TABLE routes (
route_id TEXT PRIMARY KEY,
route_short_name TEXT,
route_long_name TEXT,
route_type INTEGER,
route_color TEXT
)
index|sqlite_autoindex_routes_1|routes|9|
table|stop_times|stop_times|12|CREATE TABLE stop_times (
stop_times_trip_id TEXT,
stop_times_stop_id TEXT,
stop_times_stop_sequence INTEGER,
stop_times_arrival_time TEXT,
stop_times_pickup_type INTEGER
)
table|stops|stops|13|CREATE TABLE stops (
stop_id TEXT PRIMARY KEY,
stop_name TEXT,
stop_lat REAL,
stop_lon REAL
)
index|sqlite_autoindex_stops_1|stops|14|
table|trips|trips|15|CREATE TABLE trips (
trip_id TEXT PRIMARY KEY,
trip_service_id TEXT,
trip_route_id TEXT,
trip_headsign TEXT,
trip_direction_id INTEGER,
trip_shape_id TEXT
)
index|sqlite_autoindex_trips_1|trips|16|
And here is the query plan:
2|0|0|SCAN TABLE stop_times (~33333 rows)
2|1|1|SEARCH TABLE trips USING INDEX sqlite_autoindex_trips_1 (trip_id=?) (~1 rows)
2|2|2|SEARCH TABLE calendar USING INDEX sqlite_autoindex_calendar_1 (calendar_service_id=?) (~1 rows)
3|0|3|SCAN TABLE calendar_dates (~10000 rows)
3|1|2|SEARCH TABLE calendar USING INDEX sqlite_autoindex_calendar_1 (calendar_service_id=?) (~1 rows)
3|2|0|SEARCH TABLE stop_times USING AUTOMATIC COVERING INDEX (stop_times_stop_id=?) (~7 rows)
3|3|1|SEARCH TABLE trips USING INDEX sqlite_autoindex_trips_1 (trip_id=?) (~1 rows)
1|0|0|COMPOUND SUBQUERIES 2 AND 3 USING TEMP B-TREE (EXCEPT)
4|0|3|SCAN TABLE calendar_dates (~10000 rows)
4|1|2|SEARCH TABLE calendar USING INDEX sqlite_autoindex_calendar_1 (calendar_service_id=?) (~1 rows)
4|2|0|SEARCH TABLE stop_times USING AUTOMATIC COVERING INDEX (stop_times_stop_id=?) (~7 rows)
4|3|1|SEARCH TABLE trips USING INDEX sqlite_autoindex_trips_1 (trip_id=?) (~1 rows)
0|0|0|COMPOUND SUBQUERIES 1 AND 4 USING TEMP B-TREE (UNION)