I am building a sql query with a large set of data but query is too slow
I've got 3 tables; movies
, movie_categories
, skipped_movies
The movies
table is normalized and I am trying to query a movie based on a category while excluding ids from skipped_movies
table.
However I am trying to use WHERE IN and WHERE NOT IN to in my query.
movies
table has approx. 2 million rows (id, name, score)
movie_categories
approx. 5 million (id, movie_id, category_id)
skipped_movies
has approx. 1k rows (id, movie_id, user_id)
When the skipped_movies
table is very small 10 - 20 rows the query is quite fast. (about 40 - 50 ms) but when the table gets somewhere around 1k of data I get somewhere around 7 to 8 seconds on the query.
This is the query I'm using.
SELECT SQL_NO_CACHE * FROM `movies` WHERE `id` IN (SELECT `movie_id` FROM `movie_categories` WHERE `category_id` = 1) AND `id` NOT IN (SELECT `movie_id` FROM `skipped_movies` WHERE `user_id` = 1) AND `score` <= 9 ORDER BY `score` DESC LIMIT 1;
I've tried many ways that came to mind but this was the fastest one. I even tried the EXISTS
method to no extent.
I'm using the SQL_NO_CACHE just for testing.
And I guess that the ORDER BY statement is running very slow.