I have the following query:
select *
from
`twitter_posts`
where
`main_handle_id` in (
select
`twitter`.`main_handle_id`
from
`users`
inner join `twitter` on `twitter`.`user_id` = `user`.`id`
where
`users` LIKE 'foo'
)
order by created_at
This query runs dramatically slow when the order by is used. The twitter_posts
table has indexes on the created_at
timestamp and on the main_handle_id
column.
I used "explain" to see what the engine was planning on doing and noticed a filesort... which I was not expecting to see as an index exists on the twitter_posts
table. After skimming through the various posts covering this on Stackoverflow and blogs, none of the examples worked for me. I had a hunch, that maybe, the sql optimizer got confused somehow with the nested select. So I wrapped the query and ordered the result of that by created_at
:
select * (select *
from
`twitter_posts`
where
`main_handle_id` in (
select
`twitter`.`main_handle_id`
from
`users`
inner join `twitter` on `twitter`.`user_id` = `user`.`id`
where
`users` LIKE 'foo'
)
)
order by created_at
Using explain on this one, DOES use the index to sort and returns the response in a fraction of what it takes with filesort. So I've "solved" my problem, but I don't understand why SQL would make a different choice based on the solution. As far as I can see, I'm just wrapping the result and ask everything of that result in a seemingly redundant statement...
Edit: I'm still unaware of why the optimizer started using the index on the created_at when using a redundant subquery, but in some cases it wasn't faster. I've now gone with the solution to add the statement "FORCE INDEX FOR ORDER BY created_at_index_name".