4

I have two MySQL tables, Post, and Account.

I use NodeJS for my backend API, and it generates the following SQL statement:

select `p`.`id` as `post_id`
from `Post` as `p` 
left join `Account` as `a` 
    on `a`.`id` = `p`.`author` 
where `p`.`id` in ('9', '10', '76', '77', 123) 
order by `p`.`id` asc

the code works, the problem it's SLOW! I tried to use EXPLAIN to check what happened, it doesn't use INDEX:

explaing0

But, then I notice that the IN clause is mixed with number and string, so I change 123 to '123' and EXPLAIN again, it uses INDEX now:

explaing1

So, my question is what happens when numbers and varchars are mixed inside an IN?

Thanks in advance

Lajos Arpad
  • 64,414
  • 37
  • 100
  • 175
rickytan
  • 110
  • 6
  • 1
    Ricky, is the performance of the query improved if you use (9, 10, 76, 77, 123)? How does performance change if you use ('9', '10', '76', '77', '123')? – Lajos Arpad Nov 14 '15 at 08:51
  • 1
    No pictures thanks. Just paste the text directly – Strawberry Nov 14 '15 at 09:41
  • @MuhammadMuazzam, that thread is useful and related, but does not answer the question of the op, since it deals with the fact that the results might be inconsistent since different comparison operators are used (MySQL bug if you ask me), however, that does not explain the performance issue here. It is probable that varchar comparisons are done, but I do not know that for sure and the other question does not imply it. – Lajos Arpad Nov 14 '15 at 10:36
  • Data type for p.id column? – jarlh Nov 14 '15 at 11:33
  • Performance issue get resolved at above when same data type will be used – Muhammad Muazzam Nov 14 '15 at 11:52
  • use numbers is faster than strings @LajosArpad – rickytan Nov 16 '15 at 01:51
  • sorry for that, I'm using a GUI software, so I can't copy the result as text @Strawberry – rickytan Nov 16 '15 at 01:52

2 Answers2

2

If you take a look at the MySQL-Documentation How MySQL uses indexes you will note the following statement:

Comparison of dissimilar columns (comparing a string column to a temporal or numeric column, for example) may prevent use of indexes if values cannot be compared directly without conversion. For a given value such as 1 in the numeric column, it might compare equal to any number of values in the string column such as '1', ' 1', '00001', or '01.e1'. This rules out use of any indexes for the string column.

So if you mix up data types in your IN-Statement it coould prevent the usage of an index.

MikeVe
  • 1,062
  • 8
  • 13
0

This query:

select `p`.`id` as `post_id`
from `Post` as `p` left join
     `Account` as `a` 
     on `a`.`id` = `p`.`author` 
where `p`.`id` in ('9', '10', '76', '77', 123) 
order by `p`.`id` asc;

Should take advantage of an index on Post(id) for the where clause. It might or might not be able to use the index for the order by.

Two reasons come to mind as to why an index on Post(id) would not be used. The first is type conversion. The in list consists of strings. My guess is that id is an integer. Sometimes, type conversions can preclude the use of an index. You can check if this is the case by changing the logic to:

where p.id in (9, 10, 76, 77, 123) 

The second reason is simply that the tables are so small. MySQL does take cardinality into account when choosing the execution plan. For small tables, a full-table scan can be more efficient than an index. In this case, MySQL should always use a covering index. If you want to test this possibility, create an index on Posts(id, author). This covers the query and should be preferred by the optimizer.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786