Is there any difference (index selection? speed?) in find_in_set vs or

Question

Is there any difference (index selection? speed?) between:

select * from table where x = 'a' or x = 'b'

vs

select * from table where find_in_set(x, 'a,b')

Should I use one of those or are they equal?

Try it? Look at the explain? Personally I would guess that the `OR` is faster since it's a standard operation and doesn't involve a function, but I rarely understand the reasoning of the MySQL query planner. — Wolph, Sep 16 '13 at 10:03
`but I rarely understand the reasoning of the MySQL query planner`- that is why I am asking the pros! — Markus, Sep 16 '13 at 10:24

score 1 · Answer 1 · answered Jun 10 '14 at 08:46

1

FIND_IN_SET is a function call which should skip the index altogether.

You should consider using

SELECT * FROM `table` WHERE x IN ('a', 'b')

instead.

answered Jun 10 '14 at 08:46

Matteo Tassinari

18,121
8
60
81

IN's downside is it doesn't play well with prepared statements. Select * From table Where id IN(?) won't work. Select * From table Where FIND_IN_SET(id,?) will. – user2782001 Mar 08 '16 at 03:22
True! But you could possibly write an extension of the library to handle that case too. – Matteo Tassinari Mar 08 '16 at 22:13

Raffael Meier · Answer 2 · 2016-10-18T18:50:07.080

you should avoid FIND_IN_SET when ever possible! In most cases it is used when columns are not normalized (e.g. csv in a cell). IN is very much faster since it's able to use indexes (if index exists). a crazy example is using FIND_IN_SET(primary_index_column, "1,2,3") using the primary key as first argument. with primary_index_column IN (1,2,3) MySQL is using the primary key as index, which is lightning fast - versus a string function which is very slow. the difference is sometimes 1000x.

e.g. this query

SELECT DISTINCT 
a1.usi as usishow,
m1.m1g1
FROM a1 LEFT JOIN p1 ON (a1.usi = p1.id)

left join m1 on (m1.m1id = p1.emxid) WHERE a1.umi=99999 AND p1.adi = 0 AND p1.id IN (13700325,13700273,13692054,13691440,13692237,13691503,13691512,13691621,13691632,13691653,13691659,13691682,13691702,13691705,13691720,13691723,13691728,13691763,13691782,13691786,13691815,13691818,13691826,13691828,13691830,13691906,13691831,13691843,13691846,13691848,13691849,13691860,13691873,13691881,13691893,13691945,13691939,13691942,13691977,13692043,13691995,13691996,13692004,13692006,13692011,13692012,13692033,13692090,13692042,13692055,13692062,13692225,13692072,13692089,13692111,13692122,13692142,13692156,13692167,13692170,13692173,13692176,13692178,13692183,13692187,13692193,13692202,13692208,13692217,13692216,13692236,13692238,13692239,13692260,13692274,13692275,13692276,13692280,13692285,13692289,13692291,13692293,13692299,13692305,13692313,13692338,13692348,13692355,13692356,13692365,13692370,13692372,13692376,13692696,13692707,13692701,13692708,13692715,13692727,13692734,13692733,13692735,13692758,13692761,13692762,13692781,13692851,13692854,13694572,13697874,13697878,13697885,13697891,13697897,13697898,13697903,13697904,13697912,13697914,13699899,13697949,13697971,13697976,13698138,13698775,13699851,13699854,13699858,13699859,13699880,13699905,13699908,13699909,13699918,13699932,13699936,13699950,13699959,13699971,13700298,13700162,13700163,13700174,13700177,13700195,13700201,13700238,13700248,13700267,13700276,13700303,13700393,13700399,13700410) ORDER BY usishow

(containing the three tables a1, m1 and p1. p1.id is the primary index of p1.) imagine using FIND_IN_SET(p1.id, 13700325,13700273,...) > 0 using no index, results in 14 sec (!) execution time. versus

p1.id IN (13700325,13700273,...)

using primary index of p1, results in 0.02 sec (!) execution time.

so, test and EXPLAIN (EXPLAIN SELECT ....) all queries using FIND_IN_SET with columns that contain single values! It's worth the efford comparing with IN (...) - in most of the cases, you save a lot of time, memory (mayby disk reads) and cpu power!

score -1 · Answer 3 · answered Sep 16 '13 at 10:09

According to my understanding,

select * from table where find_in_set(x, 'a,b')

Would work faster because it searches a value against a list of values and skips the search as soon as it finds a match. But OR works for all the values in the column.

Correct me if i am wrong.

Is there any difference (index selection? speed?) in find_in_set vs or

3 Answers3

Linked