2

Consider a MySql table having the following schema

+-------------------+------------+------+-----+-
| Field             | Type       | Null | Key | 
+-------------------+------------+------+-----+-
| id                | int(11)    | NO   | PRI | 
| user_id           | int(11)    | YES  | MUL | 
| following_user_id | int(11)    | NO   | MUL | 
+-------------------+------------+------+-----+-

Now i need queries like

select * from table where user_id = <x> and following_user_id = <y>;

and also

select * from table where following_user_id = <x> and user_id = <y>;

So I am considering composite indexes on the 2 columns like so:

index(user_id, following_user_id)

AND

index(following_user_id, user_id)

1) The indexes are created as desired, but will they work when the records are many (~ Millions) ?

2) Will the indexes speed up the queries, using the right index at the right time?

PS: I don't need sort/range selection queries, only direct match queries. Is there any better indexing scheme available for this requirement?

Arindam
  • 998
  • 1
  • 8
  • 20
  • You should extract an EXPLAIN of your queries, but I'm almost sure that only one index is needed and will handle both combinations by itself – Sebas Feb 21 '13 at 18:59
  • I do know the second query can be written like so: `select * from table where user_id = and following_user_id = `. but my business/app logic demands this redundancy – Arindam Feb 21 '13 at 19:00
  • 1
    The indexes should work. If you always use both fields in your where condition, you only would need one compound index. It would be used by both of your queries. The order of your WHERE clause shouldn't matter. – Tom Feb 21 '13 at 19:03
  • @Tom the order of `where` **does** matter and depends directly on the way the index is defined. See @Gordon's answer for a broader explanation – Arindam Feb 21 '13 at 19:04
  • @Arindam No, no and no. The way you write the `WHERE` does not matter. `WHERE user_id = AND following_user_id = AND` and `WHERE following_user_id = AND user_id = ` are 100% equivalent and will use any of the 2 indexes. – ypercubeᵀᴹ Feb 22 '13 at 12:51
  • @ypercube of course yes. I had thought, mistakenly, that Tom meant that the order of where doesn't matter in general with composite indexes. I get your exasperation :-) – Arindam Feb 22 '13 at 13:31
  • Well, it doesn't `WHERE (condition a) AND (condition b)` and `WHERE (condition b) AND (condition a)` will yield the same execution plan. Either you don't understand what I'm saying or I don't what you are (saying). – ypercubeᵀᴹ Feb 22 '13 at 13:43
  • Dude, read my comment in reply to Tom, I said "order of `where` does matter and depends on the index definition". Since in my question I have stated both versions of the index, hence in this case, it will not matter. Otherwise it **does**. – Arindam Feb 22 '13 at 13:54

1 Answers1

2

Your queries are the same from the perspective of the compiler. Either index will work. The order of the clauses in the where statement is immaterial to qualifying a query for an index.

However, if you have inequalities or only one clause, then the ordering in the index makes a difference.

So, the index index(user_id, following_user_id) would be useful for these situations:

  • any comparison directly on user_id (except <>)
  • user_id = XXX and folowing_user_id = YYY
  • user_id = XXX and folowing_user_id /IN values

It would not be used for:

  • folowing_user_id < YYY
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Yup i know. My meaning of asking this was whether the 2 indexes will function happily without any conflicts inside the index selection subroutine, whatever that is. – Arindam Feb 21 '13 at 19:02
  • @Arindam . . . Having both index should be fine. It is not necessary for the examples in your question, though. – Gordon Linoff Feb 21 '13 at 19:05
  • You could say, yeah. I am thinking removing the additional index and re-phrasing the queries to use the single index. – Arindam Feb 21 '13 at 19:06