0

I have a query like the one shown below:

select count(test.id) from table1 
   inner join table2 on table1.id = table2.id
   where    (table2.email = 'test@gmail.com' 
   OR (table2.phone1 IS NOT NULL AND table2.phone1 in ('123456')) 
   OR (table2.phone2 IS NOT NULL AND table2.phone2 in ('1234456'))) 
   AND table2.id <> 1234 
   AND table2.created_at >= '2015-10-10' 
   AND table2.status NOT IN ('test') 
   AND table2.is_test = 'No';

I have an index on table2.email, table2.phone1, table2.phone2, table2.created_at. These are all single indexed and not composite indexes. As far as I know, a composite index on (table2.email, table2.phone1, table2.phone2) would not work because the conditions are OR conditions. I created a composite index on (table2.id, table2.created_at, table2.status, table2.is_test) but I got the same result in the explain query. The explain query is shown below

id  select_type table   type    possible_keys key     key_len ref rows Extra
 1  SIMPLE     table2   range   PRIMARY,     created_at  8        293  Using where
                                created_at,
                                email,
                                phone1,
                                phone2, 
                                com_index       
  1   SIMPLE    table1  eq_ref  PRIMARY       PRIMARY    4    id   1  Using index

Here com_index is the composite index I created. How can I create an index to speed this query up. It looks like from the explain result, the key selected for the query is created_at. Is there a way I can create a composite index for table 2? Please help me. Thanks in advance.

EDIT: explain for production on this query:

+----+-------------+-------+--------+----------------------------------------------+---------+---------+----------------------+--------+-------------+
| id | select_type | table | type   | possible_keys                                | key     | key_len | ref                  | rows   | Extra       |
+----+-------------+-------+--------+----------------------------------------------+---------+---------+----------------------+--------+-------------+
|  1 | SIMPLE      | l0_   | range  | PRIMARY,created_at,email,day_phone,eve_phone | PRIMARY | 4       | NULL                 | 942156 | Using where |
|  1 | SIMPLE      | m1_   | eq_ref | PRIMARY                                      | PRIMARY | 4       | lead_platform.l0_.id |      1 | Using index |
+----+-------------+-------+--------+----------------------------------------------+---------+---------+----------------------+--------+-------------+
Shikhar Subedi
  • 606
  • 1
  • 9
  • 26
  • You've got a bunch of constant conditions on `table2`, so you might get better results by joining on a materialized query which has pre-filtered candidates. Though, honestly, I'm not seeing why this query is particularly inefficient given the match size (293x1 rows). – bishop Nov 01 '16 at 19:32
  • @bishop this is in local environment, not in production. In production the table2 is much larger. – Shikhar Subedi Nov 01 '16 at 19:38
  • Ah. Can you please show an `EXPLAIN` on your production data? – bishop Nov 01 '16 at 23:22

2 Answers2

2

Genrally, MySQL can't use an index on 3 possible columns (email, phone1, phone2).

I suspect the rest of your conditions are not very specific and will not give great results on your production database (meaning, most of the items are not "Test" etc.).

Optimizing a query with an OR statement across multiple columns is tricky.

This article shows that splitting such a query to multiple queries with UNION can be much faster. In your case, it would be 3 queries combined, with no OR statements. This way MySQL can perform an index merge - using the indexes on email, phone1 and phone2.

Test it and let me know if it's faster on real data.

SELECT COUNT(DISTINCT(t.id)) FROM (
    SELECT test.id FROM table1 
       INNER JOIN table2 on table1.id = table2.id
       WHERE table2.email = 'test@gmail.com' 
       AND table2.id <> 1234 
       AND table2.created_at >= '2015-10-10' 
       AND table2.status NOT IN ('test') 
       AND table2.is_test = 'No'

    UNION ALL 

    SELECT test.id FROM table1 
       INNER JOIN table2 on table1.id = table2.id
       WHERE table2.phone1 IS NOT NULL AND table2.phone1 in ('123456')
       AND table2.id <> 1234 
       AND table2.created_at >= '2015-10-10' 
       AND table2.status NOT IN ('test') 
       AND table2.is_test = 'No'

    UNION ALL   

    SELECT test.id FROM table1 
       INNER JOIN table2 on table1.id = table2.id
       WHERE table2.phone2 IS NOT NULL AND table2.phone2 in ('1234456')
       AND table2.id <> 1234 
       AND table2.created_at >= '2015-10-10' 
       AND table2.status NOT IN ('test') 
       AND table2.is_test = 'No') AS t
Galz
  • 6,713
  • 4
  • 33
  • 39
1

Do not splay an array of things across columns. In this case, have a table of phone numbers with links back to table1. That table would have 0, 1, 2, or even more numbers for a given table1 type person. Then no OR or UNION is needed. Instead a JOIN is needed.

You have status and is_test; these seem redundant. Having lots of flags is not helpful; can these be combined?

I don't understand doing a lookup by email OR phone. Seems like you would have one or the other value, not both. In that case, construct the WHERE clause (and JOIN), then there won't be an OR for this.

After all that, have INDEX(phone) in the new table and INDEX(email) in table1. Since email and phone are probably very selective, I would not add status, etc, to the indexes.

Rick James
  • 135,179
  • 13
  • 127
  • 222