0

I need to select pairs of rows from a table that meet certain criteria.
Here's an SQL fiddle that demonstrates what I want:

http://sqlfiddle.com/#!2/4fc2f/16/0

This works well when the table is small--but the problem is my production table may have millions of rows and thousands of rows for a particular date. Query takes 14-15 seconds to run for a particular date now. How can I improve it?

MySQL 5.5

scotru
  • 2,563
  • 20
  • 37
  • Do you have indexes in place? Do a `show create table`. BTW 14 seconds for a query that accesses millions of rows is not *that* bad – Jim Jan 04 '14 at 22:15
  • Yes, indexes on all columns shown here (there are more columns in the table but I'm simplifying). A straight select for a particular date seems to happen almost instantly ( – scotru Jan 04 '14 at 22:25

2 Answers2

1

you are missing an index. Try using EXPLAIN to analyze your query, it will help you a lot.

The solution is simple, here it is: http://sqlfiddle.com/#!2/56deb/1/0

You need to add an index that includes the columns used in the where statement in order they are used:

KEY `night_of_2` (`night_of`,`student_id`,`check_class`)

Also you need to force the use of the index on the join, since you are joining the table to itself:

JOIN checks checks2

FORCE INDEX ( night_of_2 ) ON ( checks1.night_of = checks2.night_of )

(if there is a better way i would like to know about it) :)

Regards,

Sinisa Valentic
  • 552
  • 6
  • 24
  • 1
    There is nothing here to suggest that forcing an index is required (it's a very bad idea unless you absolutely have to). And with the addition of the index that we both suggested, forcing this index results in a less efficient query (more rows examined). – symcbean Jan 04 '14 at 22:39
  • 1
    Thanks for your help! It took nearly an hour to add the index, but the query is nice and quick now. I did omit the FORCE INDEX per symcbean's recommendation. – scotru Jan 04 '14 at 23:46
  • If you don't mind, i would be interested in the results of the query also with included force index (speed and or explain). It would be much appreciated :) – Sinisa Valentic Jan 05 '14 at 00:13
1

Your indexes aren't particularly effective - you can have more than one column in an index. But in order to get the right indexes you need to look at every query you run on the database, the frequency of each one, and the distribution of the data. Or are you just asking us to do your homework? Based on the information provided, the query would be more efficient with an index on (night_of, check_class, student_id) and lose the existing indexes apart from the PK.

symcbean
  • 47,736
  • 6
  • 59
  • 94
  • Thanks for taking time to answer my question. You are indeed correct about the problem with my indexes, but it seems that ordering the index (night_of, student_id, check_class) as advised by Sinisa is more effective based on my data distribution. Thanks also for the comment on omitting the FORCE INDEX. – scotru Jan 04 '14 at 23:53