2
SELECT COUNT(*) AS cnt
FROM products
WHERE ExternalProductId IS NOT NULL
GROUP BY SourceId, ExternalProductId
HAVING cnt > 1

There is an index on (ExternalProductId, SourceId, AnotherField). An Explain shows the index is used. This is printed in the "Extra" column of explain:

Using where; Using index; Using temporary; Using filesort

When I run the query I see via SHOW PROCESSLIST:

Copying to tmp table on disk

Can I tweak this query to work in place on the index? I also don't mind if the results I get are slightly inaccurate because of other processes simultaneously working on this table - can I change the isolation level to increase the query's performance?

Riedsio
  • 9,758
  • 1
  • 24
  • 33
ripper234
  • 222,824
  • 274
  • 634
  • 905
  • 2
    I think you have some misconceptions. A temp table doesn't mean its copying all the data from the table to do its work. It only means its copying some intermediate result to a temp table for further processing. The temp table probably contains all records where the ExternalProductID is not null and the cnt column so it can evaluate the 'HAVING' clause. Second, transaction isolation has nothing to do with temp tables. – bot403 Dec 16 '10 at 15:37
  • @bot403 - I just talked to our DBA, and he agrees with you. I believe your comment is actually the best answer so far to this question. – ripper234 Dec 16 '10 at 17:40
  • Yes, I think the answer is both my #1 and ajreal's index suggestion. – bot403 Dec 16 '10 at 19:04
  • @bt403 - actually, take a look at @Riedsio's answer, I think he's got something there. – ripper234 Dec 17 '10 at 01:09

2 Answers2

3

If you reverse the columns in your GROUP BY to correspond with the ordering of the first two fields of your compound index, it'll use your compound index much more effectively.

SELECT COUNT(*) AS cnt
FROM products
WHERE ExternalProductId IS NOT NULL
GROUP BY ExternalProductId, SourceId
HAVING cnt > 1

Your query execution plain should turn into 'Using where; Using index', and get rid of both the temporary table and filesort caused by the other GROUP BY.

You'll still get the same results, but it'll be in a slightly different order.

Riedsio
  • 9,758
  • 1
  • 24
  • 33
  • Looks good, the explain really doesn't show a temp table now. I'll measure the query runtime to see if there's a difference. – ripper234 Dec 17 '10 at 01:09
0

A couple things to try:

  1. MySQL will automatically sort with a group by. If you don't care about sort order add an 'ORDER BY NULL' clause. That will take out the filesort and possibly the temp table.

  2. Remove the count(*) and use the name of a column in the index instead of the wildcard.

Also. What is your index? Can you show us the full table create statement?

bot403
  • 2,132
  • 15
  • 14