I have the below table (simplified) where there are approximately 2 million records in the table, with tickets.spam = FALSE
for the majority of records, and only 31 records with tickets.spam = TRUE
.
CREATE TABLE tickets (
id INT PRIMARY KEY,
account_id INT,
w_id INT,
spam BOOLEAN,
due_by DATE
);
The below query returns data in under 200ms.
SET SESSION query_cache_type = OFF;
SELECT SQL_NO_CACHE tickets.*
FROM tickets
WHERE tickets.account_id = 95
AND tickets.w_id IN (2, 3, 4, 5, 6)
AND tickets.spam = TRUE
LIMIT 25 OFFSET 0;
This is the query plan
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | tickets | p127 | ref | --- | index_acc_due_ws_id | 8 | const | 1049925 | 5.00 | Using index condition; Using where |
But if change the limit to 26 instead of 25 the execution time goes for 5 seconds.
FROM tickets
WHERE tickets.account_id = 95
AND tickets.w_id IN (2, 3, 4, 5, 6)
AND tickets.spam = TRUE
LIMIT 26 OFFSET 0;
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | tickets | p127 | ref | --- | index_acc_due_ws_id | 8 | const | 1049925 | 5.00 | Using index condition; Using where |
And index_acc_due_ws_id
is an index on account_id
, due_by
, w_id
, and id
in that order.
My understanding is when the LIMIT is greater than the available number of records it might default to a full table scan to fetch the required number of records. However, since we have 31 records I'm not sure why this happens at 26 records.
Also, how do we effectively work with columns that have a huge data skew?