I have table with over 1.5 billion of rows, high load project. I want to find duplicates by 3 columns using this query:
SELECT v.product_id
, v.value_id
, COUNT(*) counter
FROM ann2_product_value AS v
GROUP
BY v.product_id
, v.value_id
, v.set
HAVING counter > 1
ORDER
BY v.product_id
, v.value_id
LIMIT 20;
and the create table code is:
CREATE TABLE `ann2_product_value` (
`product_id` bigint(20) unsigned NOT NULL,
`value_id` int(11) NOT NULL,
`index` int(11) NOT NULL DEFAULT '0',
`set` int(11) NOT NULL,
`fitment` enum('yes','no') NOT NULL DEFAULT 'yes',
PRIMARY KEY (`product_id`,`value_id`,`index`,`set`,`fitment`),
KEY `value_id` (`value_id`),
KEY `product_fitment` (`product_id`,`fitment`),
CONSTRAINT `ann2_product_value_ibfk_1` FOREIGN KEY (`product_id`) REFERENCES `xcart_products` (`productid`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `fk_value_id_to_value` FOREIGN KEY (`value_id`) REFERENCES `ann2_value` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 STATS_PERSISTENT=1 STATS_AUTO_RECALC=0 ROW_FORMAT=DYNAMIC
The problem is that query above is very and very slow and explain of query showed me these results:
+----+-------------+----------+-------+----------------------------------+----------+---------+------+------------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+----------------------------------+----------+---------+------+------------+----------------------------------------------+
| 1 | SIMPLE | prod_val | index | PRIMARY,value_id,product_fitment | value_id | 4 | NULL | 1499189079 | Using index; Using temporary; Using filesort |
+----+-------------+----------+-------+----------------------------------+----------+---------+------+------------+----------------------------------------------+
And main thing I want is skip first record and select after the first record duplicates. How can I do this? Plesae help. Thanks.