2

I've got two tables (NewProducts and OldProducts) that are being compared. NewProducts has about 68,000 records and OldProducts about 51,000. I'm using a covering index on each table, however the query is taking 20 minutes to execute, so I'm not using it properly. Does a covering index really apply with multiple tables? What am I doing wrong? Thank you.

Here is my query code and the indexes:

$querystring = "SELECT newProducts.Id, newProducts.SKU,
  newProducts.Title, oldProducts.Title, oldProducts.product_Id
        FROM
  newProducts, oldProducts
        WHERE
    trim(newProducts.SKU)=trim(oldProducts.SKU) and
    trim(newProducts.Title)=trim(oldProducts.Title) and
    oldProducts.Position=1 and
    oldProducts.Customer=$shop";


Indexes for NewProducts:
Primary: Id
Index:   SKU, Title, customer (not unique)

Indexes for OldProducts:
Primary: Id
Index: Product_id (not unique)
Index: SKU, Title, Postition, Customer (not unique)

?>

CREATE TABLE `NewProducts` (
`Id` bigint(11) NOT NULL,
`Title` varchar(120) COLLATE utf8_unicode_ci NOT NULL,
`Category` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`Office` varchar(150) COLLATE utf8_unicode_ci NOT NULL,
`Rehashed` smallint(6) NOT NULL,
`Quantity` smallint(6) NOT NULL,
`Price1` decimal(7,2) NOT NULL,
`Price2` decimal(7,2) NOT NULL,
`Price3` decimal(7,2) NOT NULL,
`Price4` decimal(7,2) NOT NULL,
`created_at` varchar(30) COLLATE utf8_unicode_ci DEFAULT NULL,
`OldQuantity` int(11) NOT NULL,
`SKU` varchar(55) COLLATE utf8_unicode_ci DEFAULT NULL,
`Source` varchar(12) COLLATE utf8_unicode_ci NOT NULL,
`customer` varchar(70) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `I-T-S` (`ItemId`,`Title`,`SKU`),
KEY `customer` (`customer`),
KEY `Title` (`Title`,`Rehashed`),
KEY `SKU` (`SKU`),
KEY `Title_2` (`Title`,`SKU`,`customer`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci  

 CREATE TABLE `OldProducts` (
 `barcode` varchar(55) COLLATE utf8_unicode_ci DEFAULT NULL,
 `compare_at_price` decimal(10,2) DEFAULT NULL,
 `created_at` varchar(30) COLLATE utf8_unicode_ci DEFAULT NULL,
 `fulfillment` varchar(35) COLLATE utf8_unicode_ci DEFAULT NULL,
 `grams` decimal(10,2) DEFAULT NULL,
 `id` bigint(11) NOT NULL,
 `management` varchar(55) COLLATE utf8_unicode_ci DEFAULT NULL,
 `policy` varchar(55) COLLATE utf8_unicode_ci DEFAULT NULL,
 `size` varchar(55) COLLATE utf8_unicode_ci DEFAULT NULL,
 `color` varchar(55) COLLATE utf8_unicode_ci DEFAULT NULL,
 `type` varchar(55) COLLATE utf8_unicode_ci DEFAULT NULL,
 `position` int(11) DEFAULT NULL,
 `price` varchar(15) COLLATE utf8_unicode_ci DEFAULT NULL,
 `product_id` bigint(11) NOT NULL,
 `SKU` varchar(55) COLLATE utf8_unicode_ci DEFAULT NULL,
 `Title` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
 `quantity` int(11) DEFAULT NULL,
 `customer` varchar(70) COLLATE utf8_unicode_ci NOT NULL,
 PRIMARY KEY (`id`),
 UNIQUE KEY `P-S-T-PO-CUST`      
 (`product_id`,`SKU`,`Title`,`position`,`customer`),
 KEY `product_id` (`product_id`),
user3314053
  • 239
  • 1
  • 3
  • 11
  • What is SKU? I don`t understand this field – ka_lin Sep 05 '15 at 18:01
  • You should showcase your existing table schemas, and the `CREATEd INDEXes`, perhaps even show a content excerpt. Also use `EXPLAIN` to reveal the query plan. It seems you should clean up the stored IDs and use a real JOIN instead of the WHERE clause mapping. (Though our SQL "people" can often infer such things, it's more useful to future users if you include as much detail as possible.) – mario Sep 05 '15 at 18:03
  • Difficult to do.... Obviously, I'm using aliases here. Basically one table has already been being converted to the other, and I'm looking for matching records to verify something else. It works, but it's slower than heck and I'm trying to speed it up obviously. What is different is that under 10,000 records in each it runs very quickly. Not sure why it slows down here. – user3314053 Sep 05 '15 at 18:16

1 Answers1

1

TRIM is the villain. When you hide an indexed column (eg, SKU) inside a function (eg, TRIM), the the index cannot be used.

Clean up your data:

  1. Fix the insertion code to TRIM before inserting (or as it inserts).
  2. UPDATE tbl SET SKU = TRIM(SKU), title = TRIM(title); -- for each table
  3. Change the SELECT: TRIM(SKU) --> SKU etc.

Even Better

oldProducts should have, in this order,

`INDEX(`position`,`customer` ,`SKU`,`Title`, `product_id`)

With this, the WHERE need look only at old rows for position=1 and customer =.... (Actually, the first 2 columns can be in any order; the last 3 in any order.)

Rick James
  • 135,179
  • 13
  • 127
  • 222