2

Howdie do,

I have the following SQL query which takes about 4 seconds to run:

select 
    o.id, tu.status_type, m.upload_date
from 
    (select order_id, max(last_updated) as maxudate from tracking_update group by order_id) t
inner join 
    tracking_update tu on t.order_id=tu.order_id and t.maxudate=tu.last_updated
right join 
    fgw247.order o on t.order_id=o.id
left join 
    manifest m on o.manifest_id=m.id
where 
    (m.upload_date >= '2015-12-12 00:00:00') or (m.upload_date <='2015-12-12 00:00:00' and tu.status_type != 'D' and tu.status_type != 'XD')

The query joins the following 3 tables:

Order, Manifest and Tracking_Update.

The query will return orders that meet the following criteria:

  1. Orders < 30 days with any delivery status
  2. Orders > 30 days and not a delivery status

An order is considered delivered if the latest tracking_update for that order has a status_type of 'D' or 'XD'

Now, the orders are listed in the Order table. The Order table has a column called manifest_id which references the Manifest that was created when the order was uploaded.

A manifest can have multiple orders. This is what is used to determine if an order has been uploaded in the past 30 days.

Finally, the tracking_update table contains the tracking_updates per order. An order can have multiple tracking_updates.

Currently, the tracking_update table is over 1M records long.

Listed below are the create statements for each table:

CREATE TABLE "order" (
  "id" int(11) NOT NULL AUTO_INCREMENT,
  "ShipmentId" varchar(50) DEFAULT NULL,
  "RecipientName" varchar(160) DEFAULT NULL,
  "CompanyName" varchar(160) DEFAULT NULL,
  "Address1" varchar(160) DEFAULT NULL,
  "Address2" varchar(160) DEFAULT NULL,
  "City" varchar(50) DEFAULT NULL,
  "State" varchar(3) DEFAULT NULL,
  "ZIP" int(11) DEFAULT NULL,
  "TEL" int(11) DEFAULT NULL,
  "Email" varchar(255) DEFAULT NULL,
  "Bottles" int(11) DEFAULT NULL,
  "Weight" float DEFAULT NULL,
  "Resi" tinyint(1) DEFAULT NULL,
  "Wave" int(11) DEFAULT NULL,
  "url_slug" varchar(50) DEFAULT NULL,
  "manifest_id" int(11) DEFAULT NULL,
  "shipment_date" datetime DEFAULT NULL,
  "tracking_number" varchar(30) DEFAULT NULL,
  "shipping_carrier" varchar(10) DEFAULT NULL,
  "last_tracking_update" datetime DEFAULT NULL,
  "delivery_date" datetime DEFAULT NULL,
  "customer_code" varchar(50) DEFAULT NULL,
  "sub_cust_code" int(11) DEFAULT NULL,
  PRIMARY KEY ("id"),
  UNIQUE KEY "ShipmentID" ("ShipmentId"),
  KEY "manifest_id" ("manifest_id"),
  KEY "order_idx3" ("tracking_number"),
  KEY "order_idx4" ("customer_code"),
  CONSTRAINT "order_ibfk_1" FOREIGN KEY ("manifest_id") REFERENCES "manifest" ("id")
);

Tracking_Update table:

CREATE TABLE "tracking_update" (
  "id" int(11) NOT NULL AUTO_INCREMENT,
  "order_id" int(11) DEFAULT NULL,
  "ship_update_date" datetime DEFAULT NULL,
  "message" varchar(400) DEFAULT NULL,
  "location" varchar(100) DEFAULT NULL,
  "status_type" varchar(2) DEFAULT NULL,
  "last_updated" datetime DEFAULT NULL,
  "hash" varchar(32) DEFAULT NULL,
  PRIMARY KEY ("id"),
  KEY "order_id" ("order_id"),
  KEY "tracking_update_idx2" ("status_type"),
  KEY "tracking_update_idx3" ("ship_update_date"),
  CONSTRAINT "tracking_update_ibfk_1" FOREIGN KEY ("order_id") REFERENCES "order" ("id")
);

Manifest table:

CREATE TABLE "manifest" (
  "id" int(11) NOT NULL AUTO_INCREMENT,
  "upload_date" datetime DEFAULT NULL,
  "name" varchar(100) DEFAULT NULL,
  "destination_gateway" varchar(40) DEFAULT NULL,
  "arrived" tinyint(1) DEFAULT NULL,
  "customer_code" varchar(50) DEFAULT NULL,
  "upload_user" varchar(50) DEFAULT NULL,
  "trip_id" int(11) DEFAULT NULL,
  PRIMARY KEY ("id")
);

Here is also the EXPLAIN on the select statement exported with JSON:

{
    "data":
    [
        {
            "id": 1,
            "select_type": "PRIMARY",
            "table": "m",
            "type": "ALL",
            "possible_keys": "PRIMARY",
            "key": null,
            "key_len": null,
            "ref": null,
            "rows": 220,
            "Extra": "Using where"
        },
        {
            "id": 1,
            "select_type": "PRIMARY",
            "table": "o",
            "type": "ref",
            "possible_keys": "manifest_id",
            "key": "manifest_id",
            "key_len": "5",
            "ref": "fgw247.m.id",
            "rows": 246,
            "Extra": "Using index"
        },
        {
            "id": 1,
            "select_type": "PRIMARY",
            "table": "tu",
            "type": "ref",
            "possible_keys": "order_id",
            "key": "order_id",
            "key_len": "5",
            "ref": "fgw247.o.id",
            "rows": 7,
            "Extra": "Using where"
        },
        {
            "id": 1,
            "select_type": "PRIMARY",
            "table": "<derived2>",
            "type": "ref",
            "possible_keys": "<auto_key0>",
            "key": "<auto_key0>",
            "key_len": "11",
            "ref": "fgw247.o.id,fgw247.tu.last_updated",
            "rows": 13,
            "Extra": "Using index"
        },
        {
            "id": 2,
            "select_type": "DERIVED",
            "table": "tracking_update",
            "type": "index",
            "possible_keys": "order_id",
            "key": "order_id",
            "key_len": "5",
            "ref": null,
            "rows": 1388275,
            "Extra": null
        }
    ]
}

Any help is appreciated

UPDATE

This is the current query that I'm using which is MUCH faster:

SELECT 
    o.*, tu.*
FROM
    fgw247.`order` o
JOIN
    manifest m
ON
    o.`manifest_id` = m.`id` 

JOIN
    `tracking_update` tu
ON
    tu.`order_id` = o.`id` and tu.`ship_update_date` = (select max(last_updated) as last_updated from tracking_update where order_id = o.`id` group by order_id)
WHERE
    m.`upload_date` >= '2015-12-14 11:50:12' 
    OR 
        (o.`delivery_date` IS NULL AND m.`upload_date` < '2015-12-14 11:50:12')
LIMIT 100
Jimmy
  • 887
  • 1
  • 10
  • 24

1 Answers1

1

Consdider the the following options to optimise the execution of your query:

  1. Join types: are you sure that you need the left and right joins? Do you want to report on orders that are not even included in a manifest? If not, then use inner join instead of left and right.

  2. Additional indexes: There is no index on manifest.upload_date field, you should add one. I would also create a composite index on order_id, update_date, and status_type fields (in this order!) on tracking_update table.

  3. Use union instead of the or criterion in where: Mysql is traditionally not really good at optimising or criterion in where clauses. So turn the (m.upload_date >= '2015-12-12 00:00:00') or (m.upload_date <='2015-12-12 00:00:00' and tu.status_type != 'D' and tu.status_type != 'XD') into a union with 2 queries. The 2 queries will be the same as the existing query up to the where part. The 1st query will have only the (m.upload_date >= '2015-12-12 00:00:00') condition in the where clause, while the 2nd one will have the (m.upload_date <='2015-12-12 00:00:00' and tu.status_type != 'D' and tu.status_type != 'XD') criteria.

If you add new indexes, then pls check if the query uses them by running a new explain.

Shadow
  • 33,525
  • 10
  • 51
  • 64
  • thank you for your response. Please see my updated query. I'm still not sure I'm understanding about the union. Are you saying create two different select statements? – Jimmy Jan 12 '16 at 16:16
  • You changed the joins to inner joins (this was my first suggestion), changed the where criteria (I cannot comment on that, since I do not know your data), and changed the subquery from derived table to value request. I cannot tell which change sped up the new version. – Shadow Jan 13 '16 at 11:10
  • I know that I prefer to have the derived table version over your choice because In case of a derived table the subquery is executed only once. With your version there is a risk that for each record the subquery is executed separately. However, if the joins would eliminate most of the records in the derived table, then the solution in your updated version can also be quick. I do not know your data, so cannot comment on that either. – Shadow Jan 13 '16 at 11:10