1

I need a "little" help with an SQL query (MySQL).

I have the following tables:

COURIERS table:

    +------------+
    | COURIER_ID |
    +------------+

DELIVERIES table:

    +-------------+------------+------------+
    | DELIVERY_ID | COURIER_ID | START_DATE |
    +-------------+------------+------------+

ORDERS table:

    +----------+-------------+-------------+
    | ORDER_ID | DELIVERY_ID | FINISH_DATE |
    +----------+-------------+-------------+

COORDINATES table:

    +-------------+-----+-----+------+
    | DELIVERY_ID | LAT | LNG | DATE |
    +-------------+-----+-----+------+

In the real database I have more columns in each table, but for this example the above columns are enough.

What do I need?

  • An SQL query that returns all couriers [COURIER_ID], their last delivery [DELIVERY_ID] (based on last START_DATE), the delivery's last coordinate [LAT and LNG] (based on last DATE) and the remaining orders count (total of orders of the last delivery that have no FINISH_DATE).

  • A courier can have no deliveries, in this case I want DELIVERY_ID = NULL, LAT = NULL and LNG = NULL in the result.

  • A delivery can have no coordinates, in this case I want LAT = NULL and LNG = NULL in the result.

What was I able to do?

SELECT c.`COURIER_ID`,
       d.`DELIVERY_ID`,
       r.`LAT`,
       r.`LNG`,
  (SELECT COUNT(DISTINCT `ORDER_ID`)
   FROM `ORDERS`
   WHERE `DELIVERY_ID` = d.`DELIVERY_ID`
     AND `FINISH_DATE` IS NULL) AS REMAINING_ORDERS

FROM `COURIERS` AS c
LEFT JOIN `DELIVERIES` AS d USING (`COURIER_ID`)
LEFT JOIN `COORDINATES` AS r ON r.`DELIVERY_ID` = d.`DELIVERY_ID`

WHERE (CASE WHEN
         (SELECT MAX(`START_DATE`)
          FROM `DELIVERIES`
          WHERE `COURIER_ID` = c.`COURIER_ID`) IS NULL THEN d.`START_DATE` IS NULL ELSE d.`START_DATE` =
         (SELECT MAX(`START_DATE`)
          FROM `DELIVERIES`
          WHERE `COURIER_ID` = c.`COURIER_ID`) END)
  AND (CASE WHEN
         (SELECT MAX(`DATE`)
          FROM `COORDINATES`
          WHERE `DELIVERY_ID` = d.`DELIVERY_ID`) IS NULL THEN r.`DATE` IS NULL ELSE r.`DATE` =
         (SELECT MAX(`DATE`)
          FROM `COORDINATES`
          WHERE `DELIVERY_ID` = d.`DELIVERY_ID`) END)
GROUP BY c.`COURIER_ID`
ORDER BY d.`START_DATE` DESC

The problem is that this query is very slow (from 5 to 20 seconds) when I have over 5k COORDINATES and it does not returns all couriers sometimes.

Thank you so much for any solution.

Lucas NN
  • 758
  • 5
  • 16
  • You haven't really asked a question. It appears you're looking for help optimizing the query you already wrote. You should clarify your question. If you are seeking help to optimize the query, you should specify what your desired results are. – axiopisty Sep 05 '14 at 04:27
  • I believe it is a question because this query does not works like I need, regardless it's performance. Performance is just one of the problems. Thanks. – Lucas NN Sep 05 '14 at 04:31
  • Must you return all the results in a single database query? Or would it be okay for you to issue a couple queries to the database so long as the end result is the right data set? – axiopisty Sep 05 '14 at 04:36
  • It would also be helpful if you would include sample data in your tables along with the expected results. – axiopisty Sep 05 '14 at 04:38
  • Use max(date) may get multiple rows instead of the last one. – Jaugar Chang Sep 05 '14 at 04:38
  • @axiopisty For me the best way is using just one query, but if you can think in a good solution using more than one query I can change my code. – Lucas NN Sep 05 '14 at 04:40
  • @axiopisty I will try to get some sample data. Thanks. – Lucas NN Sep 05 '14 at 04:41
  • @JaugarChang Oh, good eye, I didn't realize it. – Lucas NN Sep 05 '14 at 04:42
  • Your can generate two subquerys of `delivery's last coordinates` and `courier's last deliveries` first. Than join them together. It should be more efficient. There is a solution of [getting first or last row in a specific group using MySql](http://stackoverflow.com/a/25592745/3630826) you can reference – Jaugar Chang Sep 05 '14 at 04:47
  • @axiopisty Why did you delete your answer? I would like to try both answers and choose the best one. =) – Lucas NN Sep 05 '14 at 05:18
  • I deleted it because it was incorrect. I forgot to include the remaining orders count. But now I corrected the answer and reposted it. It might still be wrong due to my interpretation of your question. It is not clear if the finish_date in question is the one on the orders table or the deliveries table. – axiopisty Sep 05 '14 at 05:21
  • @axiopisty Thanks. The `FINISH_DATE` in the `DELIVERIES` table is useless in this case. I will edit the question. – Lucas NN Sep 05 '14 at 05:27

2 Answers2

2

Try this:

SELECT C.COURIER_ID, D.DELIVERY_ID, D.START_DATE, D.FINISH_DATE, 
       B.LAT, B.LNG, B.DATE, C.NoOfOrders
FROM COURIERS C 
LEFT JOIN ( SELECT * 
            FROM (SELECT * 
                  FROM DELIVERIES D 
                  ORDER BY D.COURIER_ID, D.START_DATE DESC
                ) A 
            GROUP BY COURIER_ID
          ) AS A ON C.COURIER_ID = A.COURIER_ID 
LEFT JOIN ( SELECT * 
            FROM (SELECT * 
                  FROM COORDINATES CO 
                  ORDER BY CO.DELIVERY_ID, CO.DATE DESC
                ) B
            GROUP BY CO.DELIVERY_ID
          ) AS B ON A.DELIVERY_ID = B.DELIVERY_ID 
LEFT JOIN ( SELECT O.DELIVERY_ID, COUNT(1) NoOfOrders
            FROM ORDERS O WHERE FINISH_DATE IS NULL
            GROUP BY O.DELIVERY_ID
          ) AS C ON A.DELIVERY_ID = C.DELIVERY_ID;
Saharsh Shah
  • 28,687
  • 8
  • 48
  • 83
1

I haven't been able to test this query since I don't have a mysql database set up right now, much less with this schema and sample data. But I think this will work for you:

select
  c.courier_id
  , d.delivery_id
  , co.lat
  , co.lng
  , oc.cnt as remaining_orders
from
  couriers c
  left join (
    select
      d.delivery_id
      , d.courier_id
    from
      deliveries d
      inner join (
        select
          d.delivery_id
          , max(d.start_date) as start_date
        from
          deliveries d
        group by
          d.delivery_id
      ) dmax on dmax.delivery_id = d.delivery_id and dmax.start_date = d.start_date
  ) d on d.courier_id = c.courier_id
  left join (
    select
      c.delivery_id
      , c.lat
      , c.lng
    from
      coordinates c
      inner join (
        select
          c.delivery_id
          , max(c.date) as date
        from
          coordinates c
        group by
          c.delivery_id
      ) cmax on cmax.delivery_id = c.delivery_id and cmax.date = c.date
  ) co on co.delivery_id = d.delivery_id
  left join (
    select
      o.delivery_id
      , count(o.order_id) as cnt
    from
      orders o
    where
      o.finish_date is null
    group by
      o.delivery_id
  ) oc on oc.delivery_id = d.delivery_id
axiopisty
  • 4,972
  • 8
  • 44
  • 73