2

I have an table with all our orders. The orders are linked to a person. And every person is attached to a company. Now I need a list of all companies have never ordered before a certain date. The query works fine, but it's very slow.

This is my T-SQL query:

SELECT
    DISTINCT p1.company_id
FROM
    order o
    JOIN person p1 ON (o.person_id = p1.id AND p1.company_id IS NOT NULL)
WHERE
    o.orderDate > '2017-01-01'
AND
    o.orderDate < '2017-09-01'
AND NOT EXISTS (SELECT
                    p2.company_id
                FROM
                    order o2
                    JOIN person p2 ON (o2.person_id = p2.id AND p2.company_id = p1.company_id)
                WHERE
                    o2.orderDate < '2017-01-01')

I've already changed it from a NOT IN to a NOT EXISTS. Since this was what most people here recommended. Didn't help much. A better index improved the situation a bit, but the query is still slow. I assume it's because for every order it has to do the sub-query.

This is the execution plan: https://www.brentozar.com/pastetheplan/?id=SyAlAU3db
For simplicity reasons I removed a few WHERE clauses in my example above)

The query runs on Azure SQL (and SQL Server Express 12 for development)

Anyone has a better idea on how to solve this?

Remy
  • 12,555
  • 14
  • 64
  • 104
  • 2
    questions seeking performance help should include DDL,DML Of the tables involved along with test data..if your test data is large,try scripting out schema and stats for the table(`right click database->generate scripts->select specific database objects->in next screen select advanced and choose Script statistics)` and paste it in question..With this info any one repro the same issue you are facing.Otherwise it becomes very difficult to answer your question .Pasting server version also helps – TheGameiswar Aug 24 '17 at 12:46
  • 1
    Could you show us current execution plan here: https://www.brentozar.com/pastetheplan/ – Ruslan K. Aug 24 '17 at 12:54
  • @Rammy One more simple hit for you. Did you try EXCEPT statement? There are two similar sub-sets, here you go. Moreover, if the previous note does not make sense, analyse an execution plan for 1) indexes 2) column types. – hastrb Aug 24 '17 at 12:56
  • Cheers, will try out the suggestions below and updated my questions – Remy Aug 24 '17 at 14:11

4 Answers4

1

If you have execution plan to share, it will help on performance analyse.

I made some change on query as below, you may try if that improve it

SELECT p1.company_id
FROM  order o
INNER JOIN person p1 
    ON (o.person_id = p1.id AND p1.company_id IS NOT NULL)
GROUP BY p1.company_id
HAVING SUM(CASE WHEN  (o.orderDate > '2017-01-01' AND  o.orderDate < '2017-09-01') THEN 1 ELSE 0 END) > 0
      AND
      SUM(CASE WHEN  orderDate < '2017-01-01' THEN 1 ELSE 0 END) = 0
EricZ
  • 6,065
  • 1
  • 30
  • 30
1

What's about this one. Hope I understood the task correctly.

(
  SELECT p1.company_id
  FROM order o
  JOIN person p1 
    ON o.person_id = p1.id 
  WHERE p1.company_id IS NOT NULL
    AND o.orderDate > '2017-01-01'
    AND o.orderDate < '2017-09-01'
)
EXCEPT
(
  SELECT p2.company_id
  FROM order o2
  JOIN person p2 
    ON o2.person_id = p2.id 
  WHERE p2.company_id IS NOT NULL
    AND o2.orderDate < '2017-01-01'
)
hastrb
  • 410
  • 4
  • 12
  • Just needs to be distinct. – cloudsafe Aug 24 '17 at 13:33
  • @cloudsafe It does not require redundant `DISTINCT`. It will be unique set without duplicates. – hastrb Aug 24 '17 at 13:36
  • Perhaps I am misunderstanding, but what if there are many orders for the same person? – cloudsafe Aug 24 '17 at 13:53
  • @cloudsafe the final result of EXCEPT is always unique. I believe that any redundant DISTINCT here even reduces performance. Look at simple example for you. [link](https://www.screencast.com/t/Afc0mhT3a) The result for both queries is only one record with 1. – hastrb Aug 24 '17 at 14:17
  • Cheers! This works correctly, but only reduced the query time (on my laptop) from 12s to about 11s :-( – Remy Aug 24 '17 at 15:33
0

I think this would do it (oops missed the have not ordered)

;With FirstOrders
as
(
    Select p1.company_id   
    ,      MIN(o.orderDate) as FirstCompanyOrder
    From Orders o 
    Join Person P1 on o.person_id = p1.id
    Group by P1.Company_id
    Having MIN(o.OrderDate) < '2017-01-01'
)

Select distinct o.company_id
From      Orders      o 
Left join FirstOrders FO on o.Company_id = FO.ComapnyId
where FO.company_id is null
Jay Wheeler
  • 379
  • 2
  • 7
0

May be this will help you:

WITH cte AS
(
    SELECT o.person_id, MIN(o.orderDate) minOrderDate
        FROM order o 
        GROUP BY o.person_id
)
SELECT DISTINCT p1.company_id
    FROM person p1
    JOIN cte ON cte.person_id = p1.id
    WHERE p1.company_id IS NOT NULL AND cte.minOrderDate > '2017-01-01' AND cte.minOrderDate < '2017-09-01';
Ruslan K.
  • 1,912
  • 1
  • 15
  • 18