0

INPUT: this is the Order_Tbl table with each row equivalent to a record of a transaction:

ORDER_DAY ORDER_ID PRODUCT_ID QUANTITY PRICE
2015-05-01 ODR1 PROD1 5 5
2015-05-01 ODR2 PROD2 2 10
2015-05-01 ODR3 PROD3 10 25
2015-05-01 ODR4 PROD1 20 5
2015-05-02 ODR5 PROD3 5 25
2015-05-02 ODR6 PROD4 6 20
2015-05-02 ODR7 PROD1 2 5
2015-05-02 ODR8 PROD5 1 50
2015-05-02 ODR9 PROD6 2 50
2015-05-02 ODR10 PROD2 4 10

EXPECTED OUTPUT: the task is to write a T-SQL query to get products that was ordered on 02-May-2015 but not on any other days before that:

ORDER_DAY ORDER_ID PRODUCT_ID QUANTITY PRICE
2015-05-02 ODR6 PROD4 6 20
2015-05-02 ODR8 PROD5 1 50
2015-05-02 ODR9 PROD6 2 50

I did try 2 solutions with a same approach: using a (completely identical) subquery to get a list of distinct products that were ordered before 02-May-2015, and then somehow putting it after the NOT IN operator inside the WHERE clause of the main query.

Solution 1: the subquery was passed in as an CTE. It throws a syntax error ...

WITH CTE AS
(
    SELECT DISTINCT PRODUCT_ID
    FROM Order_Tbl
    WHERE ORDER_DAY < '2015-05-02'
)
SELECT *
FROM Order_Tbl
WHERE ORDER_DAY = '2015-05-02'
  AND PRODUCT_ID NOT IN CTE

Solution 2: the subquery was embedded into the WHERE clause of the main query. This worked!

SELECT *
FROM Order_Tbl
WHERE ORDER_DAY = '2015-05-02'
  AND PRODUCT_ID NOT IN (SELECT DISTINCT PRODUCT_ID
                         FROM Order_Tbl
                         WHERE ORDER_DAY < '2015-05-02')

What was the nuance that made SQL behave and return different results? I would appreciate it if you guys could give me a clear explanation as well as some useful notes for further SQL implementations.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 3
    You seem to believe you can write SQL like C# or something; `(NOT) IN {Table Name|Table Alias|CTE Name}` simply isn't valid syntax. A `(NOT) IN` expects either a list of scalar values or a subquery, both of which should go in parentheses. – Thom A May 10 '23 at 12:26
  • You did never use the CTE. (It's expected to be found in a FROM clause. E.g. something similar to `where PRODUCT_ID not in (select PRODUCT_ID from CTE)`.) – jarlh May 10 '23 at 12:29
  • @jarlh you can define a CTE and *not* use it in the following statement. *Why* you'd do that, I've no idea, but you *can*. – Thom A May 10 '23 at 12:30
  • @ThomA, I know. But I do expect someone defining a cte to also use it. – jarlh May 10 '23 at 12:31
  • As a side note, generally I avoid `NOT IN`, especially when using a subquery. `NOT IN` can result in (documented) behaviour that some users don't expect when dealing with `NULL` values. `NOT EXISTS` is often a "better" solution. – Thom A May 10 '23 at 12:34
  • 1
    Also `DISTINCT` is pointless in an `IN` or `EXISTS` subquery – Charlieface May 10 '23 at 13:43

2 Answers2

1

The task is to write a T-SQL query to get products that was ordered on 02-May-2015 but not on any other days before that

It has been explained by Thom A in the comments that NOT IN does not accept the first syntax you tried, and that, even with the right syntax, NOT EXISTS is in general preferable to NOT IN:

select *
from order_tbl o
where order_day = '2015-05-02'
  and not exists (
    select 1 
    from order_tbl o1 
    where o1.product_id = o.product_id and o1.order_day < o.order_day 
)

The query ensures that there is no row in the table for the same product and an erlier order date. For performance, consider an index on order_tbl(product_id, order_day).

But overall, it is probably simpler and more efficient to use window functions ; the subquery can just be replaced with a window min():

select *
from (
    select o.*, min(order_day) over(partition by product_id) min_order_day
    from order_tbl o
) o
where order_day = '2015-05-02' and min_order_day = order_day
GMB
  • 216,147
  • 25
  • 84
  • 135
0

In fact, CTE A CTE allows you to define a temporary named result set that available temporarily, you should treat it like a table and query and write your code in this way


select *
from Order_Tbl
where ORDER_DAY = '2015-05-02'
    and PRODUCT_ID not in (
        select distinct PRODUCT_ID
        from Order_Tbl
        where ORDER_DAY < '2015-05-02'
        )

Cte

abolfazl sadeghi
  • 2,277
  • 2
  • 12
  • 20