
I have a table orders with the following fields:

order_id: varchar
user_id:  varchar
item_id:  varchar
datetime: timestamp
quantity: int

The following query finds the most recent item for each user_id, considering only orders with quantity > 10.

WITH ranked AS (
    SELECT *,
        row_number() OVER (PARTITION BY user_id ORDER BY datetime DESC) AS rn
    FROM orders
    WHERE quantity > 10
)
SELECT user_id, item_id, datetime FROM ranked WHERE rn = 1

Now, instead of only considering orders with quantity > 10, I want the same information for several different thresholds, such as 20, 50, and 100.

One option is to run three separate queries, one per threshold, but that seems inefficient. Is there a better query?

Thanks.

P.S.: I need this for Spark SQL or Postgres, though generic SQL is fine.

Averell
  • Use a case expression to define groups to partition by (see the sketch below these comments). Having sample data and expected results to work with would be useful. – Stu Mar 20 '22 at 11:15
  • You are worried about efficiency even though you don't know how long it takes in the first place? – jjanes Mar 20 '22 at 15:18
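For reference, here is a minimal sketch of one single-query approach (Postgres syntax; the thresholds CTE and the rn alias are names chosen here for illustration): join orders against an inline list of thresholds and partition by both user_id and threshold, so a single pass over the table covers every threshold.

WITH thresholds(threshold) AS (
    -- Illustrative inline list of thresholds; extend as needed.
    VALUES (20), (50), (100)
),
ranked AS (
    SELECT o.*, t.threshold,
        row_number() OVER (
            -- Rank each user's orders separately per threshold.
            PARTITION BY o.user_id, t.threshold
            ORDER BY o.datetime DESC
        ) AS rn
    FROM orders o
    -- An order with quantity 120 matches all three thresholds.
    JOIN thresholds t ON o.quantity > t.threshold
)
SELECT user_id, threshold, item_id, datetime
FROM ranked
WHERE rn = 1;

In Spark SQL, the inline list can be written as SELECT * FROM VALUES (20), (50), (100) AS t(threshold) inside the CTE. Note that a CASE expression in the PARTITION BY, as Stu suggests, would assign each order to exactly one non-overlapping band, whereas the join lets one large order count toward every threshold it exceeds.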

0 Answers