-1

I've a transaction table of purchased and returned items, and I want to match a return transaction with the transaction where that corresponding item was purchased. (Here I used the same item ID and amount in all records for simplicity)

trans_ID date item_ID amt type
1 2022-01-09 100 5000 purchase
2 2022-01-07 100 5000 return
3 2022-01-06 100 5000 purchase
4 2022-01-05 100 5000 purchase
5 2022-01-04 100 5000 return
6 2022-01-03 100 5000 return
7 2022-01-03 100 5000 purchase
8 2022-01-02 100 5000 purchase
9 2022-01-01 100 5000 return

Matching conditions are:

  1. The return date must be greater than or equal the purchase date
  2. The return and purchase transactions must relate to the same item's ID and same transaction amount
  3. For each return, there must be only 1 purchase matched to it (In case there are many related purchases, choose one with the most recent purchase date. But if the most recent purchase was already used for mapping with another return, choose the second-most recent purchase instead, and so on.)
  4. From 3), that means each purchase must be matched with only 1 return as well.

The result should look like this.

trans_ID date trans_ID_matched date_matched
2 2022-01-07 3 2022-01-06
5 2022-01-04 7 2022-01-03
6 2022-01-03 8 2022-01-02

This is what I've tried.

with temp as (
select  a.trans_ID, a.date
        , b.trans_ID as trans_ID_matched
        , b.date as date_matched
        , row_number() over (partition by a.trans_ID, a.date order by b.date desc) as rn1
from
(
    select *
    from transaction_table
    where type = 'return'
) a
inner join
(
    select *
    from transaction_table
    where type = 'purchase'
) b
on a.item_ID = b.item_ID and a.amount = b.amount and a.date >= b.date
)
select * from temp where rn = 1

But what I got is

trans_ID date trans_ID_matched date_matched
2 2022-01-07 3 2022-01-06
5 2022-01-04 7 2022-01-03
6 2022-01-03 7 2022-01-03

Here, the trans ID 7 shouldn't be used again in the last row as it has been already matched with trans ID 5 in the row 2. So is there any way to match trans ID 6 with 8 (or any way to tell SQL not to use the already-used one like the purchase 7) ?

  • So if the return date can be greater *and* equal to, how do you know ID 6 is *not* the return of purchase 7? – Stu Mar 29 '22 at 10:23
  • @Stu Actually ID 6 could be the return of purchase 7 as well. And if that's the case, ID 5 must be matched with purchase 8 since purchase 7 has been used. Basically the objective of this matching is just to find whether or not a return has any purchase that meets the mentioned criteria. If there's any that matches, we would treat both transactions (return and the matched purchase) as invalid and not include them when summing all valid transaction amounts. Sorry if I'm not being clear enough. – John Watts Mar 29 '22 at 16:09
  • In your example is your trans_ID accurate? Do your new events actually have a smaller ID than old events? – Derrick Moeller Mar 29 '22 at 16:32
  • @DerrickMoeller Not necessary. In real case trans ID is randomly created. – John Watts Mar 29 '22 at 17:16

1 Answers1

0

I created a fiddle, the result seem OK, but it's up to you to test if this is OK on all situtations.....

WITH cte as (
    SELECT 
        t1.trans_ID,
        t1.[date],
        t1.item_ID,
        t1.amt,
        t1.[type],
        pur.trans_ID   trans_ID_matched,
        pur.[date]     datE_matched,
        jojo.c
    FROM table1 t1
    CROSS APPLY (
       SELECT  
           trans_ID,
           item_ID,
           [date],
           amt
       FROM table1 pur
       WHERE pur.[type] = 'purchase'  and t1.[type]='return'
         and pur.item_ID = t1.item_ID
          and pur.amt = t1.amt
          and pur.[date] <= t1.[date]
    ) pur 
    CROSS APPLY (
       SELECt count(*) as c FROM table1 WHERE trans_ID> t1.trans_ID and trans_ID<pur.trans_ID
    ) jojo
    where jojo.c <=2
) 
select 
   trans_ID,
   [date],
   item_ID,
   amt,
   CASE WHEN min(c)=0 then min(trans_ID_matched) else max(trans_ID_matched) end
from cte
group by 
   trans_ID,
   [date],
   item_ID,
   amt
order by trans_ID;

DBFIDDLE

  • The count(*) detects the distance between the selected trans_ID from the return and the purchase.
  • This might go wrong the are more than 2 adjacent 'returns'... (I am afraid it will break, so I did not test this ).

But is's a nice problem. Hopefully this will give you any other ideas to find the correct sulution!

Luuk
  • 12,245
  • 5
  • 22
  • 33