17

I am working with a table that contains credit transactions where I want to display who's credits were spent when a sale is made.

In the table:

  • Credits are added by an entity using a unique entity code (recorded in column GivenByUserCode)
  • Credit additions always have such a code.
  • Credits that are spent will always have a negative value.
  • Credits that are spent will not have an entity code (value of GivenByUserCode is null).

Using the above data as an example if a user makes a purchase on 2018-01-02 the report should show all these credits originated from BM01. What add's complexity is that a purchase could be split over multiple additions, see the purchase on 2018-02-03 which is divided over 3 additions.

I think the solution will have something to do with using cte and over but I have no experience using these. I did find a similar (not same) problem on SqlServerCentral.

Any help / direction would be most appreciated.


Input and DDL

DECLARE @CreditLogs TABLE(CreditLogId int not null identity(1,1), Credits INT NOT NULL, OccurredOn DATETIME2(7) NOT NULL, GivenByUserCode VARCHAR(100) NULL)

INSERT INTO @CreditLogs (Credits, OccurredOn, GivenByUserCode) VALUES
  (10,  '2018-01-01', 'BM01')
, (10,  '2018-01-01', 'BM01')
, (-10, '2018-01-02', NULL)
, (-5,  '2018-01-04', NULL)
, (5,   '2018-02-01', 'SP99')
, (40,  '2018-02-02', 'BM02')
, (-40, '2018-02-03', NULL)
, (-4,  '2018-03-05', NULL)

Input in table form

CreditLogId | Credits | OccurredOn | GivenByUserCode
------------+---------+------------+----------------
          1 |      10 | 2018-01-01 |            BM01
          2 |      10 | 2018-01-01 |            BM01
          3 |     -10 | 2018-01-02 |            NULL
          4 |      -5 | 2018-01-04 |            NULL
          5 |       5 | 2018-02-01 |            SP99
          6 |      40 | 2018-02-02 |            BM02
          7 |     -40 | 2018-02-03 |            NULL
          8 |      -4 | 2018-03-05 |            NULL

Expected output

SELECT *
FROM (VALUES
     (3, '2018-01-02', 10, 'BM01')
    ,(4, '2018-01-04', 5, 'BM01')
    ,(7, '2018-02-03', 5, 'BM01')
    ,(7, '2018-02-03', 5, 'SP99')
    ,(7, '2018-02-03', 30, 'BM02')
    ,(8, '2018-03-05', 4, 'BM02')
) expectedOut (CreditLogId, OccurredOn, Credits, GivenByUserCode)

Produces output

CreditLogId | Occurred on | Credits | GivenByUserCode
------------+-------------+---------+----------------
          3 |  2018-01-02 |      10 |            BM01
          4 |  2018-01-04 |       5 |            BM01
          7 |  2018-02-03 |       5 |            BM01
          7 |  2018-02-03 |       5 |            SP99
          7 |  2018-02-03 |      30 |            BM02
          8 |  2018-03-05 |       4 |            BM02

Code so far

It's not much and I am not sure where to go from here.

WITH totals AS (
    SELECT CreditLogId, OccurredOn, credits, sum(credits) OVER(ORDER BY OccurredOn) AS TotalSpent
    FROM @CreditLogs
    WHERE Credits < 0
)
SELECT *
FROM totals

Additional clarification

The expected output is for each spent credit amount where those credits came from. Credits are spent in on a first in first out (FIFO) basis. Here an explanation of each value in the sample output in the hope that this clarifies the desired output.

  • For the spending of 10 credits (credit log id 3) can be traced back to an addition from credit log id 1
  • For the spending of 5 credits (credit log id 4) can be traced back to an addition from credit log id 2 (as credit log id 1 was "used up")
  • For the spending of 40 credits in credit log id 7 can be traced back to
    • Remainder of addition from credit log id 2, 5 credits
    • Credit log id 5 (addition of 5)
    • Credit log id 6 (addition of 40 so 10 remaining)
  • For the spending of 4 credits in credit log 8 the balance of credit log id 6 is used

Note that a total balance of 6 credits remains, the balance does not have to zero out but will never be in the negative as users can only spend what they have.

Igor
  • 60,821
  • 10
  • 100
  • 175
  • 1
    your expected output does not match with the input data, can you take a look if it's correct or explain how you arrived at the output? – Pankaj Gadge Apr 24 '18 at 18:16
  • @PankajGadge - The expected output is for each spent credit amount where those credits came from. Credits are spent in on a first in first out basis. I will update the question to try to make it more clear. – Igor Apr 24 '18 at 18:18
  • 1
    @PankajGadge - Please see the `Additional clarification` added at the end. Let me know if this still raises questions. – Igor Apr 24 '18 at 18:28
  • How many rows are in this table @Igor? I'm having a hard time not thinking of a RBAR method, but if the data is small it may not matter. – S3S Apr 24 '18 at 18:39
  • 1
    @scsimon - We anticipate growth of 4 to 5 million records per year in this table (*the table/history is never purged*). – Igor Apr 24 '18 at 18:43
  • Do you absolutely require a query that gets the information you need using this table format or are you willing to change the way you are storing the credits / add additional metadata? – Kevin Aud Apr 26 '18 at 23:25
  • Assuming this is actually done per-customer/user/whatever: How many rows per customer/user/whatever exist in average/max? – dnoeth Apr 29 '18 at 08:14
  • @dnoeth - It's done per order, not per customer. Maybe 60% of the time each order will have a 1=1 relationship with credits that were added, the rest of the time it is split over 2 or more additions. – Igor May 01 '18 at 10:26
  • 1
    Then Adam Silenko's answer is probably the best you can get as there's an additional join-condition based on equality and a small number of rows per order. – dnoeth May 01 '18 at 11:51

2 Answers2

10

try this:

WITH Credits_added AS (
    SELECT CreditLogId, OccurredOn, credits
    , SUM(credits) OVER (ORDER BY CreditLogId) - credits AS b --before
    , SUM(credits) OVER (ORDER BY CreditLogId) AS a --after
    , GivenByUserCode
    FROM @CreditLogs
    WHERE Credits > 0)
, Credits_spent AS (
    SELECT CreditLogId, OccurredOn, credits
    , SUM(credits) OVER (ORDER BY CreditLogId) * -1 + credits AS b
    , SUM(credits) OVER (ORDER BY CreditLogId) * -1 AS a
    FROM @CreditLogs
    WHERE Credits < 0)
SELECT s.CreditLogId, s.OccurredOn
, CASE WHEN a.a > s.a THEN s.a ELSE a.a END - CASE WHEN a.b > s.b THEN a.b ELSE s.b END AS Credits 
, a.GivenByUserCode
FROM Credits_added AS a
INNER JOIN Credits_spent AS s ON a.a > s.b AND s.a > a.b
uzi
  • 4,118
  • 1
  • 15
  • 22
Adam Silenko
  • 3,025
  • 1
  • 14
  • 30
  • Fantastic answer! I will mark it as such for now, I still have to vet it with some real data at the office next week. Thank you very much! – Igor Apr 27 '18 at 10:43
  • If you wana save output, and faster generate next report, then you can add to output select `, a.CreditLogId`, and use output to set used credits per row – Adam Silenko Apr 27 '18 at 16:00
  • 2
    `(ORDER BY CreditLogId)` should be changed to `(ORDER BY CreditLogId ROWS UNBOUNDED PRECEDING)`, otherwise it defaults to `RANGE UNBOUNDED PRECEDING` which tends to be less efficient (maybe not in this case when the `CreditLogId` is known to be unique to the optimizer). – dnoeth Apr 29 '18 at 17:48
6

since you mentioned there will be 4 to 5 million records per year, even if this can be done by query it will be an slow query.

I would suggest to have another table like creditSpent which contains (PurchaseCreditLogId , additionCreditLogId, Amount)

And in the time of inserting purchases, find all records, calculate the amount that should be reduced from each one and store that information in that table

Then when you are running your report you can do a simple query on this table

Reza
  • 18,865
  • 13
  • 88
  • 163