2

I'm writing a small budget app for myself, and am a bit stuck with a query.

It's a query to calculate the monthly budget totals, per month, per category.

On each month, you can set a rule against the category to either affect the main budget buffer, or be confined to that category only.

How it works is:

  • If it affects buffer, any money you overspend, resets each month for that category.
  • If it affects buffer, any money you save, is stored against that category for later use.
  • If it's confined, then it always (even if you overspend) keeps counting it in the next month, until you set it back to affects buffer.

    Note: The last negative amount should be counted in the first occurance of affectsbuffer, so that you can pay off any negative amount for that month. So the negative stops getting carried over AFTER the first affects buffer month.

Here is my sample data, along with a Balance Should Be column to show how it is meant to work.

CREATE TABLE [BGT].[BudgetTemp2] ( [Month] date, [CategoryID] int, [Budgeted] money, [Outflows] money, [BudgetedAndOutflows] money, [OverspendingHandling] nvarchar(50), [BalanceShouldBe] money )
INSERT INTO [BGT].[BudgetTemp2]
VALUES
( N'2016-01-01T00:00:00', 116, 0.0000, -500.0000, -500.0000, N'AffectsBuffer', -500.0000 ), 
( N'2016-02-01T00:00:00', 116, 50.0000, -200.0000, -150.0000, N'AffectsBuffer', -150.0000 ), 
( N'2016-03-01T00:00:00', 116, 50.0000, 0.0000, 50.0000, N'AffectsBuffer', 50.0000 ), 
( N'2016-04-01T00:00:00', 116, 0.0000, -350.0000, -350.0000, N'AffectsBuffer', -300.0000 ), 
( N'2016-05-01T00:00:00', 116, 100.0000, 0.0000, 100.0000, N'AffectsBuffer', 100.0000 ), 
( N'2016-06-01T00:00:00', 116, 0.0000, 10699.8900, 10699.8900, N'AffectsBuffer', 10799.8900 ), 
( N'2016-07-01T00:00:00', 116, 4147.8800, -16707.6900, -12559.8100, N'Confined', -1759.9200 ), 
( N'2016-08-01T00:00:00', 116, 0.0000, -4504.9600, -4504.9600, N'Confined', -6264.8800 ), 
( N'2016-09-01T00:00:00', 116, 0.0000, -5486.5400, -5486.5400, N'Confined', -11751.4200 ), 
( N'2016-10-01T00:00:00', 116, 0.0000, -3795.7700, -3795.7700, N'Confined', -15547.1900 ), 
( N'2016-11-01T00:00:00', 116, 0.0000, 407.3200, 407.3200, N'Confined', -15139.8700 ), 
( N'2016-12-01T00:00:00', 116, 298324.5900, -282434.7200, 15889.8700, N'Confined', 750.0000 ), 
( N'2017-01-01T00:00:00', 116, 4196.4400, -4196.4400, 0.0000, N'Confined', 750.0000 ), 
( N'2017-02-01T00:00:00', 116, 10999.9000, -15199.9000, -4200.0000, N'Confined', -3450.0000 ), 
( N'2017-03-01T00:00:00', 116, 4987.6600, -2875.1800, 2112.4800, N'Confined', -1337.5200 ), 
( N'2017-04-01T00:00:00', 116, 4899.1600, -65100.0000, -60200.8400, N'Confined', -61538.3600 ), 
( N'2017-05-01T00:00:00', 116, 504.3200, 0.0000, 504.3200, N'Confined', -61034.0400 ), 
( N'2017-06-01T00:00:00', 116, 0.0000, -104505.0300, -104505.0300, N'Confined', -165539.0700 ), 
( N'2017-07-01T00:00:00', 116, 0.0000, -72317.7100, -72317.7100, N'Confined', -237856.7800 ), 
( N'2017-08-01T00:00:00', 116, 0.0000, -82.2200, -82.2200, N'Confined', -237939.0000 ), 
( N'2017-09-01T00:00:00', 116, 237916.0900, -814.4600, 237101.6300, N'Confined', -837.3700 ), 
( N'2017-10-01T00:00:00', 116, 906.8300, -1523.5500, -616.7200, N'Confined', -1454.0900 ), 
( N'2017-11-01T00:00:00', 116, 175.6100, -3348.5500, -3172.9400, N'Confined', -4627.0300 ), 
( N'2017-12-01T00:00:00', 116, -14.4400, -1763.4400, -1777.8800, N'AffectsBuffer', -6404.9100 ), 
( N'2018-01-01T00:00:00', 116, 40.0000, -20.0000, 20.0000, N'AffectsBuffer', 20.0000 ), 
( N'2018-02-01T00:00:00', 116, 0.0000, -75.4300, -75.4300, N'AffectsBuffer', -55.4300 ), 
( N'2018-04-01T00:00:00', 116, 4899.7400, -4899.7400, 0.0000, N'AffectsBuffer', 0.0000 ), 
( N'2018-05-01T00:00:00', 116, 750.3900, -750.3900, 0.0000, N'AffectsBuffer', 0.0000 ), 
( N'2018-06-01T00:00:00', 116, 0.0000, -500.0000, -500.0000, N'Confined', -500.0000 ), 
( N'2018-07-01T00:00:00', 116, 100.0000, 0.0000, 0.0000, N'Confined', -400.0000 ), 
( N'2018-08-01T00:00:00', 116, 200.0000, -100.0000, 100.0000, N'Confined', -300.0000 ), 
( N'2018-09-01T00:00:00', 116, 0.0000, 0.0000, 0.0000, N'AffectsBuffer', -300.0000 ), 
( N'2018-10-01T00:00:00', 116, 100.0000, -50.0000, 50.0000, N'AffectsBuffer', 50.0000 ), 
( N'2018-11-01T00:00:00', 116, 0.0000, -500.0000, -500.0000, N'AffectsBuffer', -450.0000 ), 
( N'2018-12-01T00:00:00', 116, 100.0000, -50.0000, 50.0000, N'AffectsBuffer', 50.0000 ), 
( N'2019-01-01T00:00:00', 116, 0.0000, 0.0000, 0.0000, N'AffectsBuffer', 50.0000 ), 
( N'2019-02-01T00:00:00', 116, 100.0000, 0.0000, 100.0000, N'Confined', 150.0000 ), 
( N'2019-03-01T00:00:00', 116, 0.0000, -50.0000, -50.0000, N'Confined', 100.0000 ), 
( N'2019-04-01T00:00:00', 116, 0.0000, -200.0000, -200.0000, N'Confined', -100.0000 ), 
( N'2019-05-01T00:00:00', 116, 0.0000, -200.0000, -200.0000, N'Confined', -300.0000 ), 
( N'2019-06-01T00:00:00', 116, 0.0000, 0.0000, 0.0000, N'AffectsBuffer', -300.0000 ), 
( N'2019-07-01T00:00:00', 116, 0.0000, 0.0000, 0.0000, N'AffectsBuffer', 0.0000 )

--DROP TABLE [BGT].[BudgetTemp2]

Here is my query so far, but as you may see, it ended up in a bit of a mess. I feel that I have everything in here to be able to do it correctly, I'm just missing a few key things. It's counting correctly until I have the changes between Affects Buffer, and Confined.

-- FROM: https://stackoverflow.com/a/23020788
;WITH c1 AS (
    SELECT
        *,
        LAG([BudgetTemp].[OverspendingHandling], 1, [BudgetTemp].[OverspendingHandling]) 
            OVER (PARTITION BY [BudgetTemp].[CategoryID] ORDER BY [BudgetTemp].[Month]) AS PrevOverspendingHandling
    FROM [BGT].[BudgetTemp2] [BudgetTemp]
), c2 AS (
    SELECT
        *,
        SUM(CASE WHEN [c1].[OverspendingHandling] <> [c1].[PrevOverspendingHandling] THEN 1 ELSE 0 END)
            OVER (PARTITION BY [c1].[CategoryID] ORDER BY [c1].[Month]) AS Ranker
    FROM [c1]
), c3 AS (
    SELECT
        *,
        SUM([c2].[BudgetedAndOutflows]) OVER (PARTITION BY [c2].[CategoryID] ORDER BY [c2].[Month] ROWS UNBOUNDED PRECEDING) AS rt,
        SUM([c2].[BudgetedAndOutflows]) OVER (PARTITION BY [c2].[CategoryID], [c2].[Ranker] ORDER BY [c2].[Month] ROWS UNBOUNDED PRECEDING) AS rt2
    FROM [c2]
), c4 AS (
    SELECT
        *,
        MIN(rt) OVER (PARTITION BY [c3].[CategoryID] ORDER BY [c3].[Month] ROWS UNBOUNDED PRECEDING) AS rt_min,
        MIN(rt) OVER (PARTITION BY [c3].[CategoryID], [c3].[Ranker] ORDER BY [c3].[Month] ROWS UNBOUNDED PRECEDING) AS rt2_min
    FROM [c3]
), c5 AS (
    SELECT
        *,
        -- WE WANT TO LAG MIN_CUR BY 1 SO THAT WE STILL GET A SINGLE NEGATIVE FOR THAT MONTH, BUT IT RESETS THE NEXT MONTH
        LAG(rt_min, 1, 0) OVER (PARTITION BY [c4].[CategoryID] ORDER BY [c4].[Month]) AS rt_min_lag,
        LAG(rt2_min, 1, 0) OVER (PARTITION BY [c4].[CategoryID], [c4].[Ranker] ORDER BY [c4].[Month]) AS rt2_min_lag
    FROM [c4]
)
SELECT
    [Month],
    [CategoryID],
    [Budgeted],
    [Outflows],
    [BudgetedAndOutflows],
    [OverspendingHandling],
    [PrevOverspendingHandling],
    [Ranker],
    [rt],
    [rt_min],
    [rt_min_lag],
    [rt2],
    [rt2_min],
    [rt2_min_lag],
    [rt] + (CASE WHEN [rt_min_lag] < 0 THEN -[rt_min_lag] ELSE 0 END) AS Balance1,
    [rt2] + (CASE WHEN [rt2_min_lag] < 0 THEN -[rt2_min_lag] ELSE 0 END) AS Balance2,
    [BalanceShouldBe]
FROM [c5]
ORDER BY
    [CategoryID],
    [Month]

Any help would be greatly appreciated!

Cheers,

EDIT: @dfundako found a couple of incorrect data entries, and I have updated the following:

  • 2018-01-01 Updated BalanceShouldBe FROM -6384.91 TO 20.00
  • 2018-02-01 Updated BalanceShouldBe FROM -75.43 TO -55.43

I have updated this in the main insert script also.

Albert
  • 1,062
  • 2
  • 13
  • 20
  • What's your expect result? – D-Shih Oct 04 '18 at 19:48
  • As I mentioned above, there is a column on the sample table called 'BalanceShouldBe' which is the column I'm trying to calculate. – Albert Oct 04 '18 at 19:51
  • I'm confused are there two tables BudgetTemp and BudgetTem2? – MikeAinOz Oct 05 '18 at 00:13
  • Hi @MikeAinOz, sorry to confuse you, I had a few [BGT].[BudgetTempXXX] tables that contains test data, so I just aliased it to [BudgetTemp] so I can switch sample tables without having to change the entire query each time. So to be clear, this query just runs of that single sample table I posted above. – Albert Oct 05 '18 at 01:18
  • Ok, I like what you're doing with LAG. My understanding is that the 'confined' category just runs a normal running total and that the 'affects buffer' category only brings forward positive values? – MikeAinOz Oct 06 '18 at 23:31
  • Hi Mike, That is correct. Affects buffer brings forward positive totals, but still shows a negative if it went negative for that month only. – Albert Oct 08 '18 at 00:02
  • @Albert For 2017-12-01, why would that not reset based on your third bullet point? For 2017-11-01, it would be -4627.03 and for 2017-12-01, it would be -1777.88 because it went back to affectsbuffer. – dfundako Oct 09 '18 at 19:29
  • @Albert Also, between 2017-12-01 and 2018-01-01, why is that not similar to 2016-02-01 and 2016-03-01? Both start with a negative bugdetandoutflow followed by a positive, yet one reset and the other doesn't. – dfundako Oct 09 '18 at 19:42
  • Hi @dfundako, For your first comment, it always shows that last negative from the confined in the affectsbuffer month, so that you can pay off the last remaining negative amount that month. So it resets after that month, not before that month. As for your second comment, I made a mistake, and I will update the data. It should be as follow: -- 2018-01-01 Updated BalanceShouldBe FROM -6384.91 TO 20.00 -- 2018-02-01 Updated BalanceShouldBe FROM -75.43 TO -55.43 I will updated the main question accordingly. – Albert Oct 09 '18 at 20:34

1 Answers1

1

You need to reset the running total in some cases based on the running total since the last reset.

This isn't something window functions are designed for.

The best performing method will likely be to use CLR to process the stream in order and output the running total with reset logic as needed (similar to this though you would need to add the logic to reset the totals). If the data is small you can use a recursive CTE.

WITH T
     AS (SELECT *,
                ROW_NUMBER()
                  OVER (
                    PARTITION BY CategoryID
                    ORDER BY Month) AS RN
         FROM   BGT.BudgetTemp2),
     R
     AS (SELECT *,
                Budgeted + Outflows AS BalanceCalculated
         FROM   T
         WHERE  RN = 1
         UNION ALL
         SELECT T.Month,
                T.CategoryID,
                T.Budgeted,
                T.Outflows,
                T.BudgetedAndOutflows,
                T.OverspendingHandling,
                T.BalanceShouldBe,
                T.RN,
                T.Budgeted + T.Outflows
                + IIF(R.OverspendingHandling = 'AffectsBuffer' AND R.BalanceCalculated < 0, 0, R.BalanceCalculated)
         FROM   T
                JOIN R
                  ON R.CategoryID = T.CategoryID
                     AND T.RN = R.RN + 1)
SELECT *
FROM   R; 

The above uses row number to find the next row per CategoryID. If all of the dates are guaranteed to be the first of the month and with no missing months you could use a join predicate of R.CategoryID = T.CategoryID AND T.Month = DATEADD(1, MONTH, R.Month) instead. Which will be more efficient - especially if you have a supporting index for this.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • Thanks for the answer, appreciate the effort. Definitely helped. Will have to see how performance goes when running it for multiple categories etc. I'm thinking that I may use this to populate some form of crunch table, which will be updated on saving data, instead of when trying to render the page. – Albert Oct 10 '18 at 19:58
  • @Albert - If you are able to use the `R.CategoryID = T.CategoryID AND T.Month = DATEADD(1, MONTH, R.Month)` approach and have a covering index with leading columns `Month, CategoryID` (or reverse) so the recursive part is able to perform a seek it should be reasonable performance – Martin Smith Oct 10 '18 at 20:01