How can I create a Running Total Sumifs-like function in SQL?

Question

I'm pretty new to SQL, but Excel has become far too slow to continue working with, so I'm trying SQLiteStudio. I'm looking to create a column in a query showing running total over time (characterized as Schedule Points, marking each percent through a project's run time). Complete marks whether a Location has completed the install (Y/NULL), and is used simply to filter out incomplete locations from further calculations.

I currently have

With cte as(
Select [Location]
        ,[HW/NonHW]
        ,[Obligation/Actual]
        ,[Schedule Point]
        ,[CY20$]
        ,[Vendor Name]
        ,[Vendor Zip Code]
        ,[Complete]
        ,[System Rollup (Import)]
        ,IIf([Complete] = "Y", [CY20$], 0) As [Completed Costs]

FROM data)
Select [Location]
        ,[HW/NonHW]
        ,[Obligation/Actual]
        ,[Schedule Point]
        ,[CY20$]
        ,[Vendor Name]
        ,[Vendor Zip Code]
        ,[Complete]
        ,[System Rollup (Import)]
        ,[Completed Costs]
        ,SUM([Completed Costs]) OVER (PARTITION BY [Obligation/Actual], [Normalized Schedule Location 1%],[System Rollup (Import)], [HW/NonHW]) As [CY20$ Summed]

 FROM cte

At this point, what I'm looking to do is a sum not for each Schedule Point, but all prior Schedule Points (i.e. the <= operator in an Excel sumifs statement)

For reference, here is the sumifs I am trying to replicate:

=SUMIFS($N$2:$N$541790,$AU$2:$AU$541790,"Y",$AQ$2:$AQ$541790,AQ2,$AI$2:$AI$541790,AI2,$AH$2:$AH$541790,AH2,$AJ$2:$AJ$541790, "<=" & AJ2)

N is CY20$, AU is Complete, AQ is System, AI is Obligation/Actual, AH is HW/NonHW, AJ is Schedule Point.

Any help would be appreciated!

Lajos Arpad · Answer 1 · 2022-04-13T12:51:44.470

0

The equivalent to SUMIFS is a combination of SUM and CASE-WHEN in SQL.

Abstract example:

SELECT
    SUM(
        CASE
            WHEN <condition1> AND <condition2> AND <condition3> THEN 1
            ELSE 0
        END
    )
FROM yourtable

In the above, condition1, condition2 and condition3 are logical expressions, the < and > are just notifying that you have an expression there, it is not part of the syntax. It is also unnecessary to have exactly 3 conditions, you can have as many as you like. Also, it is unnecessary to use AND as the operator, you can construct your own expression as you like. The reason for which I have used the AND operator was that you intend to have a disjunction, presumably, based on the fact that you used SUMIFS.

A more concrete example:

CREATE TABLE person(
    number int,
    name   text,
    age    int
);

INSERT INTO person(number, name, age)
VALUES(1, 'Joe', 12);
INSERT INTO person(number, name, age)
VALUES(2, 'Jane' 12);
INSERT INTO person(number, name, age)
VALUES(3, 'Robert', 16);
INSERT INTO person(number, name, age)
VALUES(4, 'Roberta', 15);
INSERT INTO person(number, name, age)
VALUES(5, 'Blian', 18);
INSERT INTO person(number, name, age)
VALUES(6, 'Bigusdqs', 19);

SELECT 
    SUM(
        CASE
            WHEN age <= 16 AND name <> 'Joe' THEN 1
            ELSE 0
        END
    ) AS MySUMIFS
FROM person;

EDIT

If we are interested to know how many people have a smaller age than the current person, then we can do a join:

SELECT 
    SUM(
        CASE
            WHEN p2.age <= p1.age THEN 1
            ELSE 0
        END
    ) AS MySUMIFS, name
FROM person p1
JOIN person p2
ON p1.name <> p2.name
GROUP BY p1.name;

EDIT2

Created a Fiddle based on the ideas described above, you can reach it at https://dbfiddle.uk/?rdbms=sqlite_3.27&fiddle=3cb0232e5d669071a3aa5bb1df68dbca

The code in the fiddle:

CREATE TABLE person(
    number int,
    name   text,
    age    int
);

INSERT INTO person(number, name, age)
VALUES(1, 'Joe', 12);
INSERT INTO person(number, name, age)
VALUES(2, 'Jane' 12);
INSERT INTO person(number, name, age)
VALUES(3, 'Robert', 16);
INSERT INTO person(number, name, age)
VALUES(4, 'Roberta', 15);
INSERT INTO person(number, name, age)
VALUES(5, 'Blian', 18);
INSERT INTO person(number, name, age)
VALUES(6, 'Bigusdqs', 19);
SELECT 
    SUM(
        CASE
            WHEN p2.age <= p1.age THEN 1
            ELSE 0
        END
    ) AS MySUMIFS, p1.name
FROM person p1
JOIN person p2
ON p1.name <> p2.name
GROUP BY p1.name;

edited Apr 13 '22 at 12:51

answered Apr 12 '22 at 17:06

Lajos Arpad

64,414
37
100
175

I originally looked into using SUM and CASE-WHEN, but I wasn't able to figure out a way to make it relate to the values in each line. For instance, in your above example, rather than summing when age <= 16, I'd like to sum when age is less than or equal to whatever value age is for that given line. So for Joe, it'd be age <=12, but for Robert, it'd be age <=16, and for Blian, age <= 19. Is there a way to make that dynamic (is this the right word?) so that it changes for each line? – Alex M Apr 12 '22 at 17:38
@AlexM yes. Let me edit my answer. – Lajos Arpad Apr 12 '22 at 17:57
@AlexM edited my answer with a new query, where I join `person` with `person`. Since it's a self-join, I create an alias for both, named `p1` and `p2` respectively. The join condition is that the name is different, so I'm looking for the number of younger or equally young persons than the current person, exluding the current person. – Lajos Arpad Apr 12 '22 at 18:01
So this is closer, but I haven't been able to get it to commit. I want to say thank you for all the help you've provided so far. I ran up against the character limit, so I created a little visual and put it on imgur. https://imgur.com/a/RoSBx82 Again, thank you for helping so far, and if you have any recommendations or solutions, I really appreciate the help. – Alex M Apr 12 '22 at 19:52
@AlexM 1. You couldn't get it to commit. Do you have errors? If so, please edit the question and add the details. 2. You need every line. That's perfectly understandable, but we are pairing every person to every other person and without the aggregation we would get pairs like p1-p2, p1-p3, ..., p2-p1, p2-3..., ... To have a single line for each person and to have a line for every person, we need aggregation. Without the aggregation we have a 2-dimensional result, having n*(n-1) lines instead of n lines. With the aggregation we will only have n lines. If you would edit your question with – Lajos Arpad Apr 13 '22 at 08:30
@AlexM the structure of the table and your exact needs, I could help you further. Unfortunately I do not have an excel file to test your formula, so it would be time-consuming to build an excel file just to understand your formula. 3. You can add as many criteria as you need, the example was designed to be as simple as possible. 4. p1.name <> p2.name excludes pairs of p1-p1. You will still have a line for p1, but it will not be paired with itself. We can change the `ON` to `ON 1=1` to make sure that we pair each line with itself as well. 5. It is possible overall, but it is unclear to me – Lajos Arpad Apr 13 '22 at 08:35
@AlexM what the exact needs are. So, I would advise you to a.) List all the field names you have. b.) Tell me what the table name is. Is it `data`? 3. List the result fields you would like to have and the formula (mathematics or plain words are preferred) you need to calculate them. If you provide these details as an edit to your question, then it will be easier to implement a query that is closer to your needs. – Lajos Arpad Apr 13 '22 at 08:37
@AlexM tried my suggestion in a Fiddle. Apparently SQLite does not support the syntax of `insert(...)values(...),(...)...(...);`, so this probably caused difficulties to you while you were testing. Created a Fiddle for you where you can test the idea: https://dbfiddle.uk/?rdbms=sqlite_3.27&fiddle=3cb0232e5d669071a3aa5bb1df68dbca, please do not edit this Fiddle if you are to create a Fiddle and create a new Fiddle instead, so we can refer to this one without being worried about changes on it – Lajos Arpad Apr 13 '22 at 12:43

How can I create a Running Total Sumifs-like function in SQL?

1 Answers1