I have a table t with:
PLACE | LOCATION | TS | ID | AMOUNT | GOING_IN | GOING_OUT |
---|---|---|---|---|---|---|
1 | 10 | 2020-10-01 | 1 | 100 | 10 | 0 |
1 | 10 | 2020-10-02 | 1 | 110 | 5 | -50 |
1 | 10 | 2020-10-03 | 1 | 75 | 0 | -100 |
1 | 10 | 2020-10-04 | 1 | -25 | 30 | 0 |
1 | 10 | 2020-10-05 | 1 | 5 | 0 | 0 |
1 | 10 | 2020-10-06 | 1 | 5 | 38 | -300 |
1 | 10 | 2020-10-07 | 1 | -257 | 0 | 0 |
1 | 10 | 2020-10-01 | 2 | 1 | 10 | 0 |
1 | 10 | 2020-10-02 | 2 | 11 | 0 | -12 |
1 | 10 | 2020-10-03 | 2 | -1 | 0 | -100 |
1 | 10 | 2020-10-04 | 2 | -101 | 0 | 0 |
2 | 20 | 2020-11-15 | 1 | 18 | 20 | 0 |
2 | 20 | 2020-11-16 | 1 | 38 | 0 | 0 |
2 | 20 | 2020-11-15 | 3 | -9 | 20 | -31 |
2 | 20 | 2020-11-16 | 3 | -20 | 0 | 0 |
So due to SAP legacy stuff some logistic data is mangled which may lead to negative inventory.
To check how severe the error is I need to count for each PLACE, LOCATION, ID
- the number of rows that have a positive AMOUNT AND which do not have a negative AMOUNT before
- the number of rows that have a negative AMOUNT AND any positive AMOUNT that has a negative AMOUNT anywhere before
As you can see in my table there are (for PLACE=1, LOCATION=10, ID=1) 3 rows with a positive AMOUNT without any negative AMOUNT before. But then there is a negative AMOUNT and some positive AMOUNTS afterwards --> those 4 rows should not be counted for COUNT_CORRECT but should count for COUNT_WRONG.
So in this example table my query should return:
PLACE | LOCATION | TOTAL | COUNT_CORRECT | COUNT_WRONG | RATIO |
---|---|---|---|---|---|
1 | 10 | 11 | 5 | 6 | 0.55 |
2 | 20 | 4 | 2 | 2 | 0.5 |
My code so far:
CREATE OR REPLACE TABLE ANALYTICS.t (
PLACE INT NOT NULL
, LOCATION INT NOT NULL
, TS DATE NOT NULL
, ID INT NOT NULL
, AMOUNT INT NOT NULL
, GOING_IN INT NOT NULL
, GOING_OUT INT NOT NULL
, PRIMARY KEY(PLACE, LOCATION, ID, TS)
);
INSERT INTO ANALYTICS.t
(PLACE, LOCATION, TS, ID, AMOUNT, GOING_IN, GOING_OUT)
VALUES
(1, 10, '2020-10-01', 1, 100, 10, 0)
, (1, 10, '2020-10-02', 1, 110, 5, -50)
, (1, 10, '2020-10-03', 1, 75, 0, -100)
, (1, 10, '2020-10-04', 1, -25, 30, 0)
, (1, 10, '2020-10-05', 1, 5, 0, 0)
, (1, 10, '2020-10-06', 1, 5, 38, 300)
, (1, 10, '2020-10-07', 1, -257, 0, 0)
, (1, 10, '2020-10-04', 2, 1, 10, 0)
, (1, 10, '2020-10-05', 2, 11, 0, -12)
, (1, 10, '2020-10-06', 2, -1, 0, -100)
, (1, 10, '2020-10-07', 2, -101, 0, 0)
, (2, 20, '2020-11-15', 1, 18, 12, 0)
, (2, 20, '2020-11-16', 1, 30, 0, 0)
, (2, 20, '2020-11-15', 3, -9, 20, -31)
, (2, 20, '2020-11-16', 3, -20, 0, 0)
;
Then
SELECT PLACE
, LOCATION
, SUM(CASE WHEN AMOUNT >= 0 THEN 1 ELSE 0 END) AS 'COUNT_CORRECT'
, SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) AS 'COUNT_WRONG'
, ROUND((SUM(CASE WHEN AMOUNT < 0 THEN 1 ELSE 0 END) / COUNT(AMOUNT)) * 100, 2) AS 'ratio'
FROM t
GROUP BY PLACE, LOCATION
ORDER BY PLACE, LOCATION
;
But I don't know how I can filter for "AND which do not have a negative AMOUNT before" and counting by PLACE, LOCATION, ID as an intermediate step. Any help appreciated.