2

I am trying to calculate totals for parent nodes of a tree structure, and for some reason, calculating the parent node(s) total value is eluding me.

Assume I have three tables

Regions

GroupID     ParentID      Name
1           null          NorthAmerica
2           null          Asia
3           null          Europe
4           1             NorthEast
5           1             WestCoast
6           3             UK
7           3             Germany
8           2             Hong Kong
9           2             Japan

RegionMember

GroupID     EmpID   
4           10000   
4           10001   
5           10011   
6           20455   
6           10003   
7           34567   
9           43589   
9           54890   
8           84320   
8           84560   

EmployeeSales

EmployeeID     Name     Sales ($)
10000          Joe      $ 150,000.00 
10001          Mary     $ 200,000.00 
10011          John     $ 175,000.00 
20455          Fred     $ 100,000.00 
10003          Bill     $ 250,000.00 
34567          Abe      $ 142,000.00 
43589          Jack     $ 260,000.00 
54890          Amanda   $ 300,000.00 
84320          Jane     $ 15,000.00 
84560          Oscar    $ 175,000.00 

The goal is to query for different levels within the tree and see the totals for those regions.
For example, one view would show top regions with sales totals:

NorthAmerica    525,000.00 *(The sum of NorthEast and WestCoast)*
Asia            750,000.00 *(The sum of Hong Kong and Japan)*
Europe          492,000.00 *(The sum of UK and Germany)*

Another view would show regional totals (focused for a single parent):

NorthAmerica    525,000.00 *(Total of the region members NorthEast and WestCoast)*
Northeast       350,000.00 *(Total of NorthEast Leaves Joe and Mary)*
WestCoast       175,000.00 *(Total of WestCoast Leaves John)*

Of course, these trees could go much deeper in terms of branches, but I think the example illustrates the problem I’m fighting.

So far, using CTE’s I can fairly easily navigate the tree structures, and I can get totals for the final branches (or leaves of the branch), but I can’t seem to get the rolled-up totals to come through.

So from the above example, I can get the following output:

NorthAmerica    NULL
NorthEast       350,000.00
WestCoast       175,000.00

I would provide the existing code, but the actual tables and the number of joins are much different in my actual table, and might just confuse the overall goal. However, this question is similar to what I am looking to accomplish, but it doesn't seem to totally fit the fill:

CTE Sum of Child Levels

Any help is greatly appreciated.

Build Scripts Follow:

create table Regions 
(
GroupID int, 
ParentID int,
Name Varchar(40)
)

create table RegionMember
(
GroupID int,
empid int
)

Create Table EmployeeSales
(
EmployeeID int,
Name Varchar(50),
Sales float,
)

Insert into Regions Values
(1, null, 'NorthAmerica'),
(2, null, 'Asia'),
(3, null, 'Europe'),
(4, 1, 'NorthEast'),
(5, 1, 'WestCoast'),
(6, 3, 'UK'),
(7, 3, 'Germany'),
(8, 2, 'Hong Kong'),
(9, 2, 'Japan');

Insert into RegionMember Values
(4, 10000),   
(4, 10001),   
(5, 10011),  
(6, 20455),   
(6, 10003),
(7, 34567),   
(9, 43589),   
(9, 54890),   
(8, 84320),   
(8, 84560);

Insert into EmployeeSales Values
(10000, 'Joe', 150000), 
(10001, 'Mary', 200000), 
(10011, 'John', 175000), 
(20455, 'Fred', 100000),
(10003, 'Bill', 250000),
(34567, 'Abe', 142000), 
(43589, 'Jack', 260000), 
(54890, 'Amanda', 300000), 
(84320, 'Jane', 15000), 
(84560, 'Oscar', 175000); 

Also have started a SQL Fiddle with the above: http://sqlfiddle.com/#!6/4ee0c/1

Community
  • 1
  • 1
Buckwheattb
  • 181
  • 1
  • 2
  • 12

1 Answers1

2

I added few rows to the sample data, because original is too simple. This has three levels.

Insert into Regions Values
(10, null, 'A1'),
(40, 10, 'B1'),
(50, 10, 'B2'),
(60, 10, 'B3'),
(70, 40, 'C1'),
(80, 40, 'C2');

Insert into RegionMember Values
(40, 104),
(50, 105),
(60, 106),
(70, 107),
(80, 108);

Insert into EmployeeSales Values
(104, '104', 104),
(105, '105', 105),
(106, '106', 106),
(107, '107', 107),
(108, '108', 108);

Top regions

This query is straight-forward recursive CTE, which starts from the highest level (WHERE ParentID IS NULL) and sums up all of its children. The "trick" here is to include the original StartID and StartName of the group as we traverse the tree, so we could GROUP BY them at the end.

WITH
CTE
AS
(
    SELECT
        Regions.GroupID AS StartID
        ,Regions.Name AS StartName
        ,Regions.GroupID
        ,Regions.ParentID
        ,Regions.Name
        ,1 AS Lvl
    FROM Regions
    WHERE ParentID IS NULL

    UNION ALL

    SELECT
        CTE.StartID
        ,CTE.StartName
        ,Regions.GroupID
        ,Regions.ParentID
        ,Regions.Name
        ,CTE.Lvl + 1 AS Lvl
    FROM
        Regions
        INNER JOIN CTE ON CTE.GroupID = Regions.ParentID
)
SELECT
    CTE.StartID
    ,CTE.StartName
    ,SUM(EmployeeSales.Sales) AS SumSales
FROM
    CTE
    INNER JOIN RegionMember ON RegionMember.GroupID = CTE.GroupID
    INNER JOIN EmployeeSales ON EmployeeSales.EmployeeID = RegionMember.empid
GROUP BY
    CTE.StartID
    ,CTE.StartName
ORDER BY
    CTE.StartID;

Run the query step-by-step to understand how it works.

Result

+---------+--------------+----------+
| StartID |  StartName   | SumSales |
+---------+--------------+----------+
|       1 | NorthAmerica |   525000 |
|       2 | Asia         |   750000 |
|       3 | Europe       |   492000 |
|      10 | A1           |      530 |
+---------+--------------+----------+

Regional totals and subtotals

The second query is not so easy. The first part CTE_Groups is very similar to the previous query, but with the filter for specific starting GroupID. CTE_Sums calculates summary of sales for the starting Group and each of its children. CTE_Totals again recursively goes through results of CTE_Sums and repeats child rows as needed to get the total for each Group including children summary.

Again, run the query step-by-step, CTE-by-CTE to understand how it works. Not all columns are used in the final result, but they help understand what is going on during intermediary steps.

WITH
CTE_Groups
AS
(
    SELECT
        Regions.GroupID AS StartID
        ,Regions.Name AS StartName
        ,Regions.GroupID
        ,Regions.ParentID
        ,Regions.Name
        ,1 AS Lvl
    FROM Regions
    WHERE Regions.GroupID = 1 -- North America
    --WHERE Regions.GroupID = 10

    UNION ALL

    SELECT
        CTE_Groups.StartID
        ,CTE_Groups.StartName
        ,Regions.GroupID
        ,Regions.ParentID
        ,Regions.Name
        ,CTE_Groups.Lvl + 1 AS Lvl
    FROM
        Regions
        INNER JOIN CTE_Groups ON CTE_Groups.GroupID = Regions.ParentID
)
,CTE_Sums
AS
(
    SELECT
        CTE_Groups.GroupID
        ,CTE_Groups.ParentID
        ,CTE_Groups.Name
        ,SUM(EmployeeSales.Sales) AS SumSales
    FROM
        CTE_Groups
        LEFT JOIN RegionMember ON RegionMember.GroupID = CTE_Groups.GroupID
        LEFT JOIN EmployeeSales ON EmployeeSales.EmployeeID = RegionMember.empid
    GROUP BY
        CTE_Groups.GroupID
        ,CTE_Groups.ParentID
        ,CTE_Groups.Name
)
,CTE_Totals
AS
(
    SELECT
        CTE_Sums.GroupID AS StartID
        ,CTE_Sums.Name AS StartName
        ,CTE_Sums.GroupID
        ,CTE_Sums.ParentID
        ,CTE_Sums.Name
        ,CTE_Sums.SumSales
        ,1 AS Lvl
    FROM CTE_Sums

    UNION ALL

    SELECT
        CTE_Totals.StartID
        ,CTE_Totals.StartName
        ,CTE_Sums.GroupID
        ,CTE_Sums.ParentID
        ,CTE_Sums.Name
        ,CTE_Totals.SumSales
        ,CTE_Totals.Lvl + 1 AS Lvl
    FROM
        CTE_Sums
        INNER JOIN CTE_Totals ON CTE_Totals.ParentID = CTE_Sums.GroupID
)
SELECT
    GroupID
    ,Name
    ,SUM(SumSales) AS SumTotal
FROM CTE_Totals
GROUP BY
    GroupID
    ,Name
ORDER BY
    GroupID
    ,Name
;

Result for GroupID = 1

+---------+--------------+----------+
| GroupID |     Name     | SumTotal |
+---------+--------------+----------+
|       1 | NorthAmerica |   525000 |
|       4 | NorthEast    |   350000 |
|       5 | WestCoast    |   175000 |
+---------+--------------+----------+

Result for GroupID = 10

+---------+------+----------+
| GroupID | Name | SumTotal |
+---------+------+----------+
|      10 | A1   |      530 |
|      40 | B1   |      319 |
|      50 | B2   |      105 |
|      60 | B3   |      106 |
|      70 | C1   |      107 |
|      80 | C2   |      108 |
+---------+------+----------+
Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
  • That second part is exactly what I was looking for...thank you very much for the help on this....it was driving me nuts. – Buckwheattb May 24 '16 at 11:49