5

I already have a working example which does exactly what I need. Now the problem is, that I'm not really a fan of subqueries and I think there could be a better solution to this problem.

So here is my (already) working example:

with t as
(
select  'Group1' as maingroup,'Name 1' as subgroup, 'random' as random, 500 as subgroupbudget from dual
union all
select 'Group1','Name 1','random2',500 from dual
union all
select 'Group1','Name 2','random3', 500 from dual
union all
select 'Group2','Name 3','random4', 500 from dual
union all
select 'Group2','Name 4','random5',500 from dual
union all
select 'Group2','Name 5', 'random6',500 from dual
)
select
maingroup,
subgroup,
random,
(select distinct sum(subgroupbudget) over(partition by maingroup) from t b where a.maingroup=b.maingroup group by maingroup,subgroup,subgroupbudget) groupbudget
from t a
group by  maingroup, subgroup ,subgroupbudget, random
order by maingroup, subgroup

As you can see, the with-clause shows a simplified table with data. Now the problem is that the last column is the budget of the subgroup. In the result I need the budget of the maingroup. That means I have to sum all values within the maingroup, but only if the subgroups are different (Here I need some kind of distinct).

Unfortunately a simple

sum(distinct subgroupbudget) over(partition by maingroup)

won't work because the numbers (subgroupbudget) can be the same (like in the example)

ekad
  • 14,436
  • 26
  • 44
  • 46
  • 1
    Thanks for providing the sample data, we don't see that usually. +1 for that. Could you please add your desired output too? – Lalit Kumar B May 11 '15 at 12:58
  • What would be the sum of Group1 if the subgroupbudget was different for 'random' and 'random1'? – Boneist May 11 '15 at 12:59
  • Hi, thanks for the answer. The output is already what I want but the SQL is not. As I said I'm not a fan of subqueries and I thought maybe there is a better way to solve this without a subquery. – user3356355 May 11 '15 at 13:04
  • @Boneist: The budget of random and random2 can not be different because the subgroupbudget refers to the column subgroup (Both 'Name 1') – user3356355 May 11 '15 at 13:08
  • Do you mean that in your actual query, you get that from some other table based on the subgroup, so that the values are only duplicated across your final query? Or are you saying that your data model is not normalised, and you're duplicating information? Because if the latter, how do you guarantee that the values aren't different across the rows? And if the former, you could probably rewrite the query to do the sum before you populate the rows with the random values. – Boneist May 11 '15 at 13:13
  • @Boneist: This is just an example. The real table is actually a very complex view. – user3356355 May 11 '15 at 13:19

1 Answers1

3

Assuming that for a maingroup/subgroup, the subgroupbudget is always the same (or you only take the highest value for the subgroup), this should work:

with t as (select  'Group1' as maingroup,'Name 1' as subgroup, 'random' as random, 500 as subgroupbudget from dual
           union all
           select 'Group1','Name 1','random2',500 from dual
           union all
           select 'Group1','Name 2','random3', 500 from dual
           union all
           select 'Group2','Name 3','random4', 500 from dual
           union all
           select 'Group2','Name 4','random5',500 from dual
           union all
           select 'Group2','Name 5', 'random6',500 from dual),
    t1 as (select maingroup,
                  subgroup,
                  random,
                  case when row_number() over (partition by maingroup, subgroup order by subgroupbudget desc) = 1 then subgroupbudget
                  end subgroupbudget
           from t)
select maingroup,
       subgroup,
       random,
       sum(subgroupbudget) over (partition by maingroup) groupbudget
from   t1;

MAINGROUP SUBGROUP RANDOM  GROUPBUDGET
--------- -------- ------- -----------
Group1    Name 1   random         1000
Group1    Name 1   random2        1000
Group1    Name 2   random3        1000
Group2    Name 3   random4        1500
Group2    Name 4   random5        1500
Group2    Name 5   random6        1500

It is effectively saying that for a maingroup/subgroup, you only want to use one of the values (the highest) of the rows in that subgroup in the sum.

Whether it's "better" (i.e. more performant) than your original query is something that you would have to test. Sub-queries are not necessarily a bad thing; they are a tool, and sometimes they're the right tool to use.

Boneist
  • 22,910
  • 1
  • 25
  • 40
  • Hi Boneist, very interesting! I don't thougt of that, but setting the (subgroup)budget only once per subgroup depending on the rownumber is almost genius and the perfect 'solution' to my problem. Thank you very much! – user3356355 May 11 '15 at 13:16