0

Supposing I have a table and query:

consisting of population for a given country in a given continent for a given year.
i want to return countries avg(population) & the continents avg population if the country's population is greater than the continents +3 Basically I want to filter out rows that are a certain difference away from the subtotal continent value.

I modified this and realize the data does not have multiple years and that the numbers are obviously garbage, but this is just an example.

 create table abc (continent varchar2(30), country varchar2(30), population number,   yr number)
 insert into abc values ('africa', 'kenya', 50, 2005)
 insert into abc values ('africa', 'egypt', 100, 2006)
 insert into abc values('africa', 'south africa', 35, 2007)
 insert into abc values ('africa', 'nigeria', 200, 2008)
 insert into abc values ('asia', 'china', 50, 2005)
 insert into abc values ('asia', 'india', 100, 2006)
 insert into abc values('asia', 'japan', 35, 2007) 
 insert into abc values ('asia', 'korea', 200, 2008)


 select continent, country, avg(population)
 from abc

 where ------population for each country > 3+ avg for each continent
 ----should return egpyt/nigeria rows and india/korea rows since average here is   96.25 for each continent.
 group by rollup(continent, country)
runningbirds
  • 6,235
  • 13
  • 55
  • 94
  • I don't want to suppose your table. Show us your table structure and expected/actual output. – CodeNewbie Sep 16 '14 at 05:25
  • And few insert statements too as sample data. – Lalit Kumar B Sep 16 '14 at 05:30
  • hi, i quickly added a few statements which i hope gets the idea across. i want to filter out rows that are a certain distance away from the subtotal row --- which are where the 2nd column in the select statement is NULL. – runningbirds Sep 16 '14 at 06:04
  • Lose the rollup and use `HAVING`: http://docs.oracle.com/database/121/SQLRF/statements_10002.htm#i2130020 – Rob van Wijk Sep 16 '14 at 07:15
  • i thought about having but i couldn't figure out how to make a relatively concise query for this with it – runningbirds Sep 16 '14 at 07:38
  • How do you define continent average? Average of all rows within a continent? Or average of the country averages? I mean, what if you insert another row for china with a population of 60 in 2006. The continent average for asia can then be average of 5 values (50, 60, 100, 35, 200) = 107.5 or it can be that first country average for china is calculated (average of 50 and 60 = 55) and then asia continent average is average of 4 values (55, 100, 35, 200) = 97.5. Which method should be used for continent average calculation? – Kim Berg Hansen Sep 16 '14 at 09:22
  • continent average is the average from rows that contain that continent (50+35+100+200)/4 = 96.25 – runningbirds Sep 16 '14 at 15:13

1 Answers1

1

So, with the definition of continent average as simply being the average of all rows with that continent, a solution can be:

select continent
     , country
     , avg(population) country_avg
     , max(continent_avg) continent_avg
  from (
   select continent
        , country
        , population
        , avg(population) over (
             partition by continent
          ) continent_avg
     from abc
  )
 group by continent, country
having avg(population) > max(continent_avg) + 3
 order by continent, country;

The reason I asked about the definition of continent average is, that if some countries within a continent have more rows in the table (=more years), those countries will weigh more in the average calculated like that. Then an alternative can be that the continent average is the average of the country averages, in which case a solution can be:

select *
  from (
   select continent
        , country
        , avg(population) country_avg
        , avg(avg(population)) over (
             partition by continent
          ) continent_avg
     from abc
    group by continent, country
  )
 where country_avg > continent_avg + 3;

If the countries all have the same number of years (same number of rows), the two solutions ought to give the same result. But if countries can have different number of years, you will have to pick the solution that fits your requirements.

Kim Berg Hansen
  • 1,979
  • 12
  • 12
  • #1 was what I was looking for thank you so much! also thank you for the 2nd solution, I think that helps me understand how to visualize & solve this sort of problem even better. – runningbirds Sep 18 '14 at 04:26