0

For example I have the following list of calls:

  • Call#1 - duration 30 min
  • Call#2 - duration 43 min
  • Call#3 - duration 26 min
  • Call#4 - duration 35 min
  • Call#5 - duration 39 min

1) I need the SQL query retrieving the average duration of first 80% of calls.

Calls that fall into first 80% (according to call duration) are calls #3,#1,#4,#5. For these calls the average duration should be calculated ((26+30+35+39)/4=32,5). Calls above 80% (here call #2) should be ignored.

2) Also I need the vice versa query - what percentage of first calls will have the average call duration of 30 mins?

3) Hot to fetch the duration of 80%-th record (ordered according to call duration). E.g. if there's 500 records, what's the duration of 400th record?

How this SQL queries should look like (Oracle)?

sbrbot
  • 6,169
  • 6
  • 43
  • 74

1 Answers1

3

The NTILE() function splits a data-set into buckets; for the top 80% split into 5 and take the top 4:

select avg(duration)
  from ( select duration, ntile(5) over (order by duration) as bucket
           from ...
                )
 where bucket <= 4

If you're using Oracle 12c then the row limiting clause has had a ridiculously big functionality upgrade and you can get the actual percentage, something like:

select avg(duration)
  from ...
 order by duration
 fetch first 80 percent rows with ties

This selects the first 80% of rows in the order of the column DURATION ascending but where there are tied records accepting all of them. Use only instead of with ties to only return the specified percentage.

There's lots of options, which this blog post also explains pretty well.


For working out what percentage of calls will have the average call duration of 30 minutes you need to know the running average, the running count and the total number of rows in the table. For a running average the analytic AVG() should work and the analytic COUNT() for the total number of rows:

select max(running_count) / max(total_calls)
  from ( select duration
              , count(*) over () as total_calls
              , row_number() over (order by duration) as running_count
              , avg(duration) over (order by duration) as running_avg
           from ...
                )
 where running_avg <= 30
Ben
  • 51,770
  • 36
  • 127
  • 149
  • And what about the vice versa? How to calculate percentage of first calls which falls into given average duration? This can't be achieved using NTILE function. – sbrbot Mar 24 '15 at 09:50
  • Thanks @Ben a lot. I accepted your answer as complete but may I ask yet another thing (if you can append to previous answer) - how to fetch duration od 80%-th record (e.g. If there's 500 records how to get duration of 400th record). – sbrbot Mar 25 '15 at 12:34
  • You should be able to extend my answer to achieve that @sbrbot; `row_number()` gives the unique row number over a partition in the stated order, so: `select duration from ( select a.*, row_number() over (order by duration) as rnum from ) where rnum = 400` . – Ben Mar 25 '15 at 12:50
  • Yes if you know in advance that there's 500 records and that 400th row is 80%th. But it can vary. I need sth like WHERE rnum=0.8*COUNT(*) – sbrbot Mar 25 '15 at 12:52
  • Ah, I understand. Just add `max(duration)` to the select; as you're selecting the top 80% the 80%th record is the maximum. – Ben Mar 25 '15 at 12:56