2

I'm working with invoice data in Oracle SQL. I want to create a variable, "median account invoice amount to date", that gives the median invoice amount for an account up to that given invoice date.

SQLQueryR
  • 41
  • 3

2 Answers2

2

Oracle actually has a median function. To find the median of all accounts for every day:

select  median(amount)
from    invoices

Or to show the median per account per day:

select  AccountNr
,       to_char(InvoiceDate, 'yyyymmdd')
,       median(amount)
from    invoices
group by
        AccountNr
,       to_char(InvoiceDate, 'yyyymmdd')

Or to show the "running median":

select  i1.custID
,       i1.inv_date
,       median(i2.amount)
from    invoices i1
join    invoices i2
on      i2.custId = i1.custID
        and i2.inv_date <= i1.inv_date
group by
        i1.custID
,       i1.inv_date
Andomar
  • 232,371
  • 49
  • 380
  • 404
  • Thanks. Not quite what I am after. For every account and for every date, what is the median invoice amount? – SQLQueryR Aug 28 '11 at 09:27
  • Median(amount) would be the amount on that day. What is the median amount for all amounts prior to the day? Ie median to date? Can this be achieved using analytic functions, like, for example, average-to-date? avg(inv_amount) over (partition by supertest.custID order by inv_date But replacing median gives an error – SQLQueryR Aug 28 '11 at 09:39
  • @OracleQuery: Strange, looks like `median` doesn't support a running sum like `avg` does. Answer edited to calculate the median with a self-join. – Andomar Aug 28 '11 at 09:57
  • Is there any way to acheive this using analytic functions rather than a join ...for interests of efficiency – SQLQueryR Aug 28 '11 at 10:01
  • According to [Oracle](http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions086.htm) you can use median as an analytic function. – Ben Aug 28 '11 at 18:18
  • 1
    @Ben: Yes, but unlike `avg`, `median` does not allow an `order by` clause, so you can't use it for a running median – Andomar Aug 28 '11 at 18:21
0

From what I understand this is what you're after. @Andomar, as Median is the middle in a series of numbers ordering this series makes no difference to the result. The middle number between the highest and the lowest cannot change...

So, median by data and account would be:

select acc_no, median(amount) over ( partition by acc_no, to_char(invoice_date,'yyyymmdd'))
  from invoices
       -- if applicable
 where invoice_date < :date

median by account would be

select acc_no, median(amount) over ( partition by acc_no )
 from invoices
where invoice_date < :date
Ben
  • 51,770
  • 36
  • 127
  • 149
  • Ordering the series does not matter for the average either. What order by does for `avg() over (order by...)` is that it calculates a running average over all values before the current one. – Andomar Aug 28 '11 at 19:47
  • @Andomar, thank you for clarifying. I that's very useful to know. – Ben Aug 28 '11 at 19:49