2

This task seems very simple and yet I still can't find an answer. I've seen solutions for weighted averages (i.e. means) in pivot tables but not weighted medians.

I need to create a pivot table with weighted medians (using two columns, one with the value and one with the weight–the quantity of each value to be calculated towards the median). In other words…

city           value    weight
Phoenix        100      40
Phoenix        140      12
Phoenix        500      15
San Francisco  700      5
San Francisco  300      5
San Francisco  350      1
San Francisco  600      1

will return…

city           weighted median
Phoenix        100
San Francisco  475

I've been trying to find an adequate solution in PostgreSQL and R but so far no luck. Any help appreciated!

Kasey
  • 173
  • 2
  • 11

1 Answers1

0

In R, you can try this:

require("plyr")

> ddply(df, ~city, summarise, median=median(rep(value, weight)))
           city median
1       Phoenix    100
2 San Francisco    475

Data:

df <- data.frame(city=c("Phoenix", "Phoenix", "Phoenix",
                        "San Francisco", "San Francisco", "San Francisco", "San Francisco"),
                 value=c(100, 140, 500, 700, 300, 350, 600),
                 weight=c(40, 12, 15, 5, 5, 1, 1))

Have a look at this SO article for more information on calculating weighted medians in R.

Community
  • 1
  • 1
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • Thanks Tim. This seems like it should work yet I keep getting an `Error: invalid 'times' argument` message. Not familiar with this package. Both the value and weight are integers. – Kasey Feb 29 '16 at 06:43
  • I updated, please try again. – Tim Biegeleisen Feb 29 '16 at 07:06
  • same error message–it does work with the data frame you've created of course, so is it possible null values are throwing it off? – Kasey Feb 29 '16 at 07:17
  • I believe your data set has some problem, but I don't know what it is. If it really looked like the sample in your OP, there should be no error. – Tim Biegeleisen Feb 29 '16 at 07:21
  • 1
    Just need to remove null values. Thanks Tim! – Kasey Feb 29 '16 at 16:54