
I have the following data set:

df <- data.frame(
  C      = c(1,2,3,1,2,3,1,2,3,1),
  weight = c(1,1.5,2,2,1.5,1,2,1,1.5,2.5),
  time   = c(15,20,30,45,60,15,20,30,45,60)
)

I need to aggregate the data by the variable C in order to find the median time for each C. Each observation is weighted by the variable 'weight'.

Is there a way to replace 'mean' with a weighted median in the following code?

output<-aggregate(.~C, data=df, mean, na.rm=TRUE)
Richie Cotton
user2568648

1 Answer


There is a weighted median function, weighted.median, in the bigvis package on GitHub.

library(devtools)
install_github("hadley/bigvis")  # current devtools expects "user/repo"

aggregate applies the function to each column separately, so it doesn't work with functions that need several vectors at once (here, time and weight together). Use ddply from plyr instead.

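library(plyr)
library(bigvis)  # provides weighted.median
# For each level of C, compute the median of time weighted by weight
ddply(df, .(C), summarise, wm = weighted.median(time, weight))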
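
If bigvis can't be installed, the same grouping can be sketched in base R. The helper weighted_median below is a hypothetical stand-in, not the bigvis function: it assumes the "lower" weighted median (the smallest value at which the cumulative weight reaches half the total), so it may differ from an interpolating implementation.

```r
# Hypothetical helper (not part of bigvis): lower weighted median.
# Assumes non-negative weights; drops pairs where either value is NA.
weighted_median <- function(x, w) {
  keep <- !is.na(x) & !is.na(w)
  x <- x[keep]
  w <- w[keep]
  ord <- order(x)                 # sort values, carrying weights along
  x <- x[ord]
  w <- w[ord]
  cum <- cumsum(w) / sum(w)       # cumulative share of total weight
  x[which(cum >= 0.5)[1]]         # first value reaching half the weight
}

df <- data.frame(
  C      = c(1,2,3,1,2,3,1,2,3,1),
  weight = c(1,1.5,2,2,1.5,1,2,1,1.5,2.5),
  time   = c(15,20,30,45,60,15,20,30,45,60)
)

# Split the rows by C, then pass both columns of each piece to the helper
result <- sapply(split(df, df$C),
                 function(d) weighted_median(d$time, d$weight))
result
```

split plus sapply is used here because, unlike aggregate, it hands each group's full data frame to the function, so time and weight stay paired.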
Richie Cotton
  • When trying to install bigvis I get the following error: Error in function (type, msg, asError = TRUE) : Could not resolve host: github.com; Host not found – user2568648 Jan 23 '14 at 14:49
  • @user2568648 Are you on a corporate network? If so, the most likely explanation is that access to github is blocked by your network admins. Try and reach the site in a browser. – Richie Cotton Jan 23 '14 at 15:20