Basically I want to do a rolling average of the lowest N values.

I have tried using

   mydata$lowest_n = rollapply(values, 5, mean(sort[1:5]), align=c('right'))

But I cannot get this to work. Again, this needs to be rolling because this is a time series data set. I know the code above has an obvious error; I just don't have my attempted methods in front of me. Any advice is appreciated!

If it matters, I have this scenario for many different groups, which are already grouped using ddply()

My data:

  structure(list(Date = structure(c(13586, 13587, 13594, 13635, 
  13656, 13657, 13686, 13710, 13712, 13718, 13726, 13753, 13783, 
  13791, 13874, 13895, 13910, 13917, 13923, 13930, 13958, 13970, 
  13977, 13978, 13991, 14018, 14021, 14066, 14070, 14073, 14104, 
  14112, 14118, 14138, 14220, 14269, 14293, 14473, 14631, 13566, 
  13692, 13916, 14084, 12677, 12984, 13035, 13222, 13406, 13417, 
  13483, 13539, 13580, 13607, 13644, 13693, 13698, 13713, 13714, 
  13726, 13727, 13750, 13754, 13777, 13809, 13810, 13812, 13819, 
  13832, 13853, 13893, 13944, 13954, 14015, 14021, 14050, 14051, 
  14092, 14104, 14119, 14134, 14209, 14218, 14267, 14302, 14309, 
  14334, 14337, 14379, 14391, 14428, 14449, 14475, 14525, 14546, 
  14552, 14579, 14589, 12545, 12587, 12693), class = "Date"), value = c(15, 
  27, 15, 25, 16, 22, 27, 23, 16, 19, 22, 21, 15, 20, 22, 28, 22, 
  27, 20, 25, 28, 16, 16, 28, 24, 28, 22, 28, 22, 14, 28, 24, 16, 
  15, 28, 22, 28, 28, 27, 19, 20, 19, 24, 19, 25, 22, 24, 16, 28, 
  19, 18, 20, 20, 21, 19, 20, 22, 21, 20, 21, 23, 24, 17, 19, 28, 
  24, 30, 20, 20, 18, 21, 15, 16, 26, 19, 20, 19, 17, 20, 16, 18, 
  29, 21, 23, 18, 18, 26, 26, 25, 13, 13, 15, 18, 17, 20, 15, 18, 
  23, 29, 21)), .Names = c("Date", "value"), row.names = c(NA, 
 100L), class = "data.frame")
runningbirds

1 Answer

The code below gets you part of the way there: for your 100-row data.frame, it creates a vector of 96 observations using the zoo::rollapply() function. It's not clear what you want to do with the first four observations, so I'm ignoring that part for now, but the call could probably be expanded, or you could write a more detailed function to pass to FUN.

mydata <- structure(list(Date = structure(c(13586, 13587, 13594, 13635, 
                                            13656, 13657, 13686, 13710, 13712, 13718, 13726, 13753, 13783, 
                                            13791, 13874, 13895, 13910, 13917, 13923, 13930, 13958, 13970, 
                                            13977, 13978, 13991, 14018, 14021, 14066, 14070, 14073, 14104, 
                                            14112, 14118, 14138, 14220, 14269, 14293, 14473, 14631, 13566, 
                                            13692, 13916, 14084, 12677, 12984, 13035, 13222, 13406, 13417, 
                                            13483, 13539, 13580, 13607, 13644, 13693, 13698, 13713, 13714, 
                                            13726, 13727, 13750, 13754, 13777, 13809, 13810, 13812, 13819, 
                                            13832, 13853, 13893, 13944, 13954, 14015, 14021, 14050, 14051, 
                                            14092, 14104, 14119, 14134, 14209, 14218, 14267, 14302, 14309, 
                                            14334, 14337, 14379, 14391, 14428, 14449, 14475, 14525, 14546, 
                                            14552, 14579, 14589, 12545, 12587, 12693), class = "Date"), 
                         value = c(15, 27, 15, 25, 16, 22, 27, 23, 16, 19, 22, 21, 15, 20, 22, 28, 22, 
                                   27, 20, 25, 28, 16, 16, 28, 24, 28, 22, 28, 22, 14, 28, 24, 16, 
                                   15, 28, 22, 28, 28, 27, 19, 20, 19, 24, 19, 25, 22, 24, 16, 28, 
                                   19, 18, 20, 20, 21, 19, 20, 22, 21, 20, 21, 23, 24, 17, 19, 28, 
                                   24, 30, 20, 20, 18, 21, 15, 16, 26, 19, 20, 19, 17, 20, 16, 18, 
                                   29, 21, 23, 18, 18, 26, 26, 25, 13, 13, 15, 18, 17, 20, 15, 18, 
                                   23, 29, 21)), 
                    .Names = c("Date", "value"), row.names = c(NA, 100L), class = "data.frame")
# align belongs to rollapply(), not mean(); full windows only, so the result has 96 values
lowest_n <- zoo::rollapply(mydata$value, width = 5, FUN = function(x) mean(sort(x)[1:5]), align = "right")
Nicholas G Reich
  • I had made a mistake in the function. Does it work now? Looks like adding `partial=TRUE` might get it so you could cbind this to the data.frame you have... – Nicholas G Reich Jan 18 '16 at 00:14
  • No, it does not work, since I believe the partial=T is outside of the call to sort 1:5. I need a partial 1:5, and that is the part I am stuck coding. For example, if there were only 3 rows, it should return the average of those 3. – runningbirds Jan 19 '16 at 23:06
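
The behaviour asked for in the last comment (average whatever is available when fewer than N values exist) could be sketched roughly as follows. This is only a sketch under assumptions: the helper name rolling_lowest_mean and its arguments width and n are made up for illustration, and the window is assumed to be trailing (right-aligned). The key change is head(sort(x), n) instead of sort(x)[1:n], because head() returns fewer elements for a short window instead of padding with NA.

```r
# Hypothetical helper (not part of zoo): for each position i, take the trailing
# window of up to `width` observations ending at i and average its `n` smallest values.
rolling_lowest_mean <- function(x, width, n) {
  vapply(seq_along(x), function(i) {
    window <- x[max(1, i - width + 1):i]  # shorter than `width` at the start of the series
    mean(head(sort(window), n))           # head() copes with fewer than n values
  }, numeric(1))
}

# First 10 values from the question's data
value <- c(15, 27, 15, 25, 16, 22, 27, 23, 16, 19)

# Mean of the 3 lowest values in a trailing window of 5; same length as the input
res <- rolling_lowest_mean(value, width = 5, n = 3)

# The same idea with zoo, if it is installed: partial = TRUE passes the
# shorter leading windows to FUN instead of dropping them.
if (requireNamespace("zoo", quietly = TRUE)) {
  res_zoo <- zoo::rollapply(value, width = 5,
                            FUN = function(w) mean(head(sort(w), 3)),
                            partial = TRUE, align = "right")
}
```

Because the result has one value per input row, it can be assigned straight back as a column, e.g. mydata$lowest_n <- rolling_lowest_mean(mydata$value, 5, 3). For the grouped case, the same function could presumably be passed to whatever per-group step is already in place (ddply() in the question).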