1

I have accelerometer data with each row labelled for the behaviour an animal is displaying per second, simple example:

Time X_accel Behaviour
1 0.01 Standing
2 0.01 Standing
3 0.01 Standing
4 0.02 Standing
5 0.06 Walking
6 0.07 Walking
7 0.01 Standing
8 0.02 Standing

I've got a rolling window of functions being applied per behaviour - but I want them to be applied to each separate behaviour event rather than all of the data associated with for example "standing" grouped together.

Is it possible to make it recognise the end of each event and then start again at the beginning of the next one?

Alternatively I have considered whether it might be possible to add a 1 to "standing" for the first event (standing1) 2 to the second and so on throughout the data to make each event separate, although I'm not sure how I'd modify the code for it to recognise that and cycle through however many individual "standing" events there are.

lst1 <- lapply(df[df$Behaviour == behaviour, c(2)], 
    \(x) rollapply(x, FUN = time_domain_summary, 
                      width = window.size, by = window.step, 
                      align = c("left"), partial = FALSE))

where: "behaviour" is defined as standing, time_domain_summary = the features to calculate (mean, median etc.), window.size & window.step are defined e.g. size of 2 and step of 0.1

Currently the output is a rolling window across ALL rows containing "standing" in the behaviour column as if they were all one long event BUT I'd like them to be applied per individual event if possible.

kjetil b halvorsen
  • 1,206
  • 2
  • 18
  • 28
  • Welcome to SO. You maximize your chance of getting a useful answer if you provide a minimal reproducible example. This [post](https://stackoverflow.com/help/minimal-reproducible-example)  may help. – yuliaUU Mar 25 '23 at 01:29

1 Answers1

0

Use a cumsum trick to create a grouping variable on Behaviour and one of the group calculations functions, such as by or ave. Below, as an example, I apply the function mean.

library(zoo)
#> 
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#> 
#>     as.Date, as.Date.numeric

window.size <- 2L
f <- cumsum(c(0, df1$Behaviour[-1] != df1$Behaviour[-nrow(df1)]))

by(df1$X_accel, f, \(x, fill) {
  rollapply(x, window.size, mean, fill = fill, align = "left", partial = FALSE)
}, fill = NA)
#> f: 0
#> [1] 0.010 0.010 0.015    NA
#> ------------------------------------------------------------ 
#> f: 1
#> [1] 0.065    NA
#> ------------------------------------------------------------ 
#> f: 2
#> [1] 0.015    NA

ave(df1$X_accel, f, FUN = \(x) {
  rollapply(x, width = 2L, FUN = mean, fill = NA, align = "left", partial = FALSE)
})
#> [1] 0.010 0.010 0.015    NA 0.065    NA 0.015    NA

Created on 2023-03-24 with reprex v2.0.2

Edit

The ave code above can be rewritten as a function with some extra code to output a data.frame with the column name as asked for. The function is called with the wanted stat, in the examples below mean and median. The output can then be cbinded with the input data.

fun <- function(x, stat, col = "X_accel", group = "Behaviour", window.size = 2L, fill = NA, align = "left", partial = FALSE) {
  fun_name <- deparse(substitute(stat))
  f <- cumsum(c(0, x[[group]][-1] != x[[group]][-nrow(x)]))
  y <- ave(x[[col]], f, FUN = \(x) {
    rollapply(x, width = window.size, 
              FUN = stat, 
              fill = fill, align = align, partial = partial)
  })
  new_col_name <- paste(col, fun_name, sep = "_")
  setNames(data.frame(y), new_col_name)
}

fun(df1, mean)
#>   X_accel_mean
#> 1        0.010
#> 2        0.010
#> 3        0.015
#> 4           NA
#> 5        0.065
#> 6           NA
#> 7        0.015
#> 8           NA
fun(df1, median)
#>   X_accel_median
#> 1          0.010
#> 2          0.010
#> 3          0.015
#> 4             NA
#> 5          0.065
#> 6             NA
#> 7          0.015
#> 8             NA

Created on 2023-03-29 with reprex v2.0.2

Data

df1 <- read.table(text = "
Time    X_accel     Behaviour
1   0.01    Standing
2   0.01    Standing
3   0.01    Standing
4   0.02    Standing
5   0.06    Walking
6   0.07    Walking
7   0.01    Standing
8   0.02    Standing",
header = TRUE)

Created on 2023-03-24 with reprex v2.0.2

Rui Barradas
  • 70,273
  • 8
  • 34
  • 66
  • Thankyou! This has nearly worked for what I want. The only issue is that I'd like to keep the column names from the data frame so for example: x_accel_mean, x_accel_median and so on (rather than [,1] [,2] that appears is you apply this to more than one column. – mari davies Mar 29 '23 at 11:24
  • @maridavies See now after the edit. I believe that it solves your problem. – Rui Barradas Mar 29 '23 at 15:06