I have observations produced by a run-length encoding (RLE) transform.

For example:

set.seed(1)
make_data <- function() {
  series <- rnorm(sample(10:50,1)) |> cumsum() |> sign()
  accelerometry::rle2(series, indices = TRUE)
}

my_obs <- lapply(1:5,\(x) make_data())
names(my_obs) <- paste("observation", seq_along(my_obs))

Printing the list:

my_obs


$`observation 1`
     value start stop length
[1,]    -1     1    3      3
[2,]     1     4    5      2
[3,]    -1     6    6      1
[4,]     1     7   30     24

$`observation 2`
     value start stop length
[1,]     1     1    5      5
[2,]    -1     6    8      3
[3,]     1     9   30     22

$`observation 3`
     value start stop length
[1,]     1     1   30     30

$`observation 4`
     value start stop length
[1,]    -1     1    1      1
[2,]     1     2   12     11
[3,]    -1    13   15      3
[4,]     1    16   30     15

$`observation 5`
     value start stop length
[1,]    -1     1    1      1
[2,]     1     2    9      8
[3,]    -1    10   30     21

How can I convert this data into a regular table (dataset) where each observation is a row? How can I do this correctly, without losing information?

If you are not familiar with run-length encoding, see:

?rle
?accelerometry::rle2
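
As a quick illustration of the idea, here is a minimal sketch using base R's `rle()` and `inverse.rle()` on a made-up vector (the vector `x` is an assumption for illustration, not taken from the data above). It shows that the encoding stores each run's value and length, and that the original series can be rebuilt from them.

```r
# A made-up sign series: three -1s, two 1s, one -1
x <- c(-1, -1, -1, 1, 1, -1)

# Encode: one entry per run of identical values
enc <- rle(x)
enc$values   # value of each run
enc$lengths  # length of each run

# Decode: rebuild the original series from the runs
restored <- inverse.rle(enc)
identical(restored, x)
```

`accelerometry::rle2()` returns the same information as a matrix, optionally with the `start` and `stop` indices of each run.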
desertnaut
mr.T
  • It's not clear from the question what the output format should be. Does `dplyr::bind_rows(my_obs, .id="Observation")` get close to what you're looking for? – Miff Jul 14 '23 at 08:52
  • ...or are you looking for `inverse_rle2()` to reconstruct the data from the rle? – Miff Jul 14 '23 at 08:54
  • In your example, one observation is split across multiple rows, which is not what I want. I want a dataset where each row is one observation, but I need to account for the fact that the observations have different sizes, as shown in my example – mr.T Jul 14 '23 at 08:57

1 Answer

revised

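library(tidyverse)

# Stack the RLE matrices into one long data frame, tagging each row with the
# observation name. imap_dfr() also passes each name to data.frame(), which
# shows up as an extra column; select() keeps only the columns we need.
(step_1 <- imap_dfr(my_obs, data.frame, .id = "id") |> select(all_of(c(
  "id", "value", "start", "stop", "length"
))))

# Go fully long, then number the runs within each observation: every
# "value" entry marks the start of a new run.
(step_2 <- step_1 |> pivot_longer(cols = -id) |> group_by(id) |>
  mutate(counter_id = cumsum(name == "value")))

# Spread back out to one row per observation, with columns such as
# value_1, start_1, ...; observations with fewer runs are padded with NA.
(fin <- step_2 |>
  pivot_wider(
    id_cols = c("id"),
    names_from = c("name", "counter_id")
  ))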
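
For comparison, the same wide layout can be sketched in base R without the tidyverse. This is only an illustration under stated assumptions: the helpers `rle_to_row()` and `row_to_series()` are hypothetical names introduced here, and the two matrices are typed in by hand from the printed output in the question rather than produced by `rle2()`. The second helper checks that nothing is lost: each row can be expanded back into its original series.

```r
# Two RLE matrices copied by hand from the question's printed output
obs <- list(
  `observation 1` = cbind(value = c(-1, 1, -1, 1), start = c(1, 4, 6, 7),
                          stop = c(3, 5, 6, 30), length = c(3, 2, 1, 24)),
  `observation 3` = cbind(value = 1, start = 1, stop = 30, length = 30)
)

# Hypothetical helper: flatten one RLE matrix into a named vector,
# padding with NA up to max_runs runs
rle_to_row <- function(m, max_runs) {
  padded <- matrix(NA_real_, nrow = max_runs, ncol = ncol(m),
                   dimnames = list(NULL, colnames(m)))
  padded[seq_len(nrow(m)), ] <- m
  # t() puts each run in its own column, so the flattened order is
  # value_1, start_1, stop_1, length_1, value_2, ...
  row <- as.vector(t(padded))
  names(row) <- paste(rep(colnames(m), times = max_runs),
                      rep(seq_len(max_runs), each = ncol(m)), sep = "_")
  row
}

max_runs <- max(vapply(obs, nrow, integer(1)))
wide <- as.data.frame(
  do.call(rbind, lapply(obs, rle_to_row, max_runs = max_runs))
)
wide

# Hypothetical helper: rebuild the original series from one row of `wide`
row_to_series <- function(row) {
  values  <- unlist(row[grep("^value_",  names(row))])
  lengths <- unlist(row[grep("^length_", names(row))])
  keep <- !is.na(values)  # drop the NA padding
  rep(values[keep], times = lengths[keep])
}

series_1 <- row_to_series(wide["observation 1", ])
length(series_1)
```

The `start` and `stop` columns are redundant once `length` is known, so they could be dropped to make the table narrower; they are kept here to match the layout produced above.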
desertnaut
Nir Graham