
How can I split the following vector into a list, starting a new element wherever an empty string '' is encountered?

For example, given the following input:

x <- c("abc", "", "a", "b", "c", "", "ab", "ac", "", "a", "a", "a", 
"a", "", "b")

x
 [1] "abc" ""    "a"   "b"   "c"   ""    "ab"  "ac"  ""    "a"   "a"   "a"   "a"   ""    "b"  

I want the following list as output:

list("abc", c("a", "b", "c"), c("ab", "ac"), c("a", "a", "a", 
"a"), "b")

[[1]]
[1] "abc"

[[2]]
[1] "a" "b" "c"

[[3]]
[1] "ab" "ac"

[[4]]
[1] "a" "a" "a" "a"

[[5]]
[1] "b"
AnilGoyal

2 Answers


Create a logical vector (i1) that flags the blank elements, then take its cumulative sum to build a grouping variable. Each blank increments the group, so splitting the non-blank elements of x by that grouping yields one list element per run:

i1 <- !nzchar(x)
unname(split(x[!i1], cumsum(i1)[!i1]))

Output:

[[1]]
[1] "abc"

[[2]]
[1] "a" "b" "c"

[[3]]
[1] "ab" "ac"

[[4]]
[1] "a" "a" "a" "a"

[[5]]
[1] "b"
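
To see why this works, it can help to look at the intermediate vectors. The sketch below wraps the same two lines in a helper; the name `split_on_blank` is only illustrative and is not part of the original answer.

```r
# Hypothetical helper wrapping the two lines above (the name is an assumption).
split_on_blank <- function(x) {
  i1 <- !nzchar(x)                  # TRUE where the element is ""
  grp <- cumsum(i1)                 # increases by 1 at every blank
  unname(split(x[!i1], grp[!i1]))   # drop the blanks, then split by group
}

x <- c("abc", "", "a", "b", "c", "", "ab", "ac", "", "a", "a", "a",
       "a", "", "b")

which(!nzchar(x))
# [1]  2  6  9 14
cumsum(!nzchar(x))
#  [1] 0 1 1 1 1 2 2 2 3 3 3 3 3 4 4

res <- split_on_blank(x)
```

Because `split` only creates groups for values that are actually present, a leading blank or several consecutive blanks should not produce empty list elements; for instance, `split_on_blank(c("", "a", "", "", "b"))` gives `list("a", "b")`.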
akrun
  • @akrun, could you please check my answer? Why do I get one element per list instead of multiple? Thank you! – TarJae Feb 15 '22 at 20:27
  • @TarJae I think you need to wrap in `list` instead of `paste` in the `summarise`, i.e. `summarise(value1 = list(value)) %>% pull(value1)`, because `paste` with `collapse` joins the values with a space and returns a single string. The `as.list` then has nothing to split, since it does not break a string into substrings. – akrun Feb 15 '22 at 20:31
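
The difference described in this comment can be checked on a single group. This is a minimal sketch in base R; the variable names `grp`, `collapsed`, and `kept` are made up for illustration.

```r
grp <- c("a", "b", "c")

# paste() with collapse joins everything into ONE string ...
collapsed <- paste(grp, collapse = " ")
length(collapsed)
# [1] 1

# ... so as.list() can only wrap that single string; it does not split it.
as.list(collapsed)
# [[1]]
# [1] "a b c"

# list() keeps the original vector intact as one list element.
kept <- list(grp)
kept
# [[1]]
# [1] "a" "b" "c"
```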

Update, following akrun's helpful hint:

library(dplyr)
library(tibble)

as_tibble(x) %>%
  mutate(value = na_if(value, '')) %>%
  group_by(id_Group = cumsum(is.na(value)) + 1) %>%
  na.omit() %>%
  summarise(value1 = list(value)) %>%
  pull(value1)

[[1]]
[1] "abc"

[[2]]
[1] "a" "b" "c"

[[3]]
[1] "ab" "ac"

[[4]]
[1] "a" "a" "a" "a"

[[5]]
[1] "b"

First answer (kept for reference; it collapses each group into a single string, so it does not give the requested output):

library(dplyr)
library(tibble)

as_tibble(x) %>%
  mutate(value = na_if(value, '')) %>% 
  group_by(id_Group =cumsum(is.na(value))+1) %>% 
  na.omit() %>% 
  summarise(value1 = paste(value, collapse = " ")) %>% 
  pull(value1) %>% 
  as.list()

[[1]]
[1] "abc"

[[2]]
[1] "a b c"

[[3]]
[1] "ab ac"

[[4]]
[1] "a a a a"

[[5]]
[1] "b"
TarJae