6

Introduction

I have two nested lists with the same structure that I'd like to combine (in the c() sense).

There might already exist a concept for what I mean by same structure in graph theory, or in computer science, for this relationship but I am not aware.

So here is my attempt to clarify what I mean by same structure:

  • Elements of a list at some level are either all named or none is named;
  • When we have named elements there are never duplicated names at that level;
  • Parent-child node relationships are the same for the two lists, when the nodes are named elements themselves.

So I am wondering if there is already a solution for this problem which I feel might be rather general and common...(?) Any solution involving:

  • Using base rapply;
  • Tidyverse solution with some combination of purrr functions;
  • Functions from the rlist package

would be great!

Example

foo and bar are two example lists with same structure.

wonderful is the desired list that results from combining foo and bar (done manually).

I hope it is clear enough!

# Input lists: foo and bar
foo <- list(a = list(a1 = 1:3, a2 = rep('a', 3)), b = list(b1 = list(b11 = c(4,5,6), b12 = rep('b', 3)), b2 = list(b21 = list(b31 = c(0, 1, 2)))), c = list(list(c21 = 1:3), list(c21 = 4:6), list(c21 = 7:9)))
bar <- list(a = list(a1 = 1:3, a2 = rep('z', 3)), b = list(b1 = list(b11 = c(-1,2,5), b12 = rep('b', 3)), b2 = list(b21 = list(b31 = -c(1,2,3)))), c = list(list(c21 = 3:1), list(c21 = 5:3)))

# wonderful: desired list (result from combining foo and bar)
wonderful <- list(
  a = list(
    a1 = c(foo$a$a1, bar$a$a1), 
    a2 = c(foo$a$a2, bar$a$a2)
    ),
  b = list(
    b1 = list(
      b11 = c(foo$b$b1$b11, bar$b$b1$b11),
      b12 = c(foo$b$b1$b12, bar$b$b1$b12)
      ),
    b2 = list(
      b21 = list(
        b31 = c(foo$b$b2$b21$b31, bar$b$b2$b21$b31)
        )
      )
    ),
  c = c(foo$c, bar$c)
)

str(foo)
#> List of 3
#>  $ a:List of 2
#>   ..$ a1: int [1:3] 1 2 3
#>   ..$ a2: chr [1:3] "a" "a" "a"
#>  $ b:List of 2
#>   ..$ b1:List of 2
#>   .. ..$ b11: num [1:3] 4 5 6
#>   .. ..$ b12: chr [1:3] "b" "b" "b"
#>   ..$ b2:List of 1
#>   .. ..$ b21:List of 1
#>   .. .. ..$ b31: num [1:3] 0 1 2
#>  $ c:List of 3
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 1 2 3
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 4 5 6
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 7 8 9

str(bar)
#> List of 3
#>  $ a:List of 2
#>   ..$ a1: int [1:3] 1 2 3
#>   ..$ a2: chr [1:3] "z" "z" "z"
#>  $ b:List of 2
#>   ..$ b1:List of 2
#>   .. ..$ b11: num [1:3] -1 2 5
#>   .. ..$ b12: chr [1:3] "b" "b" "b"
#>   ..$ b2:List of 1
#>   .. ..$ b21:List of 1
#>   .. .. ..$ b31: num [1:3] -1 -2 -3
#>  $ c:List of 2
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 3 2 1
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 5 4 3

str(wonderful)
#> List of 3
#>  $ a:List of 2
#>   ..$ a1: int [1:6] 1 2 3 1 2 3
#>   ..$ a2: chr [1:6] "a" "a" "a" "z" ...
#>  $ b:List of 2
#>   ..$ b1:List of 2
#>   .. ..$ b11: num [1:6] 4 5 6 -1 2 5
#>   .. ..$ b12: chr [1:6] "b" "b" "b" "b" ...
#>   ..$ b2:List of 1
#>   .. ..$ b21:List of 1
#>   .. .. ..$ b31: num [1:6] 0 1 2 -1 -2 -3
#>  $ c:List of 5
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 1 2 3
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 4 5 6
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 7 8 9
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 3 2 1
#>   ..$ :List of 1
#>   .. ..$ c21: int [1:3] 5 4 3
zx8754
  • 52,746
  • 12
  • 114
  • 209
Ramiro Magno
  • 3,085
  • 15
  • 30

3 Answers3

3

Here's a go at it:

library(purrr)

rec_map <- function(fizz, buzz) {
  if(is.atomic(fizz) | is.null(names(fizz))){
    c(fizz, buzz)
  } else {
    imap(fizz,
         ~rec_map(fizz[[.y]], buzz[[.y]]))
  }
}

temp <- rec_map(foo, bar)

all.equal(temp, wonderful)
#> [1] TRUE

I'm by no means a computer scientist, so take the solution with a grain of salt. I am not certain about the behavior desired when there are no names for one level, but then one level down there are names (e.g., foo$c). So I just combined the results (c()) if we encountered a level without names.

edit to take a number of lists:

prec_map <- function(...){
  dots <- list(...)
  first_el = dots[[1]]
  if(is.atomic(first_el) | is.null(names(first_el))){
    do.call(c, dots)
  } else {
    imap(first_el,
         function(el, nme){
           one_level_down <- map(dots, nme)
           do.call(prec_map, one_level_down)
         })
  }
}

temp <- prec_map(foo, bar)

all.equal(temp, wonderful)
[1] TRUE

I haven't tested it out thoroughly, but light testing looks like it gets the job done.

zack
  • 5,205
  • 1
  • 19
  • 25
  • Wonderful! How you think I could change `rec_map` into `prec_map` as in `map` and `pmap`, i.e. a parallel version of `rec_map`? – Ramiro Magno Feb 20 '19 at 16:40
  • 1
    @rmagno I intend to give this a look later today when I've some more time, sorry! – zack Feb 20 '19 at 17:30
  • 2
    `prec_map` could be expressed as `function(.l) reduce(.l, rec_map)` or `partial(reduce, .f = rec_map)` or `. %>% reduce(rec_map)`, used as`prec_map(list(foo, bar))` – Aurèle Feb 21 '19 at 14:29
  • 1
    i like @Aurèle's suggestion in using reduce to handle more lists. I've taken a little more of a top down approach above (rather that side-to-side). – zack Feb 21 '19 at 16:05
  • 1
    @zack: is the passed `buzz` in `imap(fizz, ~rec_map(fizz[[.y]], buzz[[.y]]), buzz)` doing anything? – Ramiro Magno Feb 22 '19 at 15:48
  • 1
    it doesn't look like it, i thought it was taking buzz from the dots argument to `imap` but it was pulling it from the arguments passed to the `rec_map` function. I've edited it out now. – zack Feb 22 '19 at 16:10
  • 1
    @zack: Nice, even simpler then. The `|` in `is.atomic(fizz) | is.null(names(fizz))` should be `||`, right? (though it won't make a difference in this case). – Ramiro Magno Feb 22 '19 at 17:10
  • either works in this case. They'll amount to the same because `is.null` and `is.atomic` return length 1 logical vectors. I suppose if you want to ensure you haven't accidentally passed a > length 1 vector to the `if` statement, you'd use `|` (it will show a warning in this case). If you'd rather just not worry about it, `||` would work. – zack Feb 22 '19 at 18:12
2

list_merge does something close to the requirements:

library(purrr)

res <- list_merge(foo, !!! bar)

all.equal(wonderful, list_merge(foo, !!! bar))
# [1] "Component “c”: Length mismatch: comparison on first 3 components"       
# [2] "Component “c”: Component 1: Component 1: Numeric: lengths (3, 6) differ"
# [3] "Component “c”: Component 2: Component 1: Numeric: lengths (3, 6) differ"

The only difference seems to be for elements that are unnamed lists (e.g. foo$c and bar$c), the elements of which are concatenated by position (foo$c[[1]] with bar$c[[1]], foo$c[[2]] with bar$c[[2]], and foo$c[[3]] left alone since there is no bar$c[[3]]... rather than c(foo$c, bar$c)).


And a parallel version could be:

plist_merge <- function(.l) {
  reduce(.l, ~ list_merge(.x, !!! .y))
}

all.equal(
  plist_merge(list(foo, bar)),
  list_merge(foo, !!! bar)
)
# [1] TRUE
Aurèle
  • 12,545
  • 1
  • 31
  • 49
0

After thinking a bit more about this problem in general... and after some inspiration from dplyr's joins, here's three joins for lists for my own future reference:

  • lst_left_join
  • lst_right_join
  • lst_inner_join
library(purrr)

#
# Inspired by dplyr's joins: https://r4ds.had.co.nz/relational-data.html#inner-join
# Here's some (more or less equivalent) list joins
# 
lst_left_join <- function(lst_x, lst_y) {
  if(is.atomic(lst_x) || is.null(names(lst_x))){
    c(lst_x, lst_y)
  } else {
    imap(lst_x, ~lst_left_join(lst_x[[.y]], lst_y[[.y]]))
  }
}

plst_left_join <- function(.l) reduce(.l, lst_left_join)

lst_right_join <- function(lst_x, lst_y) {
  if(is.atomic(lst_y) || is.null(names(lst_y))){
    c(lst_x, lst_y)
  } else {
    imap(lst_y, ~lst_right_join(lst_x[[.y]], lst_y[[.y]]))
  }
}

plst_right_join <- function(.l) reduce(.l, lst_right_join)

lst_inner_join <- function(lst_x, lst_y) {
  if(is.atomic(lst_y) || is.null(names(lst_y))){
    c(lst_x, lst_y)
  } else {
    common_names <- intersect(names(lst_x), names(lst_y))
    names(common_names) <- common_names # so that map preserves names
    map(common_names, ~lst_inner_join(lst_x[[.x]], lst_y[[.x]]))
  }
}
plst_inner_join <- function(.l) reduce(.l, lst_inner_join)

# Input lists: foo and bar.
foo <- list(x1 = 1:2, x3 = 30+5:6)
bar <- list(x1 = 10+1:2, x2 = 10+3:4)

# Output lists: r1, r2 and r3.
r1 <- lst_left_join(foo, bar)
r2 <- lst_right_join(foo, bar)
r3 <- lst_inner_join(foo, bar)

str(r1)
#> List of 2
#>  $ x1: num [1:4] 1 2 11 12
#>  $ x3: num [1:2] 35 36
str(r2)
#> List of 2
#>  $ x1: num [1:4] 1 2 11 12
#>  $ x2: num [1:2] 13 14
str(r3)
#> List of 1
#>  $ x1: num [1:4] 1 2 11 12
Ramiro Magno
  • 3,085
  • 15
  • 30