Map Raggedly to Depth but Stop at Sentinel

Question

Motivation

I am developing an R package with an essential helper function "make()". This make() function accepts a ragged list, then maps function foo() onto the penultimate nodes ("twigs") in the list: those whose children are leaves. So far purrr::map_depth(..., .depth = -2, .ragged = TRUE) seems ideal.

There is one catch: the traversal must stop at a node that is a done_box — or some sentinel class of my own creation — regardless of its depth, even though that node is also (ie. inherits from) a list of arbitrary depth. The function foo() will then map the node conditionally, based on its class.

Challenge

Unfortunately, the inflexibility of purrr::vec_depth() induces an error when it encounters objective nodes:

Error in `.f()`:
! `x` must be a vector
Run `rlang::last_error()` to see where the error occurred.

Bandaid Solution

As such, I have tentatively written a variation .map_depth() on the source code for map_depth(), where .list_depth() replaces vec_depth() to handle objective nodes. See the Code section for the code.

I can then create a sentinel class my_sentinel, which "boxes" (ie. wraps in a list) a quosure for the expression that generates the node's value. Since this my_sentinel inherits from list, then .list_depth() will give its quosure a depth of 1 and give the sentinel itself a depth of 2.

At a .depth of -2, the .map_depth() function will thus target the my_sentinel node itself. By design, foo() will identify its class as "my_sentinel", and it will evaluate the quosure to "unbox" the value.

Drawback

Unfortunately, the environment of the quosure may change since the quosure was first captured. So when .map_depth() is finally called, it might "unbox" the wrong value: one that was wrongly updated.

As such, the behavior could prove unstable for end users!

Question

Is there a cleaner or canonical way to purrr::map_*() a function foo() to the "twigs" (ie. nodes at .depth = -2) of a list, while stopping at nodes that are sentinel objects like done_box?

Suggestion

I'm curious about purrr::map_if(), with some predicate .p that tests for both class and depth. However, I lack the expertise to confirm this as feasible, let alone canonical.

Code

For my_sentinel:

my_sentinel <- function(x) {
  x_quo <- rlang::enquo0(x)
  
  # No pun intended.
  x_box <- list(x_quo)
  
  structure(x_box,
    class = c("my_sentinel", class(x_box))
  )
}

For .map_depth():

# A variation on 'purrr::map_depth()' that accommodates objective leaves.
.map_depth <- function(.x, .depth, .f, ..., .ragged = FALSE) {
  if (!rlang::is_integerish(.depth, n = 1, finite = TRUE)) {
    abort("`.depth` must be a single number")
  }
  if (.depth < 0) {
    .depth <- .list_depth(.x) + .depth
  }
  .f <- purrr::as_mapper(.f, ...)
  .map_depth_rec(.x, .depth, .f, ..., .ragged = .ragged, .atomic = FALSE)
}


# A variation on 'purrr:::map_depth_rec()' that accommodates objective leaves.
.map_depth_rec <- function(.x, .depth, .f, ..., .ragged, .atomic) {
  if (.depth < 0) {
    rlang::abort("Invalid depth")
  }
  # TODO: Must this be addressed too (unlikely)?
  if (.atomic) {
    if (!.ragged) {
      rlang::abort("List not deep enough")
    }
    return(purrr::map(.x, .f, ...))
  }
  if (.depth == 0) {
    return(.f(.x, ...))
  }
  if (.depth == 1) {
    return(purrr::map(.x, .f, ...))
  }
  .atomic <- rlang::is_atomic(.x)
  purrr::map(.x, function(x) {
    .map_depth_rec(x, .depth - 1, .f, ..., .ragged = .ragged, 
                    .atomic = .atomic)
  })
}


# A variation on 'purrr::vec_depth()' that accommodates objective leaves.
.list_depth <- function(x) {
  if (rlang::is_null(x)) {
    0L
  }
  # TODO: Address this so a vector is treated as a leaf (or must users esc() for that?).
  else if (rlang::is_atomic(x)) {
    1L
  }
  else if (rlang::is_list(x)) {
    depths <- purrr::map_int(x, .list_depth)
    1L + max(depths, 0L)
  }
  # Accommodate objective leaves.
  else if (is.object(x)) {
    # TODO: Check if this should be 1L or (as I suspect) 0L!
    1L
  }
  else {
    rlang::abort("`x` must be a vector or an object")
  }
}

**Update**: Looks like [`purrr::map_if()`](https://purrr.tidyverse.org/reference/map_if.html) is the way to go, with a predicate `.p = ~.list_depth(.) == 2 || inherits(., "my_sentinel")` to test the class, and `.f = foo` to map it. Or perhaps a modification to `.list_depth()` that returns `2L` when the object is a `my_sentinel`, which simplifies the predicate to `.p = ~.list_depth(.) == 2`. — Greg, Aug 31 '22 at 04:11
The only issue is this: how much _performance_ do I sacrifice, by making `map_if()` check *every* node with `.p`, rather than some alternative with `map_depth()` that targets only one level? — Greg, Aug 31 '22 at 08:10

score 1 · Answer 1 · answered Sep 04 '22 at 15:22

Just mentioning rrapply() in package rrapply (an extended version of base rapply), which may already provide the functionality you are looking for.

Using the following dummy list, since no data is provided in the question:

## dummy data
l <- list(
  list("initial_node"),
  list(list("initial_node")),
  structure(list("initial_node"), class = c("list", "my_sentinal")),
  structure(list(list("initial_node")), class = c("list", "my_sentinal"))
)

In a first call, the my_sentinal class is recursively propagated to each sublist. In a second call, some function f is applied to all lists with no sublists that do not inherit from the my_sentinal class. NB: this can probably be combined into a single call with some effort, but split into two separate calls the code is likely easier to follow and understand.

## propagate my_sentinal class to deeper sublists
l1 <- rrapply(
  l, 
  classes ="list",
  condition = \(x) inherits(x, "my_sentinal"),
  f = \(x) {
    if(is.list(x[[1]])) x[] <- lapply(x, structure, class = c("list", "my_sentinal"))
    return(x)
  },
  how = "recurse"
)

## apply function to non-sentinal pen-ultimate nodes 
l2 <- rrapply(
  l1, 
  classes = "list",
  condition = \(x) !inherits(x, "my_sentinal") && !is.list(x[[1]]),
  f = \(x) lapply(x, \(xi) "processed_node")
)

str(l2)
#> List of 4
#>  $ :List of 1
#>   ..$ : chr "processed_node"
#>  $ :List of 1
#>   ..$ :List of 1
#>   .. ..$ : chr "processed_node"
#>  $ :List of 1
#>   ..$ : chr "initial_node"
#>   ..- attr(*, "class")= chr [1:2] "list" "my_sentinal"
#>  $ :List of 1
#>   ..$ :List of 1
#>   .. ..$ : chr "initial_node"
#>   .. ..- attr(*, "class")= chr [1:2] "list" "my_sentinal"
#>   ..- attr(*, "class")= chr [1:2] "list" "my_sentinal"

Disclaimer: I am also the maintainer of the rrapply-package.