
I have data in a nested list structure in R, and I'd like to use a lookup table to change names wherever they occur in the structure. Example:

# build up an example
x <- as.list(c("a" = NA))
x[[1]] <- vector("list", 4)
names(x[[1]]) <- c("b","c","d","e")
x$a$b <- vector("list", 2)
names(x$a$b) <- c("d","f")
x$a$c <- 3
x$a$d <- 27
x$a$e <- "d"
x$a$b$d <- "data"
x$a$b$f <- "more data"

# make a lookup table for names I want to change from; to
lkp <- data.frame(matrix(data = c("a","z","b","bee","d","dee"), 
                         ncol = 2, 
                         byrow = TRUE), stringsAsFactors = FALSE)

names(lkp) <- c("from","to")

Output from the above

> x
$a
$a$b
$a$b$d
[1] "data"

$a$b$f
[1] "more data"


$a$c
[1] 3

$a$d
[1] 27

$a$e
[1] "d"


> lkp
  from  to
1    a   z
2    b bee
3    d dee

Here is what I came up with to do this for only the first level:

> for(i in 1:nrow(lkp)){
+   names(x)[names(x) == lkp$from[[i]]] <- lkp$to[[i]]
+ }
> x
$z
$z$b
$z$b$d
[1] "data"

$z$b$f
[1] "more data"


$z$c
[1] 3

$z$d
[1] 27

$z$e
[1] "d"

So that works fine but uses a loop and only gets at the first level. I've tried various versions of the *apply world but have not yet been able to get something useful.

Thanks in advance for any thoughts

EDIT: Interestingly, rapply fails miserably (or I fail miserably in my attempt!) when trying to access and modify names. Here's an example that just tries to set every name to the same value:

> namef <- function(x) names(x) <- "z"
> rapply(x, namef, how = "list")
$a
$a$b
$a$b$d
[1] "z"

$a$b$f
[1] "z"


$a$c
[1] "z"

$a$d
[1] "z"

$a$e
[1] "z"
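
One possible reading of this output, offered as a sketch rather than a definitive diagnosis: `rapply` calls the function on the non-list leaves only and stores whatever the function returns in place of each leaf, and the value of an assignment such as `names(x) <- "z"` is its right-hand side. The small list `x_demo` and the helper `namef2` below are made up for illustration.

```r
# The original function: its return value is the right-hand side of the assignment.
namef <- function(x) names(x) <- "z"
ret <- namef(27)
ret                    # "z"

# A small stand-in list (hypothetical, not the full example above).
x_demo <- list(a = list(b = list(d = "data", f = "more data"), c = 3))

# rapply() replaces each leaf with the function's return value;
# the names of the lists themselves are never passed to the function.
res <- rapply(x_demo, namef, how = "list")
names(res)             # "a"
res$a$c                # "z"

# Returning the modified object names the leaf vector, not the list element.
namef2 <- function(x) { names(x) <- "z"; x }
res2 <- rapply(x_demo, namef2, how = "list")
names(res2$a$c)        # "z"
names(res2$a)          # "b" "c"
```

Under this reading, the names of list elements are out of reach for a function passed to base `rapply`, so some form of explicit recursion over the lists is needed.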
Tim
  • May I ask what you need in the end? Do you really need such a nested list? – Roman Jul 24 '20 at 14:01
  • @Roman, thanks for the question. Yes, I'm exporting to XML, so I need to maintain the nested structure. Big picture: I'm changing long names to short names that will match XML tags. I *could* just build the nested lists with the short names initially, but having gone this far with the long names, I was hoping there was a straightforward way of doing this. – Tim Jul 24 '20 at 14:07

2 Answers


I used a named character vector for the lookup instead of your data.frame, but it is easy to change if you really want a data.frame.

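library(purrr)  # for map() and %>%

# Named character vector: names are the old names, values are the new ones
lkp2 <- lkp$to
names(lkp2) <- lkp$from

rename <- function(nested_list) {
    # Rename the elements at this level that appear in the lookup
    found <- names(nested_list) %in% names(lkp2)
    names(nested_list)[found] <- lkp2[names(nested_list)[found]]
    # Recurse into sub-lists; return everything else unchanged
    nested_list %>% map(~{
        if (is.list(.x)) {
            rename(.x)
        } else {
            .x
        }
    })
}
rename(x)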
# $z
# $z$bee
# $z$bee$dee
# [1] "data"
#
# $z$bee$f
# [1] "more data"
#
#
# $z$c
# [1] 3
#
# $z$dee
# [1] 27
#
# $z$e
# [1] "d"

I am not sure this is the best way to do it, but it seems to do the job, and if you're only working with small lists (like XML documents) then there is no need to worry much about performance.

You might want to give the function a more descriptive name.
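
For anyone who prefers to avoid extra packages, the recursive approach above can be sketched in base R with `lapply`. This is only a sketch under a few assumptions: the function name `rename_nested` is made up, the lookup is passed in as an argument instead of being read from the global environment, and the example list is rebuilt inline so the block runs on its own.

```r
# Lookup as a named character vector: names are the old names, values the new ones.
lkp2 <- c(a = "z", b = "bee", d = "dee")

# Hypothetical base-R variant of the recursive rename.
rename_nested <- function(nested_list, lookup) {
  nms <- names(nested_list)
  if (!is.null(nms)) {
    # Rename only the elements at this level that appear in the lookup.
    found <- nms %in% names(lookup)
    nms[found] <- unname(lookup[nms[found]])
    names(nested_list) <- nms
  }
  # Recurse into sub-lists; return non-list elements unchanged.
  lapply(nested_list, function(el) {
    if (is.list(el)) rename_nested(el, lookup) else el
  })
}

# Same structure as the example list in the question.
x <- list(a = list(b = list(d = "data", f = "more data"),
                   c = 3, d = 27, e = "d"))

out <- rename_nested(x, lkp2)
names(out)        # "z"
names(out$z)      # "bee" "c" "dee" "e"
names(out$z$bee)  # "dee" "f"
```

Note that only names are changed: the value `"d"` stored under `e` is left alone, because the lookup is never applied to the contents.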

Vongo
  • Excellent! Thank you, I think this does the trick. In the event that others might want to try this, `map` needs `library(purrr)` and it looks like purrr takes care of `%>%` too (I didn't have to load magrittr after loading purrr). – Tim Jul 24 '20 at 14:54
  • If you need to stick to vanilla R, the same result can be obtained with `lapply`. I got used to using the good parts of the `tidyverse`, and `purrr` is the best of those. Sorry, I should have mentioned that I was using it. – Vongo Jul 24 '20 at 14:59

Using an external package you can also do this with rrapply in the rrapply-package (extension of base rapply):

library(rrapply)  ## v1.2.1

rrapply(list(x), 
        classes = "list", 
        f = function(x) { 
          newnames <- lkp$to[match(names(x), lkp$from)]
          names(x)[!is.na(newnames)] <- newnames[!is.na(newnames)]
          return(x)
        },
        how = "recurse"
)[[1]]
#> $z
#> $z$bee
#> $z$bee$dee
#> [1] "data"
#> 
#> $z$bee$f
#> [1] "more data"
#> 
#> 
#> $z$c
#> [1] 3
#> 
#> $z$dee
#> [1] 27
#> 
#> $z$e
#> [1] "d"

Here, the f function achieves essentially the same as OP's for-loop. how = "recurse" tells the function to continue recursion after the application of f.

Note that the input is wrapped as list(x) so that the f function also modifies the name(s) of the list itself.
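
To see what the `match`-based renaming inside `f` does at a single level, here is a minimal sketch on a plain character vector of names. The vector `nms` is made up for illustration, and the lookup table is rebuilt inline so the block runs on its own.

```r
# Lookup table with the same contents as in the question.
lkp <- data.frame(from = c("a", "b", "d"),
                  to   = c("z", "bee", "dee"),
                  stringsAsFactors = FALSE)

# Names as they would appear at one level of the nested list.
nms <- c("b", "c", "d", "e")

# match() returns the row of each name in lkp$from, or NA when there is none.
newnames <- lkp$to[match(nms, lkp$from)]
newnames   # "bee" NA "dee" NA

# Overwrite only the names that were found in the lookup.
nms[!is.na(newnames)] <- newnames[!is.na(newnames)]
nms        # "bee" "c" "dee" "e"
```

Compared with the for-loop, this handles all lookup rows in one vectorized step and leaves unmatched names untouched.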


Update: rrapply v1.2.5 adds a dedicated option how = "names" for replacing names in a nested list, which is a bit less convoluted:

rrapply(
  x, 
  f = function(x, .xname) {
    newname <- lkp$to[match(.xname, lkp$from)]
    return(ifelse(is.na(newname), .xname, newname))
  },
  how = "names"
)
#> $z
#> $z$bee
#> $z$bee$dee
#> [1] "data"
#> 
#> $z$bee$f
#> [1] "more data"
#> 
#> 
#> $z$c
#> [1] 3
#> 
#> $z$dee
#> [1] 27
#> 
#> $z$e
#> [1] "d"
Joris C.
  • Very interesting, thanks for adding this solution. I didn't know about `rrapply`. I also like the use of `match`. – Tim Jul 24 '20 at 17:00