I was unaware that creating a new list-column with dplyr::mutate() from a single-element list actually deep-copies that element to fill the tibble's length (see t3). Why is that?
If I specify the correct length explicitly (t4) or pass the list when creating the tibble (t5), the elements are shared by reference instead.
Consider the following case, where a list encloses a tibble with a large vector.
library(tidyverse)
library(pryr)
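t1 <- tibble(a = 1:4)
t2 <- tibble(b = 1:1e6)
# t3: length-1 list, recycled by mutate() to the 4 rows of t1
t3 <- mutate(t1, tl = list(t2))
# t4: list already repeated to the full length before mutate() stores it
t4 <- mutate(t1, tl = rep(list(t2), n()))
# t5: length-1 list, recycled by tibble() at construction
t5 <- tibble(a = 1:4, tl = list(t2))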
object_size(t2)
#> 4 MB
object_size(t3)
#> 16 MB
object_size(t4)
#> 4 MB
object_size(t5)
#> 4 MB
Created on 2019-02-22 by the reprex package (v0.2.1)
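Since object_size() counts shared memory only once, the sizes above already hint at which columns share their elements. A more direct check, as a rough sketch, is to count how many distinct memory addresses the list elements have. This assumes the lobstr package is installed (it is not used in the reprex above), and `count_distinct_addrs` is a hypothetical helper name introduced only for this illustration. The result may differ between dplyr versions, so no expected output is shown.

```r
library(tibble)
library(dplyr)
library(lobstr)  # assumption: extra package, not part of the original reprex

t1 <- tibble(a = 1:4)
t2 <- tibble(b = 1:1e6)
t3 <- mutate(t1, tl = list(t2))
t4 <- mutate(t1, tl = rep(list(t2), n()))

# Hypothetical helper: number of distinct memory addresses among the
# elements of a list-column. 1 means all elements point at the same
# object; a count equal to the column length means every element is
# a separate object.
count_distinct_addrs <- function(col) length(unique(obj_addrs(col)))

distinct_t3 <- count_distinct_addrs(t3$tl)
distinct_t4 <- count_distinct_addrs(t4$tl)
print(c(t3 = distinct_t3, t4 = distinct_t4))
```

A count of 4 for t3 would be consistent with the 16 MB reported above, while a count of 1 would mean the elements are shared. Note that this only compares the top-level element addresses, not the vectors stored inside each tibble.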