I was surprised to find that when dplyr::mutate() creates a new list-column from a single-element list, it deep-copies the element once per row to fill the tibble (see t3). Why is that?

If I instead build a list of the correct length explicitly (t4), or pass the single-element list when creating the tibble (t5), the elements are shared by reference rather than copied.

Consider the following case, where a list encloses a tibble with a large vector.

library(tidyverse)
library(pryr)

t1 <- tibble(a = 1:4)
t2 <- tibble(b = 1:1e6)
t3 <- mutate(t1, tl = list(t2))
t4 <- mutate(t1, tl = rep(list(t2), n()))
t5 <- tibble(a = 1:4, tl = list(t2))

object_size(t2)
#> 4 MB
object_size(t3)
#> 16 MB
object_size(t4)
#> 4 MB
object_size(t5)
#> 4 MB

Created on 2019-02-22 by the reprex package (v0.2.1)
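
The sizes above only hint at whether the elements are shared. A more direct check is to compare the memory addresses of the list elements. The following is a minimal sketch, assuming the lobstr package is installed (it is not used in the reprex above); count_distinct_addrs is a hypothetical helper name introduced only for illustration. A count of 1 means all rows point at the same object, while a count equal to the number of rows means each row holds its own copy. The result for t3 may depend on the dplyr version, so no output is shown.

```r
library(tibble)
library(dplyr)
library(lobstr)

t1 <- tibble(a = 1:4)
t2 <- tibble(b = 1:1e6)
t3 <- mutate(t1, tl = list(t2))
t4 <- mutate(t1, tl = rep(list(t2), n()))
t5 <- tibble(a = 1:4, tl = list(t2))

# Hypothetical helper: number of distinct memory addresses
# among the elements of a list (or list-column).
count_distinct_addrs <- function(col) {
  length(unique(obj_addrs(col)))
}

# Base R control: rep() on a list repeats pointers to the same object,
# so all four elements should share one address.
control <- rep(list(t2), 4)

counts <- vapply(
  list(control = control, t3 = t3$tl, t4 = t4$tl, t5 = t5$tl),
  count_distinct_addrs,
  integer(1)
)
print(counts)
```

Comparing the t3 entry against the control shows whether mutate() copied the element when recycling it.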

Boann
mjktfw