
I'm wondering how efficiently reticulate handles memory for Python objects.

Suppose I have a 5 GB pandas DataFrame called data_pandas in a reticulate Python session, and I'd like to analyze it in R.

When I access the object from R with py$data_pandas, does reticulate internally copy the DataFrame into an R data.frame (i.e. create another 5 GB data.frame in R)?

And what about the reverse, accessing an R data.frame from Python?

Matthew Son

1 Answer


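I'm no expert, but according to the reticulate vignette on arrays, Python arrays are always copied when they are converted to R, so the data exists at least twice in memory (the Python original plus the R copy). The reverse direction is cheaper, because R arrays are shared with Python unless a copy is needed: "R arrays are only copied to Python when they need to be, otherwise data are shared. Python arrays are always copied when moved into R arrays. This can sometimes lead to three copies of any one array in memory at any one time (at the moment this was written). Future versions will reduce that copy overhead to two." (From https://rstudio.github.io/reticulate/articles/arrays.html) The quote is about arrays, so I'm assuming the same holds for a pandas DataFrame.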
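To make the shared-versus-copied distinction concrete, here is a minimal pure-Python sketch. It does not use reticulate at all; it is only an analogy using the standard library's `array` and `memoryview`, and `peak_memory_gb` is a hypothetical helper I made up for the back-of-the-envelope estimate. The 2x and 3x multipliers are taken from the vignette quote, not measured.

```python
from array import array

# Stand-in for one numeric column; in the question this would be ~5 GB.
original = array("d", [1.0, 2.0, 3.0])

# A memoryview exposes the same underlying buffer: no data is duplicated.
shared = memoryview(original)

# Building a new array from the data allocates a second, independent buffer.
copied = array("d", original)

# Mutate the original and see which of the two notices.
original[0] = 99.0
shared_sees_change = shared[0] == 99.0  # shared view reads the same memory
copy_sees_change = copied[0] == 99.0    # the copy still holds the old value


def peak_memory_gb(object_gb, copies):
    """Hypothetical helper: rough peak memory if `copies` full copies coexist."""
    return object_gb * copies


print("shared view sees the change:", shared_sees_change)
print("copy sees the change:", copy_sees_change)
print("estimated peak with 2 copies (GB):", peak_memory_gb(5, 2))
print("estimated peak with 3 copies (GB):", peak_memory_gb(5, 3))
```

If the vignette's description applies to your DataFrame, the "copied" case is what happens going from Python to R, so a 5 GB object would need roughly 10 to 15 GB at peak.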

Ethan Bass