I have a dataset of about 16 GB. To reduce RAM usage I converted it to a disk.frame.
After a few manipulations (just mutating 10 variables) I tried to move the new table back into RAM using the collect
function.
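Roughly, the steps look like this. This is a minimal stand-in, not my real code: the column names, the coercion with as.numeric, and the tiny synthetic table are placeholders for my actual ~16 GB data and 10 mutate steps.

```r
library(disk.frame)
library(dplyr)

setup_disk.frame(workers = 2)           # parallel backend via {future}

# small stand-in for the real ~16 GB table
raw <- data.frame(x1 = c("1", "2", "oops"),
                  x2 = c("3.5", "4.5", "5.5"))
df  <- as.disk.frame(raw, outdir = tempfile(), nchunks = 2)

# in reality ~10 mutated variables; coercing character columns like this
# is presumably where the "NAs introduced by coercion" warnings come from
df2 <- df %>%
  mutate(x1_num = as.numeric(x1),
         x2_num = as.numeric(x2))

result <- collect(df2)                  # this is the step that fails on the real data
```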
The error message is the following:
Error: MultisessionFuture (future_lapply-1) failed to call grmall() on cluster RichSOCKnode #3 (PID 10112 on localhost ‘localhost’). The reason reported was ‘error writing to connection’. Post-mortem diagnostic: Failed to determined whether a process with this PID exists or not, i.e. cannot infer whether localhost worker is alive or not. The total size of the 11 globals exported is 313.11 KiB. The three largest globals are ‘get_chunk.disk.frame’ (240.04 KiB of class ‘function’), ‘...future.FUN’ (58.30 KiB of class ‘function’) and ‘%>%’ (7.47 KiB of class ‘function’)
In addition: Warning messages:
1: In doTryCatch(return(expr), name, parentenv, handler) :
NAs introduced by coercion
2: In doTryCatch(return(expr), name, parentenv, handler) :
NAs introduced by coercion
3: In doTryCatch(return(expr), name, parentenv, handler) :
NAs introduced by coercion
4: In doTryCatch(return(expr), name, parentenv, handler) :
NAs introduced by coercion
5: In doTryCatch(return(expr), name, parentenv, handler) :
NAs introduced by coercion
6: In doTryCatch(return(expr), name, parentenv, handler) :
NAs introduced by coercion
7: In doTryCatch(return(expr), name, parentenv, handler) :
NAs introduced by coercion
Is there a limitation on the number of data manipulations with disk.frame? Is it possible to update a disk.frame after data manipulation without moving it into RAM?
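To clarify the second question, what I would like is something along these lines: writing the mutated result straight back to disk instead of collecting it. This is only a sketch of the intent (using write_disk.frame, which I understand materializes a lazy pipeline to a new disk.frame on disk); the paths and the mutate step are placeholders.

```r
library(disk.frame)
library(dplyr)

setup_disk.frame(workers = 2)

df <- as.disk.frame(data.frame(x = c("1", "2", "3")),
                    outdir = tempfile(), nchunks = 2)

# write the mutated table straight to a new disk.frame,
# without ever collect()-ing the whole thing into RAM
out <- df %>%
  mutate(x_num = as.numeric(x)) %>%
  write_disk.frame(outdir = tempfile(), overwrite = TRUE)
```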