I am working on a Kedro 0.17.2 project that is running into out-of-memory issues, and I'm trying to reduce its memory footprint.
I'm profiling with `mprof` from the `memory-profiler` library, and I noticed that there is always a child process and that memory seems to be duplicated in the main process after the first computation in the node that is running. Is it possible that Kedro is duplicating the dataframes in memory? And, if so, is there a way to avoid this?
Notes:
- I'm using the `SequentialRunner`
- I'm not using the `is_async` CLI option
- I'm not using either multithreading or multiprocessing in the node execution
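
One way to narrow this down is to measure a single node function in isolation, outside of Kedro. This shows whether the extra memory comes from the node's own code or from the framework around it. Below is a minimal sketch using the standard library's `tracemalloc` instead of `mprof`. The names `measure_peak`, `node_with_copy` and `node_without_copy` are hypothetical stand-ins, not code from the project, and a plain list stands in for a dataframe.

```python
import tracemalloc


def measure_peak(func, *args, **kwargs):
    """Run func and return (result, peak_bytes) as traced by tracemalloc."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


# Hypothetical stand-ins for a node: one copies its input, one does not.
def node_with_copy(rows):
    duplicated = list(rows)  # explicit copy: allocates a second list
    return len(duplicated)


def node_without_copy(rows):
    return len(rows)  # no new container is allocated


# The input is created before tracing starts, so only allocations made
# inside the node function are counted.
rows = list(range(1_000_000))

_, peak_copy = measure_peak(node_with_copy, rows)
_, peak_no_copy = measure_peak(node_without_copy, rows)

print(f"peak with copy:    {peak_copy / 1e6:.1f} MB")
print(f"peak without copy: {peak_no_copy / 1e6:.1f} MB")
```

Note that `tracemalloc` only sees allocations made through Python's memory allocators in the current process, so it complements the process-level view from `mprof` rather than replacing it. If the node shows no duplication when measured this way, the copy is more likely happening elsewhere in the run.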