I am trying to save every connected component of a large 3D array into its own array, and then write each of these arrays to a separate ".npy" file.
For the connected components I use cc3d.connected_components() (https://github.com/seung-lab/connected-components-3d/). The output is a 3D array containing integers from 0 to approx. 1500, one for every component, so I'll need to save ~1500 arrays. (Since I can't give you the original data, I'll use an array of random integers with the same shape as the output of the connected-components analysis.)
My approach was to define a function bound_box(i), which extracts the bounding box around a single connected component and saves it using np.save(), and then to call this function in a loop over all connected components in the primary 3D array.
import numpy as np

def bound_box(i):
    # Indices of all voxels that belong to component i
    component = np.nonzero(arr == i)
    # Extent of the component along each axis
    z = slice(component[0].min(), component[0].max() + 1)
    x = slice(component[1].min(), component[1].max() + 1)
    y = slice(component[2].min(), component[2].max() + 1)
    np.save('path.../arr' + str(i), arr[z, x, y])

arr = np.random.randint(0, 1500, (721, 1285, 1285))
for i in np.unique(arr):
    bound_box(i)
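For anyone who wants to try this without the full-size data, here is a scaled-down sketch of the same approach. The small array shape, the label range 0 to 4, and the temporary output directory are assumptions for illustration only. The function also takes the array and output directory as parameters instead of relying on a global, which is not how my original code is written.

```python
import os
import tempfile

import numpy as np


def bound_box(arr, label, out_dir):
    """Save the bounding box around all voxels equal to `label`."""
    coords = np.nonzero(arr == label)
    # One slice per axis, from the smallest to the largest index (inclusive)
    box = tuple(slice(c.min(), c.max() + 1) for c in coords)
    path = os.path.join(out_dir, f"arr{label}.npy")
    np.save(path, arr[box])
    return path


# Hypothetical small stand-in for the labelled volume
rng = np.random.default_rng(0)
small = rng.integers(0, 5, size=(8, 10, 10))

out_dir = tempfile.mkdtemp()
paths = [bound_box(small, int(label), out_dir) for label in np.unique(small)]
```

On an array this small the loop finishes immediately, so it only shows the logic, not the memory behaviour.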
I tested the bound_box() function on single components, which worked perfectly well. The arrays were about 500-700 KB each. With the random data the files are bigger (up to 2 GB), which is expected since there aren't any connected components anymore. But my function still seems to work as expected.
But when I run the loop, more and more memory is allocated to Python, and when I eventually run out of memory the program crashes without writing any ".npy" files. So I'm quite sure the problem is the for loop, but I wasn't able to figure out a solution by myself. I'd really appreciate it if someone could help me out and tell me what I'm doing wrong here!
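To put rough numbers on the memory involved, here is a back-of-the-envelope estimate for an array of this shape. It assumes 8-byte integers for the label array (the default of np.random.randint is platform-dependent, and the real cc3d output may use a smaller dtype), so treat the figures as an upper-end sketch rather than a measurement.

```python
import numpy as np

shape = (721, 1285, 1285)
n_voxels = int(np.prod(shape, dtype=np.int64))

# Assumption: labels stored as 8-byte integers
label_bytes = n_voxels * np.dtype(np.int64).itemsize
# Temporary boolean array created by `arr == i` on every call
mask_bytes = n_voxels * np.dtype(np.bool_).itemsize

print(f"voxels:       {n_voxels:,}")
print(f"label array:  {label_bytes / 1e9:.2f} GB")
print(f"boolean mask: {mask_bytes / 1e9:.2f} GB")
```

So each call to bound_box() builds a full-size boolean mask on top of the label array itself, plus the index arrays returned by np.nonzero().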
What I've already tried:
- using del to delete the variables used in my function, in the hope of freeing some memory. This had no noticeable effect.
def bound_box(i):
    component = np.nonzero(arr == i)
    z = slice(component[0].min(), component[0].max() + 1)
    x = slice(component[1].min(), component[1].max() + 1)
    y = slice(component[2].min(), component[2].max() + 1)
    np.save('path.../arr' + str(i), arr[z, x, y])
    # Explicitly drop the local references
    del x
    del y
    del z
    del component