
I am working on 3D image segmentation with deep learning. Basically, I need to 1) pad a NumPy array, 2) process the array, 3) unpad the array.

dataArray = np.pad(dataArray, 25, mode='constant', constant_values=0) # pad
processedArray = my_process(dataArray) # process
processedArray = processedArray[25:-25, 25:-25, 25:-25, :] # unpad

Problem is, processedArray is very large (shape (464, 928, 928, 928, 10)) and I run out of memory when executing the unpadding. I imagine that the unpadding allocates new memory. Am I right? How could I proceed so that no new memory is allocated, in other words so that the indexing points to the unpadded elements without copying them?

Information that might help: above lines are executed in a function, and processedArray is returned
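For what it's worth, here is how one could check on a small stand-in array whether this kind of slicing actually copies, using np.shares_memory (the shapes here are illustrative, not the real ones):

```python
import numpy as np

# Small stand-in for the real array, to test copying behaviour
a = np.zeros((10, 10, 10, 3), dtype=np.float32)
padded = np.pad(a, ((2, 2), (2, 2), (2, 2), (0, 0)), mode='constant')
unpadded = padded[2:-2, 2:-2, 2:-2, :]  # same slicing pattern as above

# Basic slicing returns a view into the padded buffer
print(np.shares_memory(padded, unpadded))  # True
```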

Manu

2 Answers


Maybe you're running out of memory because my_process creates a copy of the array internally, which doubles your memory usage. So, just don't create extra arrays.

You can keep a single global array and apply all the operations on it without creating extra copies.

import gc
import numpy as np

def my_process():
    global processedArray
    # do all operations in place on processedArray

processedArray = np.pad(dataArray, 25, mode='constant', constant_values=0) # pad
my_process()

del dataArray # delete arrays that are no longer needed to free space
gc.collect()

processedArray = processedArray[25:-25, 25:-25, 25:-25, :] # unpad

But you'll still run out of memory if my_process calls library functions that make copies of processedArray. Try to apply every operation on the global array in place, without making any copies.
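As a sketch of what operating in place could look like, assuming my_process is built from ufunc-style steps (the operations here are illustrative, not the asker's actual processing):

```python
import numpy as np

def my_process(a):
    # ufuncs write into the existing buffer via `out`,
    # so no temporary array of the full size is allocated
    np.multiply(a, 2.0, out=a)
    np.add(a, 1.0, out=a)
    return a

a = np.ones((4, 4), dtype=np.float32)
result = my_process(a)
print(result is a)  # True: the same buffer was modified in place
```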

Zabir Al Nazi

A possible way to reduce memory usage is to use a smaller dtype, such as np.short (int16) instead of float, for your NumPy array. You can try this:

dataArray = np.pad(dataArray, 25, mode='constant', constant_values=0) # pad
processedArray = my_process(dataArray).astype(np.short) # process
processedArray = processedArray[25:-25, 25:-25, 25:-25, :] # unpad
processedArray = processedArray.astype(np.float32) #Converting to float type again
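For scale, halving the itemsize halves the footprint. Note, though, that astype allocates a new array, so the peak usage still briefly includes both the float output of my_process and the int16 copy. A small sketch of the per-dtype sizes:

```python
import numpy as np

a = np.ones((100, 100), dtype=np.float32)
print(a.nbytes)  # 40000 bytes: 4 bytes per element

b = a.astype(np.short)  # np.short is int16: 2 bytes per element
print(b.nbytes)  # 20000 bytes
```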

Alternatively, you can delete dataArray once my_process is done with it, to free memory for processedArray.

dataArray = np.pad(dataArray, 25, mode='constant', constant_values=0) # pad
processedArray = my_process(dataArray) # process
del dataArray # deleting dataArray to reclaim memory
processedArray = processedArray[25:-25, 25:-25, 25:-25, :] # unpad
Hamza Khurshid