I am trying to debug a memory problem in my large Python application. Most of the memory sits in numpy
arrays managed by Python classes, so tools like Heapy are useless here, since they do not account for the memory
inside the numpy arrays. So I tried to track the memory usage manually with the Mac OS X (10.7.5) Activity Monitor
(or top, if you prefer). I noticed the following odd behavior in a plain Python
interpreter shell (2.7.3):
import numpy as np # 1.7.1
# Activity Monitor: 12.8 MB
a = np.zeros((1000, 1000, 17)) # a "large" array
# 142.5 MB
del a
# 12.8 MB (so far so good, the array got freed)
a = np.zeros((1000, 1000, 16)) # a "small" array
# 134.9 MB
del a
# 134.9 MB (the system didn't get back the memory)
import gc
gc.collect()
# 134.9 MB
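
To rule out me misreading Activity Monitor, the same experiment can be scripted by polling the process's resident set size, e.g. with psutil (a third-party package; this assumes a psutil version where Process.memory_info() exists, older releases call it get_memory_info()). The rss value is the same number top reports, so the readings should track the ones above:

import os
import numpy as np
import psutil  # third-party: pip install psutil

proc = psutil.Process(os.getpid())

def rss_mb():
    # Resident set size in MB, i.e. the number Activity Monitor/top report.
    return proc.memory_info().rss / 1024.0 / 1024.0

print(rss_mb())                 # baseline
a = np.zeros((1000, 1000, 17))  # the "large" array
print(rss_mb())                 # jumps by roughly 130 MB
del a
print(rss_mb())                 # drops back near the baseline
a = np.zeros((1000, 1000, 16))  # the "small" array
print(rss_mb())
del a
print(rss_mb())                 # stays high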
No matter what I do, the memory footprint of the Python session never goes below 134.9 MB again. So my question is:
Why is the memory of arrays larger than 1000x1000x17x8 bytes (about 130 MB, found empirically on my system) properly returned to the system, while the memory of smaller arrays appears to be stuck with the Python interpreter forever?
The effect does ratchet up, because in my real-world application I end up with over 2 GB of memory that I can never get back from the Python interpreter. Is it intended behavior that Python reserves more and more memory depending on its usage history? If so, then Activity Monitor is just as useless as Heapy for my case. Is there anything out there that is not useless?
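
For completeness: the closest thing I know of to manually accounting for the numpy side, independently of Heapy, is summing nbytes over all arrays the garbage collector can reach. This is only a rough sketch (views are skipped because they share their base's buffer, and whether plain arrays show up in gc.get_objects() can depend on the numpy version), but it at least measures what Heapy misses:

import gc
import numpy as np

def numpy_bytes():
    # Sum the buffer sizes of all gc-reachable arrays that own their
    # data; views (obj.base is not None) share an owner's buffer.
    return sum(obj.nbytes for obj in gc.get_objects()
               if isinstance(obj, np.ndarray) and obj.base is None)

a = np.zeros((1000, 1000, 16))
print(numpy_bytes() / 1024.0 / 1024.0)  # ~122 MB for this array alone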