I need help with Queue memory usage.
1) I chose Queue as my data structure because one thread feeds data into the queue and another thread takes the data out.
2) The two threads are designed to run for days.
3) I don't want to limit the queue size. The queue could get really long, say ~10k items occupying ~10GB. This is fine.
4) The problem comes after I shrink the queue with get() down to only 20 items, which occupy only ~100MB in memory. I print the size, and I'm sure there are only 20 items.
5) But at the system level, the whole process still occupies ~10GB.
I tried calling
gc.collect()
myself, but the memory usage doesn't change. So my wild guess is that the items returned by get() are destroyed, but because the thread is always running, Python never decreases the capacity of the queue.
My question is: is there any way to free the memory that the queue no longer uses? I can't find any API for that.
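One way to check the guess about the queue's capacity is to look at the container inside the queue. This is only a sketch, and it relies on an implementation detail: in CPython, Queue keeps its items in a collections.deque exposed as the q.queue attribute. The function name deque_sizes and the item count are made up for illustration, and sys.getsizeof only reports the deque's own bookkeeping on CPython versions where deque implements __sizeof__; it says nothing about what the process has handed back to the OS.

```python
from __future__ import print_function
import collections
import sys

try:
    import Queue as queue_module  # Python 2
except ImportError:
    import queue as queue_module  # Python 3


def deque_sizes(n_items=100000):
    """Fill a queue, drain it, and report the inner container's size.

    Hypothetical helper: q.queue is an implementation detail of CPython.
    """
    q = queue_module.Queue()
    for _ in range(n_items):
        q.put_nowait(1)
    full = sys.getsizeof(q.queue)      # size while holding n_items
    while not q.empty():
        q.get()
    drained = sys.getsizeof(q.queue)   # size after draining everything
    return type(q.queue), full, drained


kind, full, drained = deque_sizes()
print('container:', kind)
print('bytes when full:', full, 'bytes when drained:', drained)
```

If the drained number is much smaller than the full one, the container itself is giving its blocks back, and whatever the process still holds is more likely kept by the memory allocator underneath than by the queue.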
Update 1
Ubuntu 16.04, Python 2.7.12. I did some experiments today. My observation is that even when the queue is empty, the process still occupies about 84MB of system memory. Here is some code to reproduce my result.
First attempt: del
import Queue
q = Queue.Queue()
length = 10000000
buffer_size = 1000
index = 0
while index < length:
    q.put_nowait(1)
    index += 1
key = raw_input('finish insert, press key to pop')
while q.qsize() > buffer_size:
    a = q.get()
    del a
print 'after pop, q size = ', q.qsize()
raw_input('let me del the q')
del q
key = raw_input('finish delete')
Second attempt: clear()
import Queue
q = Queue.Queue()
length = 10000000
buffer_size = 1000
index = 0
while index < length:
    q.put_nowait(1)
    index += 1
key = raw_input('finish insert, press key to pop')
while q.qsize() > buffer_size:
    a = q.get()
    del a
print 'after pop, q size = ', q.qsize()
raw_input('let me del the q')
with q.mutex:
    q.queue.clear()
print 'q size = ', q.qsize()
key = raw_input('finish delete')
Third attempt: Queue()
import Queue
q = Queue.Queue()
length = 10000000
buffer_size = 1000
index = 0
while index < length:
    q.put_nowait(1)
    index += 1
key = raw_input('finish insert, press key to pop')
while q.qsize() > buffer_size:
    a = q.get()
    del a
print 'after pop, q size = ', q.qsize()
raw_input('let me del the q')
q = Queue.Queue()
print 'q size = ', q.qsize()
key = raw_input('finish delete')
Fourth attempt: gc.collect()
import Queue
import gc
q = Queue.Queue()
length = 10000000
buffer_size = 1000
index = 0
while index < length:
    q.put_nowait(1)
    index += 1
key = raw_input('finish insert, press key to pop')
while q.qsize() > buffer_size:
    a = q.get()
    del a
print 'after pop, q size = ', q.qsize()
raw_input('let me del the q')
#del q
#with q.mutex:
# q.queue.clear()
q = Queue.Queue()
print 'q size = ', q.qsize()
raw_input('let me gc.collect')
gc.collect()
raw_input('how about now?')
None of these four approaches releases the memory held by the queue. Can anyone tell me what I'm doing wrong? Many thanks!
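To put numbers on these experiments instead of watching a system monitor, the process can report its own resident memory at each pause. This is a minimal sketch that assumes Linux (as on the Ubuntu machine above), where /proc/self/status contains a VmRSS line in kB. The helper name rss_mb is made up for illustration, and it returns None where that file is not available.

```python
from __future__ import print_function


def rss_mb(status_path='/proc/self/status'):
    """Return this process's resident set size in MB, or None if unknown.

    Hypothetical helper; assumes the Linux /proc status format.
    """
    try:
        with open(status_path) as status:
            for line in status:
                # The line looks like: "VmRSS:      123456 kB"
                if line.startswith('VmRSS:'):
                    return int(line.split()[1]) / 1024.0
    except (IOError, OSError):
        pass
    return None


print('RSS now: %s MB' % rss_mb())
```

Calling rss_mb() right after the insert loop, after the pop loop, and after each cleanup step would give three comparable readings per attempt.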
Some thoughts
It seems that the Python Queue keeps the largest memory capacity it reached during its lifetime and tries to reuse that memory instead of calling malloc again. Compare this with a dynamic array, which is what I had in mind with the C++ STL vector: double the capacity when size == capacity, and halve the capacity when size / capacity == 0.25. (As far as I know, std::vector itself only grows automatically; shrinking needs an explicit shrink_to_fit() request.) I expected a dynamic data structure to have this feature. Is there any way I can do that? Or is the Python queue designed this way?
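The grow-and-shrink policy described above can be sketched as a small class. This is an illustration of the policy only, not how Queue or any built-in container is implemented. The class name ShrinkingArray and its methods are hypothetical, and in Python the "capacity" here is just the length of a backing list, so it shows when a resize would happen rather than controlling what the process returns to the OS.

```python
from __future__ import print_function


class ShrinkingArray(object):
    """Hypothetical dynamic array: double when full, halve at 1/4 full."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._slots = [None] * self._capacity

    def __len__(self):
        return self._size

    @property
    def capacity(self):
        return self._capacity

    def _resize(self, new_capacity):
        # Copy the live items into a fresh backing list of the new size.
        new_slots = [None] * new_capacity
        new_slots[:self._size] = self._slots[:self._size]
        self._slots = new_slots
        self._capacity = new_capacity

    def append(self, item):
        if self._size == self._capacity:
            self._resize(2 * self._capacity)      # grow: size == capacity
        self._slots[self._size] = item
        self._size += 1

    def pop(self):
        if self._size == 0:
            raise IndexError('pop from empty ShrinkingArray')
        self._size -= 1
        item = self._slots[self._size]
        self._slots[self._size] = None            # drop the reference
        if self._capacity > 1 and self._size * 4 <= self._capacity:
            self._resize(self._capacity // 2)     # shrink: size/capacity <= 0.25
        return item


arr = ShrinkingArray()
for i in range(1000):
    arr.append(i)
grown = arr.capacity            # 1024 after 1000 appends
while len(arr) > 20:
    arr.pop()
shrunk = arr.capacity           # 64 once only 20 items remain
print('capacity when full:', grown, 'capacity after shrinking:', shrunk)
```

Shrinking at one quarter rather than one half avoids resizing back and forth when the size hovers around a boundary.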