When I overwrite a key in a shelve, the shelve file sometimes keeps growing instead of staying at the size of the stored value. It is as if the old data stays in the file with nothing referencing it, much like a memory leak. It seems to have something to do with appending to the list that is being stored. Does anyone know why? Here is a minimal example:

import shelve, os
import numpy as np

def run():
    list_array = []
    for i in range(5):
        # Each array holds 100 MB of float64 zeros (8 bytes per element).
        array_100mb = np.zeros(1024*1024*100//8)
        list_array.append(array_100mb)
        expected_size = 100*len(list_array)
        # Overwrite the same key with the list, which is now one array longer.
        with shelve.open('shelve_test') as s:
            s['val'] = list_array
        size_mb = os.path.getsize('shelve_test.dat') // 1024 // 1024
        print(f'Iteration {i}: \t shelve size is {size_mb}Mb; \t expected size is {expected_size}Mb')

for j in range(5):
    run()
    print()

This outputs:

Iteration 0:     shelve size is 100Mb;   expected size is 100Mb
Iteration 1:     shelve size is 300Mb;   expected size is 200Mb
Iteration 2:     shelve size is 600Mb;   expected size is 300Mb
Iteration 3:     shelve size is 1000Mb;  expected size is 400Mb
Iteration 4:     shelve size is 1500Mb;  expected size is 500Mb

Iteration 0:     shelve size is 1500Mb;  expected size is 100Mb
Iteration 1:     shelve size is 1700Mb;  expected size is 200Mb
Iteration 2:     shelve size is 2000Mb;  expected size is 300Mb
Iteration 3:     shelve size is 2400Mb;  expected size is 400Mb
Iteration 4:     shelve size is 2900Mb;  expected size is 500Mb

Iteration 0:     shelve size is 2900Mb;  expected size is 100Mb
Iteration 1:     shelve size is 3100Mb;  expected size is 200Mb
Iteration 2:     shelve size is 3400Mb;  expected size is 300Mb
Iteration 3:     shelve size is 3800Mb;  expected size is 400Mb
Iteration 4:     shelve size is 4300Mb;  expected size is 500Mb

Iteration 0:     shelve size is 4300Mb;  expected size is 100Mb
Iteration 1:     shelve size is 4500Mb;  expected size is 200Mb
Iteration 2:     shelve size is 4800Mb;  expected size is 300Mb
Iteration 3:     shelve size is 5200Mb;  expected size is 400Mb
Iteration 4:     shelve size is 5700Mb;  expected size is 500Mb

Iteration 0:     shelve size is 5700Mb;  expected size is 100Mb
Iteration 1:     shelve size is 5900Mb;  expected size is 200Mb
Iteration 2:     shelve size is 6200Mb;  expected size is 300Mb
Iteration 3:     shelve size is 6600Mb;  expected size is 400Mb
Iteration 4:     shelve size is 7100Mb;  expected size is 500Mb
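The numbers above fit a store that appends a value to the end of the data file whenever it no longer fits in the slot of the value it replaces, and never reclaims the old slot. As far as I know, `dbm.dumb` behaves this way, and the `shelve_test.dat` file suggests that this is the backend `shelve` picked here, but that is an assumption about this setup. A scaled-down sketch that uses `dbm.dumb` directly, with 1 KB standing in for each 100 MB array (`UNIT` and `grow_and_measure` are names made up for this illustration):

```python
import dbm.dumb
import os
import tempfile

# Stand-in for one 100 MB array; a multiple of dbm.dumb's 512-byte block size.
UNIT = 1024


def grow_and_measure(path):
    """Overwrite one key with values of 1..5 units; return the .dat size after each write."""
    sizes = []
    for n in range(1, 6):
        with dbm.dumb.open(path, 'c') as db:
            db['val'] = b'\0' * (n * UNIT)
        sizes.append(os.path.getsize(path + '.dat') // UNIT)
    return sizes


if __name__ == '__main__':
    path = os.path.join(tempfile.mkdtemp(), 'shelve_test')
    print(grow_and_measure(path))  # [1, 3, 6, 10, 15]
    print(grow_and_measure(path))  # [15, 17, 20, 24, 29]
```

The sizes follow the same pattern as the first two runs above, divided by 100. The first write of the second run fits into the 5-unit slot left by the previous run, so the file does not grow there; every later write is larger than the value it replaces, so it is appended.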

The Python version is 3.6.6.

rinspy
Reproducible on `Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)] on win32`. Seems like you found a leak. – bb1950328 Aug 04 '20 at 11:53

0 Answers