Is the memory shared between the two StringIO objects if I do the following? I suspect it is, because the memory of the Python process did not increase at In [6].

In [1]: from StringIO import StringIO
In [2]: s = StringIO()
In [3]: s.write('abcd'*10000000) # memory increases
In [4]: s.tell()
Out[4]: 40000000
In [5]: s.seek(0)
In [6]: a = StringIO(s.read()) # memory DOES NOT increase
In [7]: a.tell()
Out[7]: 0
In [8]: a.read(10)
Out[8]: 'abcdabcdab'

However, my concern is that when I delete those two variables, the memory consumption of the Python process no longer decreases. Why? Does this code create a memory leak?

When I used just one variable, the memory was freed properly when I deleted the variable.

I'd be curious to better understand what is going on here. Thanks.

Michael

1 Answer

A StringIO() object does not make a copy of a string passed to it. There is no need to, as strings are not mutable.

When reading data from a StringIO() object in chunks, new string objects are created that are substrings from the original input string.
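The two points above can be sketched with a small stand-in class. `SharedReader` is a hypothetical name, not part of any library; it only mimics the idea of keeping a reference to the input string plus a per-reader position, under the assumption that this is roughly what the pure-Python `StringIO` does when a string is passed to its constructor.

```python
class SharedReader(object):
    """Hypothetical read-only cursor over an immutable string (illustration only)."""

    def __init__(self, buf):
        self.buf = buf   # reference to the caller's string, not a copy
        self.pos = 0     # each reader tracks its own position

    def read(self, n=-1):
        # Slicing creates a new, smaller string; the source string is untouched.
        end = len(self.buf) if n < 0 else min(self.pos + n, len(self.buf))
        chunk = self.buf[self.pos:end]
        self.pos = end
        return chunk

    def tell(self):
        return self.pos


data = 'abcd' * 1000          # stands in for the writer's getvalue() result
first = SharedReader(data)
second = SharedReader(data)

print(first.buf is data and second.buf is data)  # True: one shared string object
print(first.read(10))                            # abcdabcdab
print(second.tell())                             # 0: the second reader has not moved
```

Because the string is immutable, any number of such readers can point at it safely; only the chunks returned by `read()` allocate new memory.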

The memory consumption reported for the process does not necessarily go down immediately when objects are freed. Memory is only handed back and redistributed as the OS needs it, and Python keeps many types of (small) objects cached or interned for efficiency; those are reused rather than freed.
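One way to separate "Python freed the object" from "the OS shows less memory" is to ask the interpreter itself. The sketch below assumes Python 3.4 or later, where the standard `tracemalloc` module is available (the session in the question is Python 2, which does not have it). The helper name `traced_bytes_around_delete` is made up for this example.

```python
import tracemalloc


def traced_bytes_around_delete(repeat=1000000):
    """Return Python-level allocated bytes before and after deleting a big string."""
    tracemalloc.start()
    try:
        data = 'abcd' * repeat                       # about 4 MB for the default
        before, _ = tracemalloc.get_traced_memory()  # (current, peak)
        del data                                     # last reference goes away
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return before, after


before, after = traced_bytes_around_delete()
print(before - after)  # roughly the size of the deleted string, in bytes
```

If the traced figure drops while the process size shown by the OS does not, the object was released and there is no leak in the Python sense; the remaining difference comes from how the allocator and the OS manage the freed pages.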

Martijn Pieters
  • OK, yes, I understand why reading the object in chunks creates new string objects. That's fine. I just want to make sure that the original data string is not duplicated every time I create a new StringIO, which looks to be the case, right? I am trying to have one writer and several readers without duplicating the raw data. – Michael Jul 06 '13 at 13:24
  • It works this way with StringIO, but it seems to be different with cStringIO. Using cStringIO seems to duplicate the string in memory, because memory increases every time I do `reader = StringIO(); reader.write(source.getvalue())`. Interesting. – Michael Jul 06 '13 at 14:58
  • Note what you are doing. You are creating a writer object instead. Use `StringIO(source.getvalue())` instead. – Martijn Pieters Jul 06 '13 at 15:31
  • `reader.write(source.getvalue())`? You shouldn't write to a reader. – martineau Jul 06 '13 at 19:00
  • Oh right, that's a good point, thanks. However, I have tried `reader = StringIO(source.getvalue())`, and when StringIO comes from cStringIO the string still seems to be duplicated, given that the memory increases by the same amount as when the string was first created. Well, I guess that's just because the implementation of StringIO may differ a little in cStringIO. – Michael Jul 07 '13 at 20:44