I have a question about whether the following code is executed in place or has extra space complexity, given that sentence was initially a string. Thanks, I appreciate the help.
sentence = "hello world"
sentence = sentence.split()
In Python, strings are immutable objects, which means they cannot be changed "in-place" at all. Every operation on them essentially takes up new memory, and the old, unused objects are eventually deleted by Python's garbage collector (once no more references to them remain). One way to see this for yourself:
>>> a = 'hello world'
>>> id(a)
1838856511920
>>> b = a
>>> id(b)
1838856511920
>>> a += '!'
>>> id(a)
1838856512944
>>> id(b)
1838856511920
As you can see, when b and a refer to the same underlying object, their id in memory is the same, but as soon as one of them changes, it gets a new id, i.e. a new place in memory. The object that was left unchanged (b) still has the same id.
To check it in your example:
>>> sentence = "hello world"
>>> id(sentence)
1838856521584
>>> sentence = sentence.split()
>>> id(sentence)
1838853280840
We can once again see that those objects are not taking the same memory. We can further explore just how much space they take up:
>>> import sys
>>> sentence = "hello world"
>>> sys.getsizeof(sentence)
60
>>> sentence = sentence.split()
>>> sys.getsizeof(sentence)
160
As noted in the comments, the operation cannot be "in-place", as that would mean staying within the same data structure, but you are obviously creating a new data structure (a list) from the string. I will assume that your actual question was whether the substrings returned by split
will use the same backing array of characters as the original immutable string.1)
A quick experiment seems to suggest that they do not.
In [1]: s = (("A" * 100000) + " ") * 50000
In [2]: len(s)
Out[2]: 5000050000
In [3]: l = s.split()
After the first step, top shows that the ipython process uses ~30% of my memory, and after the split it uses ~60%, so the backing array, which takes up the bulk of the memory, is not reused. Of course, this may be implementation-specific. I was using IPython 5.5.0 (based on Python 3.6.8), but I get the same result with Python 2.7.15, too. This also seems to apply to string slicing.
1) Precisely because the strings are immutable, this would be possible, and to the best of my knowledge other languages, like Java, do this, although I cannot currently test it.
Note: The use of sys.getsizeof is a bit misleading here, as it seems to measure only the size of the data structure itself, not the elements contained therein.
In [4]: sys.getsizeof(s)
Out[4]: 5000050049
In [5]: sys.getsizeof(l)
Out[5]: 433816
According to that, the list takes up only a fraction of the space of the original split string, but as noted above, the actual memory consumption doubled.
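To account for the elements as well, one can add the sizes of the contained strings to the size of the list itself. A simple sketch (this only goes one level deep, which is enough for a list of strings):

```python
import sys

sentence = "hello world"
words = sentence.split()

# getsizeof(words) covers only the list object and its pointer array;
# summing over the elements adds the per-string object overhead and data.
list_only = sys.getsizeof(words)
total = list_only + sum(sys.getsizeof(w) for w in words)

assert total > list_only  # the elements contribute real extra memory
print(f"list alone: {list_only} bytes, with elements: {total} bytes")
```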