53

Do you know of a Python library which provides mutable strings? Google returned surprisingly few results. The only usable library I found is http://code.google.com/p/gapbuffer/ which is in C but I would prefer it to be written in pure Python.

Edit: Thanks for the responses but I'm after an efficient library. That is, ''.join(list) might work but I was hoping for something more optimized. Also, it has to support the usual stuff regular strings do, like regex and unicode.

codeforester
Ecir Hana
  • 9
    Lists work pretty well for this purpose. – Aaron Yodaiken May 13 '12 at 14:49
  • A couple of links: [LINK1](http://mail.python.org/pipermail/tutor/2003-August/024485.html), [LINK2](http://www.skymind.com/~ocrow/python_string/) – digEmAll May 13 '12 at 15:00
  • 5
    Can you please explain, why do you need mutable strings? What is the use case? – Zaur Nasibov May 13 '12 at 15:53
  • 2
    @BasicWolf may be for memory-efficient replacements of chars inside the string? We're avoiding to create a copy of string. – chuwy Oct 29 '13 at 13:59
  • 5
    @chuwy Well, there is bytearray for those purposes. A string in Python is a priori not a "memory-efficient" sequence but rather a concurrency-efficient one. Consider this: you can always be sure that, no matter what, a modification operation on a string does not affect the original. So there are no problems with concurrency, thread safety, etc. – Zaur Nasibov Oct 29 '13 at 14:12
  • @BasicWolf this would be appropriate addition to the top-voted answer. – chuwy Oct 30 '13 at 13:50
  • Here's the truth mutable strings are used by working professionals, this is a trivial task in other languages and the only people who think immutable strings are a good idea are academics. The fact you have to write a class that is more complex than a doing the same task in Assembly language tells you "something is not right." If you have cases like multithreading then those can be handled as "special cases" but I have programmed multithread systems since my first job out of college 30 years ago and it was never an issue. – NoMoreZealots Feb 26 '20 at 02:25

8 Answers

30

In Python, the mutable sequence type is bytearray; see this link.
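A minimal sketch of what working with bytearray looks like, assuming Python 3 and UTF-8 encoded text. The variable names are illustrative only. It also shows the caveat raised in the comments: indices count bytes, not characters.

```python
import re

# bytearray is a mutable sequence of bytes, so it holds *encoded* text.
buf = bytearray("hello world", "utf-8")

buf[0:5] = b"HELLO"      # slice assignment in place
buf[5] = ord("_")        # single items are integers in range(256)
buf.extend(b"!")         # it can also grow

text = buf.decode("utf-8")
print(text)

# The re module accepts bytes-like objects when the pattern is bytes.
found = re.search(rb"w\w+", buf) is not None

# Non-ASCII characters take several bytes in UTF-8, so byte indices
# and character indices no longer line up.
accented = bytearray("aé", "utf-8")
print(len(accented))
```

For pure ASCII data this behaves much like a mutable string; for general Unicode text the byte/character mismatch has to be handled by the caller.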

quamrana
Jay M
  • I am not sure what @Marcin is referring to because bytearrays allows you to assign a new value to a slice of the bytearray. – jonathanrocher Mar 05 '14 at 23:57
  • @jonathanrocher Check edit history. Marcin pointed out an error, and it was corrected. – leewz Jul 02 '14 at 19:43
  • 1
    This should be the 'correct' answer. Too much messing about involved in the current top-voted. – robert Nov 17 '14 at 09:41
  • 29
    `bytearray` as the name obviously suggests is an array of bytes. Strings are not sequences of bytes but rather sequences of groups of bytes. I.e. this is only true for ASCII strings, not true for unicode in general. -1. – freakish Dec 16 '14 at 12:34
  • not really helpful if you want to use, say, the regex lib on a mutable. if you need mutable objects, switch your language to c++ or rust. python really, really, falls down when it comes to memory handling and mutability for projects that are sensitive to this sort of thing (like cryptography) – Erik Aronesty May 28 '19 at 14:33
  • 1
    Beware of multi-byte characters. Example: bytearray('aé'.encode('utf8')) gives bytearray(b'a\xc3\xa9') – Michael Grazebrook Mar 08 '21 at 01:30
  • @ErikAronesty You'd be reimplementing all of the `str` API. Which is a road to madness since that API is _huge_, and might grow or change with every Python version. It's much easier to have a buffer that you can assign to, and an easy way to take out strings, process them, and put the result back in. – toolforger Jan 14 '23 at 05:50
23

This lets you change characters in a string efficiently, although you can't change the string's length. The session below is Python 2.

>>> import ctypes

>>> a = 'abcdefghijklmn'
>>> mutable = ctypes.create_string_buffer(a)
>>> mutable[5:10] = ''.join( reversed(list(mutable[5:10].upper())) )
>>> a = mutable.value
>>> print `a, type(a)`
('abcdeJIHGFklmn', <type 'str'>)
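A rough Python 3 counterpart of the session above, as a sketch rather than a drop-in replacement. It assumes ctypes.create_unicode_buffer, which takes a str in Python 3 (create_string_buffer takes bytes there). The names text, buf, result and buffer_len are illustrative.

```python
import ctypes

text = "abcdefghijklmn"

# create_unicode_buffer allocates one extra slot for the terminating
# NUL character when it is given an initial string.
buf = ctypes.create_unicode_buffer(text)

# Slice assignment must keep the length unchanged.
buf[5:10] = "".join(reversed(buf[5:10].upper()))

result = buf.value        # the text up to, not including, the terminator
buffer_len = len(buf)     # counts the terminator as well
print(result, type(result), buffer_len)
```

Because len(buf) includes the terminator, negative indices are off by one relative to the text, so buf[-2] is the last visible character here.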
JohnMudd
  • 6
    **BE WARNED** that the buffer includes the terminator into its reported `len()`. **This will break slices with negative indices** unless you add an extra `-1` to each negative index. (For unicode buffers, it's `-1`, too, because `len` and slice indices for these types are in characters.) – ivan_pozdeev Jan 18 '18 at 16:32
  • Note: in python3 ctypes.create_string_buffer() takes bytes-type-argument as parameter, and ctypes.create_unicode_buffer() takes string-type-argument. – Rustam A. Jul 14 '21 at 11:53
15
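class MutableString(object):
    """A list-backed string whose characters can be changed in place."""
    def __init__(self, data):
        self.data = list(data)
    def __repr__(self):
        return "".join(self.data)
    def __setitem__(self, index, value):
        self.data[index] = value
    def __getitem__(self, index):
        # Slices come back as plain strings, single indices as one character.
        if isinstance(index, slice):
            return "".join(self.data[index])
        return self.data[index]
    def __delitem__(self, index):
        del self.data[index]
    def __add__(self, other):
        # Return a new object rather than mutating self and returning None.
        return MutableString(self.data + list(other))
    def __iadd__(self, other):
        # In-place append for `s += other`.
        self.data.extend(other)
        return self
    def __len__(self):
        return len(self.data)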

... and so on, and so forth.

You could also subclass StringIO, buffer, or bytearray.
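As one illustration of the StringIO route, here is a sketch that uses Python 3's io.StringIO directly instead of subclassing it. Treating it as a mutable text buffer this way is an assumption about what the suggestion means, not something the answer spells out.

```python
import io

buf = io.StringIO("hello world")

buf.seek(6)                 # move to the character to replace
buf.write("W")              # overwrites in place, one character for one
buf.seek(0, io.SEEK_END)    # jump to the end
buf.write("!")              # writing at the end grows the buffer

text = buf.getvalue()       # take out an ordinary str for regex etc.
print(text)
```

Overwriting works at character granularity, but there is no way to delete or insert in the middle without rebuilding the contents.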

scottmrogowski
Joel Cornett
  • To be able to use regex and string methods like `find` you need to subclass from `str` instead of `object`. – chtenb Aug 20 '14 at 17:28
  • 1
    Correction: regex and `find` only work on the original string. Modifications made through `__setitem__`are disregarded. Is there a way to use regex on MutableStrings? – chtenb Aug 20 '14 at 17:35
  • You can do `re.match(expression, repr(mutable_string))` – Joel Cornett Aug 20 '14 at 17:46
  • 6
    But then you could as well use a normal string. I want/need to take advantage of the mutability. – chtenb Aug 20 '14 at 18:02
  • Too many functions to override. And you would have to check if there are any differences between the `str` API of the various Python versions. – toolforger Jan 14 '23 at 05:52
3

How about simply sub-classing list (the prime example for mutability in Python)?

class CharList(list):

    def __init__(self, s):
        list.__init__(self, s)

    @property
    def list(self):
        return list(self)

    @property
    def string(self):
        return "".join(self)

    def __setitem__(self, key, value):
        if isinstance(key, int) and len(value) != 1:
            cls = type(self).__name__
            raise ValueError("attempt to assign sequence of size {} to {} item of size 1".format(len(value), cls))
        super(CharList, self).__setitem__(key, value)

    def __str__(self):
        return self.string

    def __repr__(self):
        cls = type(self).__name__
        return "{}(\'{}\')".format(cls, self.string)

This only joins the list back to a string if you want to print it or actively ask for the string representation. Mutating and extending are trivial, and the user knows how to do it already since it's just a list.

Example usage:

s = "te_st"
c = CharList(s)
c[1:3] = "oa"
c += "er"
print(c)  # prints "toaster"
print(c.list)  # prints ['t', 'o', 'a', 's', 't', 'e', 'r']

The caveat described next has since been fixed; see the update that follows it.

There's one (solvable) caveat: there's no check (yet) that each element is indeed a single character. Printing will at least fail for anything that isn't a string. Multi-character strings, however, can still be joined and may cause odd situations like the one in the code example below.

Update: with the custom __setitem__, assigning a string of length != 1 to a CharList item raises a ValueError. Everything else can still be assigned freely but will raise "TypeError: sequence item n: expected string, X found" when printing, due to the str.join() operation. If that's not good enough, further checks can be added easily, for example in __setslice__ or by switching the base class to collections.Sequence (performance might be different?!), cf. here.

s = "test"
c = CharList(s)
c[1] = "oa"
# with custom __setitem__ a ValueError is raised here!
# without custom __setitem__, we could go on:
c += "er"
print(c)  # prints "toaster"
# this looks right until here, but:
print(c.list)  # prints ['t', 'oa', 's', 't', 'e', 'r']
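The idea of adding further checks by switching the base class could be sketched as follows, assuming Python 3 and collections.abc.MutableSequence (the mutable counterpart of Sequence). CheckedCharList and _check are hypothetical names, not part of the answer's code.

```python
from collections.abc import MutableSequence


class CheckedCharList(MutableSequence):
    """Hypothetical list-backed string that validates every element."""

    def __init__(self, text=""):
        self._chars = []
        self.extend(text)  # extend() goes through insert(), so it is checked

    @staticmethod
    def _check(ch):
        # Accept only single-character strings.
        if not isinstance(ch, str) or len(ch) != 1:
            raise ValueError("expected a single character, got {!r}".format(ch))
        return ch

    def __getitem__(self, index):
        if isinstance(index, slice):
            return "".join(self._chars[index])
        return self._chars[index]

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            self._chars[index] = [self._check(ch) for ch in value]
        else:
            self._chars[index] = self._check(value)

    def __delitem__(self, index):
        del self._chars[index]

    def __len__(self):
        return len(self._chars)

    def insert(self, index, value):
        self._chars.insert(index, self._check(value))

    def __str__(self):
        return "".join(self._chars)


c = CheckedCharList("te_st")
c[1:3] = "oa"   # slice assignment; each element is still checked
c += "er"       # MutableSequence supplies += via extend()
print(str(c))
```

Since append, extend and += all route through insert, one validation function covers every mutation path. The price is that each operation runs in Python rather than in list's C code, so it will likely be slower.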
NichtJens
2

Efficient mutable strings in Python are arrays. A Python 3 example for a Unicode string, using array.array from the standard library:

>>> import array, re
>>> ua = array.array('u', 'teststring12')
>>> ua[-2:] = array.array('u', '345')
>>> ua
array('u', 'teststring345')
>>> re.search('string.*', ua.tounicode()).group()
'string345'

bytearray is predefined for bytes and is more automatic regarding conversion and compatibility.

You can also consider memoryview / buffer, numpy arrays, mmap and multiprocessing.shared_memory for certain cases.
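For the memoryview option, a small sketch assuming Python 3 and a bytearray as the underlying storage. Variable names are illustrative.

```python
buf = bytearray(b"hello world")

# A memoryview exposes the same memory without copying it.
with memoryview(buf) as view:
    view[0:5] = b"HELLO"   # same-length slice assignment, done in place
    first = view[0]        # items are integers, as with bytearray

# After the view is released the bytearray can be resized again.
buf.extend(b"!")
print(bytes(buf))
```

A memoryview cannot change the size of its target, and the bytearray cannot be resized while a view on it is still open, which is why the view is closed before extend() is called.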

kxr
2

The FIFOStr package on PyPI supports pattern matching and mutable strings. It may or may not be exactly what you want; it was created as part of a pattern parser for a serial port (characters are added one at a time from the left or right; see the docs). It is derived from deque.

from fifostr import FIFOStr

myString = FIFOStr("this is a test")
myString.head(4) == "this"  #true
myString[2] = 'u'
myString.head(4) == "thus"  #true

(Full disclosure: I'm the author of FIFOStr.)

deftio
0

You could use a getter:

original_string = "hey all"
def get_string():
    return original_string

So when you need it you just call it like this:

get_string().split()
Knemay
-3

Just do this
string = "big"
string = list(string)
string[0] = string[0].upper()
string = "".join(string)
print(string)

'''OUTPUT'''
  > Big

  • OP points out in the question description that he's looking for something more efficient than `''.join(list)`. He's also asking specifically for a library which provides *mutable* strings in Python (or some other approach to attain it). – Please revise your answer and explain why doing it in this way is still worthwhile from your perspective. Also giving an explanation rather than *just do this* is much more helpful for future readers. See also the [contribution guide](https://stackoverflow.com/help/how-to-answer) for reference. – Ivo Mori Oct 20 '20 at 11:12