
There is a buffer type in Python, but how can I use it?

In the Python documentation about buffer(), the description is:

buffer(object[, offset[, size]])

The object argument must be an object that supports the buffer call interface (such as strings, arrays, and buffers). A new buffer object will be created which references the object argument. The buffer object will be a slice from the beginning of object (or from the specified offset). The slice will extend to the end of object (or will have a length given by the size argument).

Peter Mortensen
satoru

2 Answers


An example usage:

>>> s = 'Hello world'
>>> t = buffer(s, 6, 5)
>>> t
<read-only buffer for 0x10064a4b0, size 5, offset 6 at 0x100634ab0>
>>> print t
world

The buffer in this case is a substring starting at position 6 with length 5. It doesn't take extra storage space, because it references a slice of the original string rather than copying it.
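For readers on Python 3, where `buffer` no longer exists, the same idea can be sketched with `memoryview`. This is a rough equivalent under the assumption that the text is held as `bytes` (a `memoryview` cannot wrap a `str`); the variable names simply mirror the session above.

```python
# Python 3 sketch: memoryview stands in for Python 2's buffer().
s = b"Hello world"         # bytes, because memoryview needs a bytes-like object
t = memoryview(s)[6:11]    # 5 bytes starting at offset 6; no data is copied

print(len(t))              # 5
print(bytes(t))            # b'world' (bytes() makes a copy only at this point)
print(t.obj is s)          # the view still refers to the original object
print(t.readonly)          # True, since bytes objects are immutable
```

Slicing the `memoryview` rather than the `bytes` object is what avoids the copy here.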

This isn't very useful for short strings like this, but it can be necessary when using large amounts of data. This example uses a mutable bytearray:

>>> s = bytearray(1000000)   # a million zeroed bytes
>>> t = buffer(s, 1)         # slice cuts off the first byte
>>> s[1] = 5                 # set the second element in s
>>> t[0]                     # which is now also the first element in t!
'\x05'

This can be very helpful if you want to have more than one view on the data and don't want to (or can't) hold multiple copies in memory.
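The difference between a view and an ordinary slice can be illustrated with a small sketch. It uses Python 3's `memoryview` in place of `buffer`, and the names `copied` and `viewed` are made up for the illustration.

```python
data = bytearray(b"abcdefgh")

copied = data[2:6]               # ordinary slice: a new, independent bytearray
viewed = memoryview(data)[2:6]   # view: shares storage with data

data[2] = ord("X")               # modify the original in place

print(bytes(copied))             # b'cdef' (the copy is unaffected)
print(bytes(viewed))             # b'Xdef' (the view sees the change)
```

Several such views can coexist over one large object, and each costs only a small fixed-size wrapper rather than a copy of the data.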

Note that buffer has been replaced by the better-named memoryview in Python 3, though you can use either in Python 2.7.
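As a hedged illustration of how `memoryview` behaves in Python 3, the sketch below repeats the bytearray example on a small scale. One difference worth noting is that indexing a `memoryview` of bytes yields an `int` rather than a one-character string, and that a view over a mutable object can itself be written to. The flag `wrote_to_bytes` is a hypothetical name used only to record the outcome.

```python
buf = bytearray(4)             # four zeroed bytes
view = memoryview(buf)[1:]     # skip the first byte, as in the buffer example

view[0] = 5                    # write through the view
print(buf)                     # bytearray(b'\x00\x05\x00\x00')
print(view[0])                 # 5 (an int in Python 3)

# A view over immutable bytes is read-only, so assignment fails.
try:
    memoryview(b"abc")[0] = 120
    wrote_to_bytes = True
except TypeError:
    wrote_to_bytes = False
print(wrote_to_bytes)          # False
```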

Note also that in Python 2 you can't implement the buffer interface for your own objects without delving into the C API, i.e. you can't do it in pure Python. (Python 3.12 later added a `__buffer__` method for this purpose through PEP 688.)

Scott Griffiths
  • Thanks for your explanation. But I still don't quite understand the difference between buffering and simple slicing. Using `s[6:11]` doesn't take extra storage space either, or am I wrong? – satoru Aug 06 '10 at 11:31
  • In general a slice will take extra storage, so yes, `s[6:11]` will be a copy. If you set `t = s[6:11]` and then `del s`, it frees the memory that was taken by `s`, proving that `t` was copied. (To see this you need a bigger `s` and to track Python's memory usage.) It is, however, much more efficient just to make the copy if there isn't much data involved. – Scott Griffiths Aug 06 '10 at 12:11
  • Thank you very much :) BTW, could you please tell me what tool I can use to track Python's memory usage? – satoru Aug 06 '10 at 12:48
  • For memory usage see http://stackoverflow.com/questions/110259/ for example. Sometimes it's easiest just to watch Python's usage in Task Manager/Activity Monitor/top. – Scott Griffiths Aug 06 '10 at 17:15
  • For Python noobs like me: buffer is memoryview in Python 3 – Dirk Bester Aug 06 '12 at 08:05
  • To investigate how Python does a slice and whether it makes a copy or not, you could run dis.dis (from the dis module) on a function that does something like: s = "123123123"; t = s[:3]. You will then see the SLICE+2 operator, and further reading on it will tell you that it does actually make a copy. – SatA May 29 '13 at 05:28
  • @ScottGriffiths: Is there a way to make the data returned by buffer writeable without copying *(thus modifying the original value of `s` in the example of your answer)*? – user2284570 Jul 27 '16 at 20:51
  • @user2284570: The data in the buffer is writeable if the original object is mutable. So for strings you can't modify them, but in the second example I show how a buffer into a mutable `bytearray` can be used to modify the original. – Scott Griffiths Jul 27 '16 at 21:59
  • @ScottGriffiths: **If I understand you correctly, you made an error then**. In the second example, you're showing how the bytearray is writable, not how the buffer is writable. – user2284570 Aug 05 '16 at 18:19
  • @user2284570: I should have said instead that modifying the mutable original modifies the buffer also; just try it in the Python interpreter. Your 'correction' causes an error when you try to modify the buffer, so I've rolled it back. – Scott Griffiths Aug 06 '16 at 15:22

I think buffers are useful, for example, when interfacing Python with native libraries (Guido van Rossum explains buffer in this mailing list post).

For example, NumPy seems to use buffer for efficient data storage:

import numpy
a = numpy.ndarray(1000000)

The attribute a.data is then a buffer object:

<read-write buffer for 0x1d7b410, size 8000000, offset 0 at 0x1e353b0>
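The native-library use case can also be sketched without NumPy. The example below swaps in the standard-library `ctypes` module (an assumption made for illustration, not something the answer itself uses) to let C's `memset` write directly into memory owned by a Python `bytearray`, with no intermediate copy. The names `data` and `c_array` are hypothetical.

```python
import ctypes

data = bytearray(8)   # zero-filled storage owned by Python

# Build a C array of unsigned bytes that shares data's memory via the
# buffer protocol; from_buffer() requires a writable buffer.
c_array = (ctypes.c_ubyte * len(data)).from_buffer(data)

# C's memset writes straight into the bytearray's storage.
ctypes.memset(ctypes.addressof(c_array), ord("A"), len(data))

print(data)           # bytearray(b'AAAAAAAA')
```

While `c_array` is alive it holds an export on `data`, so the bytearray cannot be resized in the meantime.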
Peter Mortensen
Andre Holzner