In Python 3, I can get the size of a BytesIO object via `object.getbuffer().nbytes` (where `object = BytesIO()`), but what would be the best equivalent for `getbuffer()` in Python 2? Doing some exploring, I found out I can use `len(object.getvalue())` or `sys.getsizeof(object)`, but I don't know if Python 2 will accept them.

- Note, this is **not the size of the `BytesIO` object**, it is the *number of bytes of the underlying buffer*. But, why have you simply not tried if `len(object.getvalue())` works in Python 2 or not? – juanpa.arrivillaga Aug 14 '17 at 19:07
- It does, but I'm not sure if it'll output reliably the same result as `getbuffer().nbytes` would – Brian Lee Aug 14 '17 at 20:08
- It will for `io.BytesIO` objects. – juanpa.arrivillaga Aug 14 '17 at 20:11
- Also, `sys.getsizeof(object)` will **not be equivalent**. – juanpa.arrivillaga Aug 14 '17 at 20:12
3 Answers
see critical update below
After digging into the Python 2.7 source code I found a simple solution: because `io.BytesIO()` returns a file-like object, it has the standard set of file methods, including `tell()`.
Note that an indirect method such as `len(fd.getvalue())` copies the buffer out and then computes its size (`fd.getbuffer().nbytes` returns a view without copying, but it exists only in Python 3). In my case, when the buffer holds half of the memory, the copy ends up crashing the application :/
In contrast, `fd.tell()` just reports the current position in the stream and does not need any memory allocation!
Note that neither `sys.getsizeof(fd)` nor `fd.__sizeof__()` returns the correct buffer size.
>>> from io import BytesIO
>>> from sys import getsizeof
>>> with BytesIO() as fd:
...     for x in xrange(200):
...         fd.write(" ")
...         print fd.tell(), fd.__sizeof__(), getsizeof(fd)
1 66 98
2 66 98
3 68 100
4 68 100
5 70 102
6 70 102
.....
194 265 297
195 265 297
196 265 297
197 265 297
198 265 297
199 265 297
200 265 297
UPDATE
After the comments from @eadmaster and @Artemis I realized that the correct method, when the buffer is preset, is to move the pointer to the end of the buffer. The standard `seek` function can do that, and it reports the current buffer size:
buffsize = fd.seek(0,2)
So here is how it should be done without unnecessarily copying memory:
from io import BytesIO
x = BytesIO(b'AAAAAA')
x.tell() # returns 0
x.seek(0,2) # returns 6
# However
x = BytesIO()
x.write(b'AAAAAA')
x.tell() # returns 6
x.seek(0,2) # returns 6
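One side effect of `seek(0,2)` is that it leaves the stream position at the end, which matters if the caller is in the middle of reading. A minimal sketch of a wrapper that restores the position afterwards is shown here; `buffer_size` is a hypothetical helper name, not part of the `io` module, and it assumes the stream is seekable.

```python
import os
from io import BytesIO


def buffer_size(stream):
    """Return the total number of bytes in a seekable stream.

    Hypothetical helper (not part of the io module): it seeks to the
    end to read the size, then restores the original position.
    """
    original_position = stream.tell()
    try:
        # io.BytesIO.seek returns the new absolute position.
        return stream.seek(0, os.SEEK_END)
    finally:
        # Put the position back where the caller left it.
        stream.seek(original_position)


preset = BytesIO(b'AAAAAA')
preset.read(2)                  # move the position to 2
size = buffer_size(preset)      # total size of the buffer
position_after = preset.tell()  # position is back where it was
```

`os.SEEK_END` is the named constant for the literal `2` used above, so the call is equivalent to `seek(0,2)`.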

- if the `BytesIO` object is initialized from another buffer, `tell()` will return 0 (e.g. `BytesIO(b'00').tell()`), so `getvalue()` is more reliable. – eadmaster Jan 28 '20 at 21:26
- @eadmaster you should report this bug! `tell()` **must** report the current size of written data and if data was borrowed from another buffer, it must report the size of the data in that buffer. – rth Jan 29 '20 at 17:43
- @rth as another answer notes, `tell()` does not report "the current size of written data", it reports the current position of the pointer. This will *only* be the current size of data *if* the pointer is currently at the end of the file. – Artemis Feb 19 '21 at 12:49
You can use `getvalue()`.
Example:
from io import BytesIO
if __name__ == "__main__":
    out = BytesIO()
    out.write(b"test\0")
    print len(out.getvalue())
See: https://docs.python.org/2/library/io.html#io.BytesIO.getvalue
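As a rough check that `len(getvalue())` does not depend on where the stream position is, the sketch below moves the position into the middle of the buffer before measuring. It is written to run on both Python 2 and Python 3; the `hasattr` guard is there because `getbuffer()` exists only on Python 3, and the variable names are illustrative.

```python
from io import BytesIO

out = BytesIO(b"test\0")
out.seek(2)  # move the position away from both ends

# getvalue() returns the whole buffer regardless of the position.
size_from_value = len(out.getvalue())

# getbuffer() exists only on Python 3, so guard the comparison.
if hasattr(out, "getbuffer"):
    view = out.getbuffer()
    size_from_view = view.nbytes
    view.release()  # release the view so the stream can be resized again
else:
    size_from_view = size_from_value
```

Keep in mind that `getvalue()` returns a copy of the contents, so for very large buffers this costs extra memory.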
It's worth noting that `tell()` only returns the current position in the stream, which is not necessarily the size of the buffer.
This can be seen in the following example:
from io import BytesIO
x = BytesIO(b'AAAAAA')
x.tell() # returns 0
x.read()
x.tell() # Now it returns 6
# However
x = BytesIO()
x.write(b'AAAAAA')
x.tell() # returns 6
In the first example we initialise the object with our byte string, but the stream position is still at the beginning, so `tell()` returns 0. We then read the stream, which moves the position to the end, so it returns 6.
In the second example we initialise an empty BytesIO object and write our byte string to it. The write leaves the position at the end of the stream, so we don't need to read it first to get 6.
