Python has several io base classes, including:

IOBase
RawIOBase
BufferedIOBase
TextIOBase

as well as several derived io classes:

FileIO
BytesIO

Now, when I create a BytesIO object, the MRO is:

[<class '_io.BytesIO'>, <class '_io._BufferedIOBase'>, <class '_io._IOBase'>, <class 'object'>]

and when I create a FileIO object, the MRO is:

[<class '_io.FileIO'>, <class '_io._RawIOBase'>, <class '_io._IOBase'>, <class 'object'>]

This is pretty straightforward.

However, when I open a binary file for writing using the built-in open, I get the MRO:

[<class '_io.BufferedWriter'>, <class '_io._BufferedIOBase'>, <class '_io._IOBase'>, <class 'object'>]

By the principle of least surprise, shouldn't a command that opens a file return a FileIO object? I just wrote a method which accepts either a BytesIO or a file, and I stumbled over my if isinstance(io.FileIO) ... clause. What is the difference between a FileIO object and the object returned by open?

1 Answer

The main difference is that FileIO inherits from RawIOBase, which provides low-level, unbuffered access to the OS-level API, whereas open in binary mode returns a BufferedIOBase subclass by default (BufferedWriter, BufferedReader, or BufferedRandom), which is more suitable in the generic case (I think). That buffered object wraps a FileIO, reachable through its raw attribute, and open(..., buffering=0) in binary mode returns the FileIO directly. Both FileIO and open accept either a path or an OS-level file descriptor as the name. So FileIO gives finer control when working with binary files or streams (for example, to reduce memory usage), while open adds a buffering layer on top of it. More info about the difference here.
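As a rough illustration of how the two relate, here is a minimal sketch using only the standard library. The temporary file name demo.bin is an arbitrary choice for the example. It shows that the default binary open returns a buffered wrapper around a FileIO, that buffering=0 returns the FileIO itself, and that open also takes an integer file descriptor.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")  # arbitrary name for this demo

    # Default binary open: a buffered object that wraps a raw FileIO.
    with open(path, "wb") as buffered:
        buffered_type = type(buffered)          # io.BufferedWriter
        raw_type = type(buffered.raw)           # io.FileIO
        buffered_is_fileio = isinstance(buffered, io.FileIO)  # False

    # buffering=0 is only allowed in binary mode and skips the wrapper.
    with open(path, "wb", buffering=0) as unbuffered:
        unbuffered_type = type(unbuffered)      # io.FileIO

    # open() also accepts an integer file descriptor instead of a path.
    fd = os.open(path, os.O_RDONLY)
    with open(fd, "rb") as from_fd:             # closes fd on exit
        from_fd_type = type(from_fd)            # io.BufferedReader

print(buffered_type, raw_type, unbuffered_type, from_fd_type)
```

This is why an isinstance check against io.FileIO fails for a normally opened file: the FileIO is there, but one layer down.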

Figuratively speaking, open is a Swiss Army knife for files in general, while FileIO is a surgeon's scalpel.

So for your problem, isinstance against FileIO may not be the right choice. It may be better to use a duck-typing approach and check that the object has the necessary methods (or to check isinstance against another type, for example IOBase, if that covers your needs). You can also use the object's readable(), seekable(), and writable() methods to check the necessary conditions.
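One possible way to combine these checks is sketched below. The helper name is_binary_readable is made up for this example. It accepts any raw or buffered binary stream from the io hierarchy and then asks the object itself whether it can be read.

```python
import io


def is_binary_readable(obj):
    """Hypothetical helper: True for readable binary streams.

    Matches FileIO, BytesIO, and the buffered objects returned by
    open(..., "rb"), but not text streams such as StringIO.
    """
    return isinstance(obj, (io.RawIOBase, io.BufferedIOBase)) and obj.readable()


print(is_binary_readable(io.BytesIO(b"data")))   # binary, in memory
print(is_binary_readable(io.StringIO("text")))   # text stream
```

Checking against the abstract base classes rather than FileIO means the same test passes for both an in-memory buffer and a file opened with the built-in open.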

  • Thank you for your answer, especially for pointing out the readable() etc. methods. For background: I have written a class with a `from_file(f: io.IOBase)` method. If it's given a file, it determines the file size and loads the content into an internal buffer. If it's given a BytesIO, it uses getbuffer to get the contents and size and then proceeds as in the other case. Following your duck-typing suggestion, I would need to check for the existence of `f.getbuffer()` or `f.name`. Another possibility would be to check for `BytesIO` and otherwise assume it's a `FileIO`. – Wör Du Schnaffzig Nov 03 '20 at 09:18
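The approach described in this comment could look roughly like the sketch below. The function name load_bytes and the file name demo.bin are assumptions made for illustration, not the commenter's actual code. It checks for getbuffer by duck typing and otherwise falls back to read().

```python
import io
import os
import tempfile


def load_bytes(f):
    """Hypothetical helper: return the contents of f as bytes.

    Objects with getbuffer() (such as BytesIO) yield their whole
    contents; anything else is read from its current position.
    """
    if hasattr(f, "getbuffer"):
        # Release the view promptly so the BytesIO can be resized again.
        with f.getbuffer() as view:
            return bytes(view)
    return f.read()


from_memory = load_bytes(io.BytesIO(b"hello"))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")  # arbitrary name for this demo
    with open(path, "wb") as out:
        out.write(b"hello")
    with open(path, "rb") as f:
        from_file = load_bytes(f)

print(len(from_memory), len(from_file))
```

Since BytesIO also supports read(), calling f.read() alone would handle both cases when reading from the current position is acceptable. The getbuffer branch mainly matters when the whole buffer is wanted regardless of position.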