1

I need to process an XML file and send it back, keeping everything in memory. I tried to use BytesIO as a file-like object. Initially, I tried this:

with BytesIO() as file:
    data.write(file, encoding='windows-1251')
    return send_file(file, attachment_filename='output.xml', as_attachment=True)

Which resulted in the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/werkzeug/wsgi.py", line 580, in __next__
    data = self.file.read(self.buffer_size)
ValueError: I/O operation on closed file.

However, when I do this:

with BytesIO() as file:
    data.write(file, encoding='windows-1251')
    file.seek(0)
    return send_file(BytesIO(file.read()), attachment_filename='output.xml', as_attachment=True)

Everything works out fine. Can somebody explain to me what the problem with the first one is and why the second attempt works?

Alex
  • The seek(0) makes the difference, but I would love to have someone explain it. – Fips Jul 14 '21 at 09:57
  • @Fips here's an explanation: In the first case, data is written to the `BytesIO` object on line 2. When writing is done, the current position of the file cursor is at its end - therefore, attempting to read will return nothing. When that object is passed, as is, to `send_file` on line 3, the read attempt fails (since there's no more data to read). In the second case, however, the cursor is set back to 0 on line 3, with `seek(0)`. At that point, attempting to read data from the file object will return the previously written data, and so the `read()` on line 4 works. – Mr. 47 Dec 20 '22 at 12:23

2 Answers

1

You're using with BytesIO(), which keeps the buffer open only while the with block is running. Your return is inside that block, so the buffer is closed as soon as the function returns, and send_file ends up trying to send a file that has already been closed.

In the second case you create another BytesIO instance that is not managed by with, so it stays open after the block exits and is only closed later, on its own.
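
The effect of the with block can be checked without Flask. The following is only a minimal sketch under assumptions: make_buffer_inside_with and make_detached_copy are hypothetical helper names, and the b"<root/>" payload is a placeholder rather than the asker's data.

```python
from io import BytesIO


def make_buffer_inside_with():
    # Returning from inside the `with` block still runs BytesIO.__exit__,
    # so the caller receives a buffer that is already closed.
    with BytesIO() as file:
        file.write(b"<root/>")  # placeholder payload (assumption)
        return file


def make_detached_copy():
    # The second BytesIO is not managed by the `with` statement, so it
    # stays open after the original buffer has been closed.
    with BytesIO() as file:
        file.write(b"<root/>")
        file.seek(0)
        return BytesIO(file.read())


closed_buffer = make_buffer_inside_with()
open_buffer = make_detached_copy()

try:
    closed_buffer.read()
    error_message = None
except ValueError as exc:
    # Same kind of error as in the question's traceback.
    error_message = str(exc)
```

In a Flask view the response body is read only after the view function has returned, which is why the closed buffer surfaces as the ValueError in the traceback.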

Sorry for my English.

0

Think of a file object as a page and a pencil. That pencil can point to any position on that page. When you open a file, the pencil points to the beginning of the page (position 0), and the page is either empty (if it's a file opened for output, or a new BytesIO object), or holds the contents of the file (if it's a file opened for input, or a BytesIO object with preloaded content).

What happens when you write to or read from the file?

  • When you write something to the file, you simply start from the position where your pencil is resting right now, and write on the page from there. Assuming it's a new page, you start from the upper left corner (position 0), and go from there. When you're done writing, your pencil rests at the point where you stopped writing.
  • When you read from a file, you simply read from the page, starting from the point where your pencil is resting right now, and helping yourself by moving the pencil with every byte you read. You can read up to the point where the data on the page ends, at which point no further reading can be done (since the pencil points to the end of the data, and no additional data is present beyond that point).
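
The pencil behaviour described above can be observed directly with tell(), which reports the current position. This is a small sketch with a placeholder payload; the variable names are made up for illustration.

```python
from io import BytesIO

page = BytesIO()
start = page.tell()            # a new buffer starts at position 0

page.write(b"hello")
after_write = page.tell()      # the cursor now sits just past the written data

read_at_end = page.read()      # nothing lies beyond the cursor

page.seek(0)                   # move the cursor back to the beginning
read_from_start = page.read()  # now the written bytes come back
```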

So, what happens in the first scenario you described?

  1. You create a new BytesIO object (i.e., you get a shiny new page with a pencil pointing to the beginning).
  2. You write your data to the file (i.e., you use the pencil to write that data to the page, and when you're done, the pencil points to the end of that data).
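  3. You attempt to read that data (not you directly, but the implementation of send_file), but nothing can be read: the pencil is at the end of the page, so there is nothing left after it, and because the read only happens after the with block has exited, the file is already closed, which is what raises the ValueError.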
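
The scenario above can be imitated with the standard library alone. In this sketch, stream is a hypothetical stand-in for the wrapper that reads the file lazily in chunks (as werkzeug does in the traceback); it is not Flask's or werkzeug's actual code, and the helper names and payload are assumptions.

```python
from io import BytesIO


def stream(file, buffer_size=8192):
    """Hypothetical lazy reader: yields chunks only when iterated."""
    while True:
        chunk = file.read(buffer_size)
        if not chunk:
            break
        yield chunk


def body_without_seek():
    file = BytesIO()
    file.write(b"<root/>")
    return stream(file)          # cursor is still at the end of the data


def body_with_seek():
    file = BytesIO()
    file.write(b"<root/>")
    file.seek(0)                 # rewind before handing the buffer over
    return stream(file)


def body_from_closed_buffer():
    with BytesIO() as file:
        file.write(b"<root/>")
        file.seek(0)
        return stream(file)      # generator is created here but not run yet


empty_body = b"".join(body_without_seek())
full_body = b"".join(body_with_seek())

try:
    b"".join(body_from_closed_buffer())
    closed_error = None
except ValueError as exc:
    # The first read happens after the `with` block has closed the buffer.
    closed_error = str(exc)
```

Without the rewind the body comes out empty, with it the data is delivered, and when the buffer is tied to a with block the deferred read hits a closed file regardless of the cursor position.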

However, in your second scenario you added a couple of changes:

  1. file.seek(0) moves the pencil to the beginning of the page, so the following read operation will succeed.
  2. Instead of passing file to send_file, you're passing BytesIO(file.read()). First, you read the file with file.read() - that succeeds, thanks to the file.seek(0) above. Next, you take the data that you just read and pass it to the BytesIO constructor - so it generates a new page, with that content preloaded on the page, with the pencil still pointing to its beginning (so reading will succeed).
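
A short sketch of the preloading behaviour: a BytesIO built from existing bytes starts with its cursor at 0. It also uses getvalue(), a standard BytesIO method not mentioned in the thread, which returns the whole buffer regardless of where the cursor is. The names and payload are placeholders.

```python
from io import BytesIO

original = BytesIO()
original.write(b"<root/>")       # cursor ends up after the data

# getvalue() ignores the cursor, so no seek(0) is needed for the copy.
snapshot = original.getvalue()

copy = BytesIO(snapshot)         # new page, content preloaded
copy_position = copy.tell()      # cursor starts at the beginning
copy_contents = copy.read()      # reading succeeds right away
```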

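Of the two changes, file.seek(0) is the one that makes the data readable. Copying it into a second BytesIO costs an extra read and write of the whole payload, and it is only needed because the with block closes file as soon as the function returns, before send_file has actually streamed it (that is the ValueError in your traceback). Dropping the with block avoids both the copy and the closed file:

file = BytesIO()
data.write(file, encoding='windows-1251')
file.seek(0)
return send_file(file, attachment_filename='output.xml', as_attachment=True)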
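
The write-then-rewind step can be tried without Flask. The sketch below assumes that data is an xml.etree.ElementTree.ElementTree, which the question does not state; the element name and text are made up for illustration.

```python
from io import BytesIO
import xml.etree.ElementTree as ET

# Assumption: `data` is an ElementTree; this tiny document is invented.
root = ET.Element("greeting")
root.text = "Привет"
data = ET.ElementTree(root)

file = BytesIO()
data.write(file, encoding="windows-1251")  # leaves the cursor at the end
file.seek(0)                               # rewind so a reader starts at byte 0

payload = file.read()
text = payload.decode("windows-1251")
```

Note that in Flask 2.0 and later the attachment_filename argument of send_file was renamed to download_name.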
Mr. 47