I have a Flask app that retrieves an XML document from a URL and processes it. I'm using requests_cache with a Redis backend to avoid extra requests, and ElementTree.iterparse to iterate over the streamed content. Here's an example of my code (the same result occurs in both the development server and the interactive interpreter):
>>> import requests, requests_cache
>>> import xml.etree.ElementTree as ET
>>> requests_cache.install_cache('test', backend='redis', expire_after=300)
>>> url = 'http://myanimelist.net/malappinfo.php?u=doomcat55&status=all&type=anime'
>>> response = requests.get(url, stream=True)
>>> for event, node in ET.iterparse(response.raw):
...     print(node.tag)
...
Running the above code once throws a ParseError:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/xml/etree/ElementTree.py", line 1301, in __next__
    self._root = self._parser._close_and_return_root()
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/xml/etree/ElementTree.py", line 1236, in _close_and_return_root
    root = self._parser.close()
xml.etree.ElementTree.ParseError: no element found: line 1, column 0
However, running the exact same code again before the cache expires actually prints the expected result! How come the XML parsing fails the first time only, and how can I fix it?
Edit: If it's helpful, I've noticed that running the same code without the cache results in a different ParseError:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/xml/etree/ElementTree.py", line 1289, in __next__
    for event in self._parser.read_events():
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/xml/etree/ElementTree.py", line 1272, in read_events
    raise event
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/xml/etree/ElementTree.py", line 1230, in feed
    self._parser.feed(data)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0
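Both messages can be reproduced offline with ElementTree alone, which may help narrow things down. The sketch below is only a guess at the cause, not a confirmed diagnosis. It assumes the "invalid token" error comes from the parser receiving still-compressed (gzip) bytes, and the "no element found" error from the parser receiving a stream that has already been read to the end. The sample XML and the parse_tags helper are hypothetical and not part of the app.

```python
import gzip
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the real document; the element names are made up.
XML = b"<myanimelist><anime><series_title>Example</series_title></anime></myanimelist>"


def parse_tags(stream):
    """Return the tags seen by iterparse, or the ParseError message as a string."""
    try:
        return [node.tag for _, node in ET.iterparse(stream)]
    except ET.ParseError as exc:
        return str(exc)


# Case 1: plain XML bytes parse normally (default "end" events, innermost first).
plain = parse_tags(io.BytesIO(XML))

# Case 2 (assumption): the parser is handed gzip-compressed bytes.
compressed = parse_tags(io.BytesIO(gzip.compress(XML)))

# Case 3 (assumption): the parser is handed a stream with nothing left to read.
exhausted = parse_tags(io.BytesIO(b""))

print(plain)
print(compressed)  # message starts with "not well-formed"
print(exhausted)   # message starts with "no element found"
```

If these assumptions hold for the real response, the uncached error would point at response.raw delivering undecoded bytes, and the first-run cached error would point at the stream being consumed before iterparse gets to it. Neither has been verified against the actual server or against requests_cache itself.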