I am asking about HTTP errors, for instance "404 Not Found". I read the documentation but found nothing that could help me.
Could you add a small code snippet and some additional context on how you expect to "handle" a Not Found with the feedparser module in your usage? Do you want to log the error, or "autocorrect / guess" a similar URL (I hope not ;-), or do you want more "elegance" in the process (catching exceptional conditions, etc.)? – Dilettant Jun 14 '16 at 13:20
1 Answer
Feedparser returns the HTTP status code in the status attribute (as documented at https://pythonhosted.org/feedparser/reference-status.html), which you can check and then handle however you need to:
>>> import feedparser
>>> nonfeed = feedparser.parse('http://example.com/notafeed')
>>> nonfeed.status
404
>>> feed = feedparser.parse('http://stackoverflow.com/feeds/')
>>> feed.status
200
See also the documentation on "HTTP Redirects". All HTTP headers are returned in the headers attribute, which can be useful for error reporting.
Even if there are no HTTP errors, there might be some parsing errors. While feedparser is very liberal in what it accepts, it does set a bozo flag if it encounters a malformed feed (and puts a description of the error in bozo_exception):
>>> feed.bozo
False
>>> nonfeed.bozo
1
>>> nonfeed.bozo_exception
SAXParseException('syntax error',)
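Putting the two checks together, something like the following might serve as a starting point. It is only a sketch: classify_result and its labels are hypothetical names, not part of feedparser, and the SimpleNamespace objects are stand-ins that mirror the values from the session above so the example runs without network access. It also assumes that a result with no status attribute means no HTTP response was received (for example, when parsing a local file or string).

```python
from types import SimpleNamespace


def classify_result(result):
    """Return a short label for a feedparser-style result (hypothetical helper)."""
    # Assumption: status is absent when no HTTP response was received.
    status = getattr(result, "status", None)
    if status is None:
        return "no-http-status"
    if status >= 400:
        # 4xx and 5xx responses, e.g. 404 Not Found.
        return "http-error"
    if getattr(result, "bozo", False):
        # The request succeeded but the feed is malformed;
        # details would be in result.bozo_exception.
        return "malformed"
    return "ok"


# Stand-ins for feedparser.parse(...) results, using the values shown above.
nonfeed = SimpleNamespace(status=404, bozo=1)
feed = SimpleNamespace(status=200, bozo=False)

print(classify_result(nonfeed))
print(classify_result(feed))
```

With the real library you would pass the return value of feedparser.parse(url) instead of the stand-ins, then log or retry depending on the label.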

cristoper