I've been trying to scrape some web pages using JTidy, but I keep getting this annoying error, and I have no idea how to fix it or how to get JTidy to ignore it:
InputStream: Doctype given is "-//W3C//DTD XHTML 1.0 Transitional//EN"
InputStream: Document content looks like XHTML 1.0 Transitional
630 warnings, 1 error were found!
This document has errors that must be fixed before
using HTML Tidy to generate a tidied up version.
It seems like a silly error, and since there are no other errors, it appears to be the only thing blocking JTidy from parsing the document. I'm parsing directly from the InputStream of an HttpURLConnection, using the method Tidy.parseDOM.
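For reference, a minimal sketch of the setup described above, assuming JTidy's `org.w3c.tidy.Tidy` class is on the classpath. The class name `TidyFetch`, the helper names `parse` and `fetch`, and the URL are hypothetical placeholders, not the actual code or page involved:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

import org.w3c.dom.Document;
import org.w3c.tidy.Tidy;

public class TidyFetch {

    // Hands the raw stream to JTidy and returns whatever DOM it builds.
    static Document parse(InputStream in) {
        Tidy tidy = new Tidy();
        // A null OutputStream means no tidied markup is written anywhere;
        // only the DOM is requested.
        return tidy.parseDOM(in, null);
    }

    // Opens the connection and parses the response body straight off the stream.
    static Document fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        try (InputStream in = conn.getInputStream()) {
            return parse(in);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address; substitute the page being scraped.
        Document doc = fetch("http://example.com/");
        System.out.println(doc != null ? "parsed" : "no document returned");
    }
}
```

JTidy writes its warnings and the error summary to its error output while `parseDOM` runs, which is where the messages quoted above come from.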