I am scraping HTML using HtmlUnit, but the HTML is malformed: a few tags are left unclosed, so HtmlUnit gives wrong results. I need to clean the markup before passing it to HtmlUnit.
How can I do that?
A short code snippet or tutorial would be appreciated.