I am trying to read an .html file with a DOM parser, but it throws the following exception while parsing:

[Fatal Error] form3.html:559:133: The element type "font" must be terminated by the matching end-tag "</font>".
org.xml.sax.SAXParseException; systemId: file:/home/puja/Dnyaneshwar/WCD_14_02_17/FileConverter/resources/form3.html; lineNumber: 559; columnNumber: 133; The element type "font" must be terminated by the matching end-tag "</font>".
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
    at DomConverter.main(DomConverter.java:25)
Kainix
Dnyanesh
  • The error is clear: you have an opening `<font>` tag without a closing `</font>` tag. Try to fix it. – Youcef LAIDANI Feb 21 '17 at 09:01
  • Actually, I converted a .doc file to HTML with LibreOffice. I have already fixed many issues like the one above by editing the file manually, but the file is very large. Can we disable this kind of checking? – Dnyanesh Feb 21 '17 at 09:05

1 Answer

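Don't use an XML parser to parse an HTML document, not even an XHTML document. An XML parser must reject any input that is not well-formed, which is why the unclosed `<font>` tag is reported as a fatal error here.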

You can use an HTML parser such as jsoup, which tolerates malformed markup like unclosed tags.
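As a rough illustration of the jsoup suggestion, here is a minimal sketch. It assumes the jsoup library is on the classpath; the class name `JsoupExample`, the helper `parseLenient`, and the sample markup are made up for this example and are not taken from the question's `form3.html`.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupExample {

    // Hypothetical helper: parses HTML leniently. jsoup repairs
    // unclosed tags while building its tree instead of throwing.
    static Document parseLenient(String html) {
        return Jsoup.parse(html);
    }

    public static void main(String[] args) {
        // Made-up input with an unclosed <font> tag, similar in spirit
        // to the markup that triggers the SAXParseException above.
        String html = "<html><body><p><font color=\"red\">unclosed text</p></body></html>";

        Document doc = parseLenient(html);

        // Walk every <font> element that jsoup recovered and print its text.
        for (Element font : doc.select("font")) {
            System.out.println(font.text());
        }

        // To read from disk instead, Jsoup.parse(File, String charsetName)
        // can be used; it throws IOException if the file cannot be read.
    }
}
```

If the rest of the program needs an `org.w3c.dom.Document`, jsoup's `org.jsoup.helper.W3CDom` helper should be able to convert the parsed tree, though whether that fits depends on what `DomConverter` does with the result.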

minus