I am using iText to create a PDF from HTML content. I build the HTML as a table using a Java StringBuffer. A Map holds the metadata values of the files as key-value pairs, and I iterate over these keys and values to build the HTML table. The problem is that some of the metadata values in the map contain meaningless or invalid symbols, so PDF creation fails with the following exception:
java.io.IOException: Expected > for tag: <{1}/> near line 1, column 717
at com.lowagie.text.xml.simpleparser.SimpleXMLParser.throwException(SimpleXMLParser.java:568)
at com.lowagie.text.xml.simpleparser.SimpleXMLParser.go(SimpleXMLParser.java:331)
at com.lowagie.text.xml.simpleparser.SimpleXMLParser.parse(SimpleXMLParser.java:579)
at com.lowagie.text.html.simpleparser.HTMLWorker.parse(HTMLWorker.java:141)
The content that caused the exception is:
“$é6莚ÆuCÅ ©À SÀF;r 1Ì/XQ‡,Ô<ÒÐ"‡(¢ËÄòÅ1¡Ø€ÌÅc
So my question is: what are these characters (non-ASCII, or not supported in UTF)? Is there any way to identify and skip them while building the HTML?
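
For illustration, here is a minimal sketch of one way to make such values safe before appending them to the table. It rests on an assumption: the parser error is probably triggered by the `<` inside the value being read as the start of a tag, rather than by the non-ASCII characters themselves. The sketch escapes markup characters and drops code points that are not legal in XML 1.0. The names `MetadataHtml`, `escape`, `isLegalXmlChar` and `buildTable` are hypothetical and are not part of iText.

```java
import java.util.Map;

class MetadataHtml {

    // Escapes markup characters and drops code points that XML 1.0 does not allow.
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                default:
                    // Control characters and unpaired surrogates are skipped.
                    if (isLegalXmlChar(cp)) {
                        out.appendCodePoint(cp);
                    }
            }
        }
        return out.toString();
    }

    // Character ranges permitted by the XML 1.0 "Char" production.
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Builds the HTML table from the metadata map, escaping every key and value.
    static String buildTable(Map<String, String> metadata) {
        StringBuilder html = new StringBuilder("<table>");
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            html.append("<tr><td>").append(escape(e.getKey()))
                .append("</td><td>").append(escape(e.getValue()))
                .append("</td></tr>");
        }
        return html.append("</table>").toString();
    }
}
```

The string returned by `buildTable` could then be handed to the HTML parser in place of the unescaped markup. Characters such as `é` or `Ô` are kept as they are; whether they show up correctly in the PDF depends on the font and encoding in use, which this sketch does not address.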