How do I build new HTML from the parsed TagNodes generated by htmlparser in Java?

Question

I want to write Java code that converts .html files to PDF. I used the iText API for the HTML-to-PDF conversion. However, the conversion fails when I give it a malformed HTML file as input (tags that are not properly closed). So I used the HtmlCleaner parser, which cleans the bad HTML, but I have not been able to find code that rebuilds new HTML from the result. Does anyone know how to build new HTML from the parsed TagNodes?

score 0 · Answer 1 · answered Oct 10 '15 at 14:46

HtmlCleaner comes with a set of serializers that you can use, for instance, like this:

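    import java.io.StringWriter;
    import org.htmlcleaner.CleanerProperties;
    import org.htmlcleaner.HtmlCleaner;
    import org.htmlcleaner.Serializer;
    import org.htmlcleaner.SimpleHtmlSerializer;
    import org.htmlcleaner.TagNode;

    final HtmlCleaner cleaner = new HtmlCleaner();
    final CleanerProperties properties = cleaner.getProperties();
    final Serializer serializer = new SimpleHtmlSerializer(properties);

    // Parse the (possibly malformed) input into a TagNode tree.
    TagNode node = cleaner.clean("hello world");
    // Serialize the tree back to HTML; write() can throw IOException.
    StringWriter writer = new StringWriter();
    serializer.write(node, writer, "UTF-8");
    System.out.println(writer.toString());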
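Since the original failure came from unclosed tags, it may help to check that the serialized string is well-formed before handing it to the PDF converter. Below is a minimal sketch using only the JDK's XML parser. The class name `WellFormedCheck` and the method `isWellFormed` are hypothetical names made up for this illustration and are not part of HtmlCleaner or iText. It assumes the markup has no DOCTYPE and no HTML-only entities such as `&nbsp;`, which a plain XML parser may reject or try to resolve.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of HtmlCleaner or iText.
public class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            // Any parse failure is treated as "not well-formed" in this sketch.
            return false;
        }
    }

    public static void main(String[] args) {
        // Properly closed tags.
        System.out.println(isWellFormed("<html><body><p>hello</p></body></html>"));
        // The <p> tag is never closed.
        System.out.println(isWellFormed("<html><body><p>hello</body></html>"));
    }
}
```

In practice you would pass `writer.toString()` from the snippet above to `isWellFormed` and only continue with the conversion if it returns true.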
