How to generate a PDF from an XHTML page dynamically using iText + Flying Saucer with Java

Question

I am using iText + Flying Saucer for the first time, with XHTML pages in JSF 2.0. The page is a simple registration form with regular input fields such as first name, last name and phone number. Once the user enters all the data and clicks the "NEXT" button, I have to convert this XHTML page, including the user's data, into a PDF. How exactly can I get the source HTML of this page, with all the styles included in it, and convert it to PDF? Currently I am trying it like this:

public void createPDF() {
    FacesContext facesContext = FacesContext.getCurrentInstance();
    ExternalContext externalContext = facesContext.getExternalContext();
    HttpSession session = (HttpSession) externalContext.getSession(true);
    String url = "http://localhost:8080/MyPROJECT/faces/page1.xhtml;JSESSIONID=" + session.getId();
    try {
        ITextRenderer renderer = new ITextRenderer();
        renderer.setDocument(url);
        renderer.layout();
        HttpServletResponse response = (HttpServletResponse) externalContext.getResponse();
        response.reset();
        response.setContentType("application/pdf");
        response.setHeader("Content-Disposition", "C://user//first.pdf");
        OutputStream browserStream = response.getOutputStream();
        renderer.createPDF(browserStream);
        browserStream.close();
        session.invalidate();
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    facesContext.responseComplete();
}

But it throws this exception:

ERROR:  'The string "--" is not permitted within comments.'
org.xhtmlrenderer.util.XRRuntimeException: Can't load the XML resource (using TRaX transformer). org.xml.sax.SAXParseException: The string "--" is not permitted within comments.

Is this the right way to get my page using the above URL? Does that URL return my page with the user's data when the NEXT button is clicked, so that it can be converted to PDF, or is my code wrong? Please help me. Examples appreciated.

score 1 · Accepted Answer · answered Oct 15 '12 at 06:06

This exception sounds like a problem in the (X)HTML of your page. Is there a comment containing `--` in your HTML, something like `<!-- a -- b -->`?

Flying Saucer throws this exception because there is a `--` somewhere inside a comment block, which XML does not permit. Check for this and, if possible, try it without any `--` between `<!--` and `-->`.
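
The cause can be reproduced outside Flying Saucer. The sketch below uses only the JDK's built-in XML parser; the class and method names (`CommentCheck`, `isWellFormed`) are made up for illustration and are not part of Flying Saucer or iText. It should report the second string as not well-formed, since XML forbids `--` inside a comment.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, for illustration only.
class CommentCheck {

    /** Returns true if the markup parses as well-formed XML with the JDK's default parser. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors (such as "--" inside a comment) end up here.
            // The default parser may also print the error to stderr.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<html><!-- a - b --><body/></html>"));  // prints true
        System.out.println(isWellFormed("<html><!-- a -- b --><body/></html>")); // prints false
    }
}
```

Running the same check against the markup your page actually produces is a quick way to find out whether a comment is the only problem before involving the PDF renderer.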

However, since FS will fail on every little mistake in the (X)HTML / XML (as noted in the readme), it is often a good idea to run an HTML cleaner over the page before processing it.

Here are two examples: HtmlCleaner and Jsoup.
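
If pulling in a full cleaner is not an option, a much cruder stopgap for this particular error is to strip comments from the markup before it reaches the renderer. This is only a sketch: `CommentStripper` is a hypothetical helper, not part of HtmlCleaner, Jsoup or Flying Saucer, and a regular expression will also remove comment-like text inside scripts or CDATA sections, so it does not replace a real cleaner.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class CommentStripper {

    // Matches one comment: DOTALL lets it span several lines, and the
    // reluctant quantifier stops at the first "-->" after the opening "<!--".
    private static final Pattern COMMENT = Pattern.compile("<!--.*?-->", Pattern.DOTALL);

    /** Returns the markup with every comment removed. */
    static String stripComments(String markup) {
        return COMMENT.matcher(markup).replaceAll("");
    }

    public static void main(String[] args) {
        String page = "<p>a</p><!-- x -- y --><p>b</p>";
        System.out.println(stripComments(page)); // prints <p>a</p><p>b</p>
    }
}
```

The stripped string could then be handed to the renderer as a document string (for example via `ITextRenderer.setDocumentFromString`, if your Flying Saucer version provides it) instead of letting it load the URL directly.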

ollo


Thanks for the quick answer. But do I get my XHTML page with the user data in it if I use the URL as above? How can I print the HTML document that is the input to the PDF? Do you think my approach is correct? – mdp Oct 15 '12 at 06:24
I am confused here. Are the two APIs you mentioned alternatives to Flying Saucer, or do I need to use them along with iText + FS? If so, can you please provide info on how to integrate Jsoup/HtmlCleaner with iText + FS? – mdp Oct 15 '12 at 16:50
I use them in combination with FS: **HTML** --> **Jsoup or HtmlCleaner** --> **Flying Saucer** --> **PDF**. Jsoup or HtmlCleaner will fix many mistakes in the input HTML that would otherwise kill FS. Both can escape HTML entities; maybe this can fix your problem with the `--` in comments. – ollo Oct 15 '12 at 19:34
I have a clear idea now, thanks. One last question: which one is best for cleaning, Jsoup or HtmlCleaner? – mdp Oct 15 '12 at 22:16
Best to test both. If you only want to "clean" your HTML, HtmlCleaner is enough, but if you do further steps with it (like selecting a tag, changing values etc.), you are better off with Jsoup. In general the result should be the same. – ollo Oct 16 '12 at 07:13
@ollo Maybe a silly question: how did you get the current view's HTML into the backend Java code to give it as input to HtmlCleaner/Jsoup? Please help, I am stuck here. – mdp Oct 16 '12 at 17:42
@ollo I am able to create a PDF, but my HTML form fields like text boxes, checkboxes and radio buttons are ignored. Did you face this issue? – mdp Oct 17 '12 at 17:18
Are they removed from HTML by a cleaner or ignored by FS? – ollo Oct 17 '12 at 21:41
Ignored by FS. HtmlCleaner is working fine: I still see all the HTML from the actual source after running it through HtmlCleaner. So I tried another library, the YaHP converter (http://www.allcolor.org/YaHPConverter/), which I am testing now. Are you able to make everything work with FS? – mdp Oct 17 '12 at 21:47
Just did a simple test with a checkbox: yes, it doesn't work. But maybe this thread can help you: http://stackoverflow.com/questions/6133581/html-to-pdf-using-itext-how-can-produce-a-checkbox – ollo Oct 17 '12 at 21:56
Oh yes, I already saw that post and was able to get all the HTML elements into my PDF, so I am fine now. But what might the problem with FS be? I have read somewhere that it doesn't have AcroForm support. – mdp Oct 17 '12 at 21:58
I guess there's simply no form tag defined. E.g. the class `com.lowagie.text.html.HtmlTags` has no definition for `form`. You could try checking out the source of FS and adding the tag there. – ollo Oct 17 '12 at 22:08

score 0 · Answer 2 · answered Jan 18 '13 at 15:51

Another thing you can do, if you have some partial control over the HTML and only need to escape certain items, is to follow the example in the java.net article and swap out undesirable elements by replacing them in ContentCaptureServletResponse:

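public String getContent() {
    // Flush the writer so buffered output is included in contentBuffer
    writer.flush();
    // Note: this decodes the bytes with the platform default charset
    String xhtmlContent = new String(contentBuffer.toByteArray());
    // Remove the <thead> and </thead> tags from the captured markup
    xhtmlContent = xhtmlContent.replaceAll("<thead>|</thead>", "");
    return xhtmlContent;
}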
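
One limitation of a literal `<thead>|</thead>` pattern is that it misses tags carrying attributes, such as `<thead class="x">`. A slightly more tolerant variant might look like the following. `TheadFilter` and `stripThead` are illustrative names, and UTF-8 is an assumption about the page encoding; neither comes from the java.net article.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class TheadFilter {

    /** Decodes the captured bytes and removes opening and closing thead tags. */
    static String stripThead(byte[] captured) {
        // Assumption: the captured page is UTF-8 encoded.
        String xhtml = new String(captured, StandardCharsets.UTF_8);
        // "</?thead" matches the opening or closing tag name, "\\b" keeps
        // longer names from matching, and "[^>]*" skips any attributes.
        return xhtml.replaceAll("</?thead\\b[^>]*>", "");
    }

    public static void main(String[] args) {
        String page = "<table><thead class=\"h\"><tr/></thead></table>";
        // prints <table><tr/></table>
        System.out.println(stripThead(page.getBytes(StandardCharsets.UTF_8)));
    }
}
```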
