Why is my PDF being generated as blank?

Question

I am using iTextSharp with C# and ASP.NET MVC to generate a PDF report. However, when I generate the report, the PDF comes back blank (apart from a header, which works fine). I would love your input.

The code that generates the report is as follows:

    using (var writer = PdfWriter.GetInstance(doc, ms))
    {
        // This sorts out the Header and Footer parts.
        var headerFooter = new ReportHeaderFooter(reportsAccessor);
        writer.PageEvent = headerFooter;

        var rootPath = ConfigurationManager.AppSettings["SaveFileRootPath"];
        string html = File.ReadAllText(rootPath + "/reports/report.html");

        // Perform the parsing to PDF
        doc.Open();

        // The html needs to be sorted before this call.
        StringReader sr = new StringReader(html);
        XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, sr);
        doc.Close();
        writer.Close();
        res = ms.ToArray();
    }

I know there is a lot hidden, but for argument's sake, here is an example of the HTML that this generates and passes to the StringReader:

    <table style="font-size:10pt">
        <tr>
            <td rowspan="4"> Address<br/> </td>
            <td>Phone: phone</td>
        </tr>
        <tr>
            <td>Fax: fax</td>
        </tr>
        <tr>
            <td>Email: email@example.com</td>
        </tr>
        <tr>
            <td>Website: example.com</td>
        </tr>
    <table>

    <table style="font-size:10pt; width:100%">
        <tr style="width:50%">
            <td>Settlement: 30 days from invoice</td>
        </tr>
        <tr>
            <td>Delivery Charge Notes: N/A</td>
        </tr>
    </table>
    <p style="width:100%; font-size:10pt">
    I love notes</p>

    <table>
    <tr style="font-weight:bold">
        <td>Item</td>
        <td>Item Code</td>
        <td>Description</td>
        <td>Unit Size</td>
        <td>Units Per Outer</td>
        <td>Units Per Pallet</td>
        <td>Invoice Price Volume Breaks</td>
        <td>Branding</td>
        <td>Notes</td>
    </tr>

    </table>

However, this HTML generates a blank PDF file, which is not what I want. I can't see what might be wrong with it and would love some input on two things:

[1] Why is the report blank?
[2] If this is a parsing error, why does it produce a blank report instead of raising an error? Is there a setting or argument that makes the parser throw errors, which would be much more useful than a blank report?

UPDATE: I have added the code for the header/footer

    public string HeaderTitle { get; set; }
    public IReportsAccessor ReportsAccessor { get; set; }

    public ReportHeaderFooter(IReportsAccessor reportsAccessor)
    {
        this.ReportsAccessor = reportsAccessor;
    }

    public override void OnStartPage(PdfWriter writer, Document document)
    {
        base.OnStartPage(writer, document);
        var rootPath = ConfigurationManager.AppSettings["SaveFileRootPath"];
        GetReportImageResult imgInfo = ReportsAccessor.GetImage(4);

        byte[] header_img = imgInfo.ReportImage;
        string logoFn = rootPath + "/tmp/logo.png";
        File.WriteAllBytes(logoFn, header_img);
        string html = File.ReadAllText(rootPath + "/reports/report_header.html");
        html = html.Replace("{{ title }}", HeaderTitle);
        html = html.Replace("{{ logo_img }}", logoFn);

        StringReader sr = new StringReader(html);
        XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, sr);          
    }

Have you tried with a really simple HTML? The fragment you show contains at least one HTML error I can see already: the first table has no closing tag (you wrote "<table>" instead of "</table>"). It would be interesting to know whether a really simple and valid HTML works; that would mean that the error is with your particular HTML file... – David van Driessche, May 16 '15 at 16:24
@DavidvanDriessche - tested and confirmed an Exception is thrown when passing an invalid HTML fragment. — kuujinbo, May 16 '15 at 16:54

Bruno Lowagie · Accepted Answer · 2015-05-26T12:13:48.563

When I run your HTML through my code, I do get an exception, because there's one sneaky error in your HTML. The first table doesn't have a closing tag: it says <table> instead of </table>. That's easily overlooked. You can find the corrected HTML here: table10.html

This is what the resulting PDF looks like: html_table_10.pdf

One major difference between your code and mine, is that I used iText / XML Worker 5.5.6 (Java) and you used iTextSharp / XML Worker (C#). While the Java iText and XML Worker are kept in sync with the C# iTextSharp and XML Worker (and should therefore have identical behavior), the underlying XML parser is different.

I know that we decided to behave like browsers (*), silently allowing some bad HTML constructs to fail, but my experience with the Java XML parsing is that it throws exceptions. Maybe the C# XML parsing doesn't.

(*) I'm the CEO of the iText Group, consisting of companies in Belgium, the US and Singapore.
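Regardless of how the underlying parser behaves, one way to get an explicit error instead of a blank page is to check the HTML for well-formedness before handing it to XML Worker. The following is only a minimal sketch using the standard `System.Xml.Linq` API, not part of iTextSharp; the class name `HtmlPreflight` and the method `FindWellFormednessError` are made up for illustration, and the dummy `<root>` wrapper is an assumption needed because the fragment has several top-level elements.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper (not an iTextSharp API): checks that an XHTML
// fragment is well-formed XML before it is passed to ParseXHtml.
public static class HtmlPreflight
{
    // Returns null when the fragment is well-formed,
    // otherwise the XML parser's error message.
    public static string FindWellFormednessError(string htmlFragment)
    {
        try
        {
            // Wrap the fragment in a dummy root element so that several
            // top-level elements (tables, paragraphs) are accepted.
            XDocument.Parse("<root>" + htmlFragment + "</root>");
            return null;
        }
        catch (XmlException ex)
        {
            // The message includes the line and position of the mismatch.
            return ex.Message;
        }
    }
}
```

Calling `HtmlPreflight.FindWellFormednessError(html)` before `ParseXHtml` and throwing when the result is not null would have flagged the unclosed first table. Note that this check is stricter than a browser: HTML-only entities such as `&nbsp;` are also reported, because they are not declared in plain XML.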

As for the "css resolver crying" exception, we'll remove it in the next release: [screenshot]

edited May 26 '15 at 12:13

answered May 16 '15 at 16:49

I think this is the answer, I am getting another error with the inline styles I think (it says something about the css resolver crying?!?) but I believe this has sorted the original problem. Thanks – Daniel Casserly May 16 '15 at 18:38
When I was younger, I often put funny error messages in my code, but I didn't write the XML Worker code, so I've never heard about a crying CSS resolver. I wonder how that sounds :D If you have the full error message, I can check with the XML Worker developers. – Bruno Lowagie May 16 '15 at 18:44
error message is {"iTextSharp.tool.xml.pipeline.css.CssResolverPipeline cries, it cannot find it's own context."} – Daniel Casserly May 16 '15 at 20:31
I'll ask our developers about this error message. It's kind of funny, but in all honesty: I wouldn't know what it means just by reading it. – Bruno Lowagie May 17 '15 at 11:38
Thanks Bruno, not so funny on the receiving end of such a message I can tell you :-S – Daniel Casserly May 17 '15 at 11:45
I understand. That's why I created a ticket on our issue tracker asking our developers what this is about. – Bruno Lowagie May 17 '15 at 11:48
Do you have a link to the issue tracker? I haven't heard anything back from this. – Daniel Casserly May 26 '15 at 10:40
See the screenshot I've just added to my answer. You didn't configure XML Worker correctly and as a result, the context is missing. We couldn't reproduce the problem, so apart from changing the message in the exception, there is nothing to fix. – Bruno Lowagie May 26 '15 at 12:16
Sorry I don't understand, you have all the code for the XMLWorker that I am using, could you elaborate on what wasn't configured correctly please? – Daniel Casserly May 26 '15 at 15:55
You can find my test here: http://itextpdf.com/sandbox/xmlworker/ParseHtmlTable10 and the result can be found here: http://itextpdf.com/sites/default/files/html_table_10.pdf I didn't encounter any problem. Nothing is broken, hence there is nothing to fix. – Bruno Lowagie May 26 '15 at 16:04
The only difference being that you are not setting a header/footer using the var headerFooter = new ReportHeaderFooter(reportsAccessor); that I have in my example. I will attach that code to my example just now. – Daniel Casserly May 26 '15 at 16:09
Say no more: you are cursing in the church! If you'd read the documentation, you'd know that **it is forbidden** to add content to a document in the `OnStartPage()` method! You are using the `document` parameter thinking it is an ordinary `document`. It isn't! It's a `PdfDocument` instance. It is very normal that things go wrong. – Bruno Lowagie May 26 '15 at 16:18
This explains that you can't use the `OnStartPage()` method to add content, but also why it's a bad idea to add the bytes logo to the PDF *as many times as there are pages*: http://stackoverflow.com/questions/12942133/how-to-add-an-image-to-my-header-in-itext-generated-pdf (which is exactly what you're trying to do: you're trying to create a bloated PDF...) – Bruno Lowagie May 26 '15 at 16:22
I tried to use the onEndPage() but I cannot add a header in HTML as none of the CSS styles that would allow it work on the limited set that is allowed. It does add a header and works, and even if I put a simple paragraph below it works fine, however, if I add the code above in it throws an error. Doesn't make sense. – Daniel Casserly May 26 '15 at 16:25
If your header is identical for every page, why would you parse the HTML over and over again? Why would you add the bytes of the same logo multiple times to the document? I would never design my application like that. I'd create a `PdfTemplate` for my header and I'd add that header in the `OnEndPage()` method. I can't make sense of what you're doing. Please reconsider your design. – Bruno Lowagie May 26 '15 at 16:31
Ok so this is a proof of concept and I understand your concerns on my reparsing, however, the onendpage() routine adds things to the end of the page. To get things to the top of the page you have to absolutely place it which with HTML is impossible, the OnStartPage does work (as it runs presumably at the start of the page) is there an example of using the OnEndPage to add images to the TOP of a PDF in Java or C# as I couldn't find one with XML worker – Daniel Casserly May 27 '15 at 21:02
Your allegation "the `OnEndPage()` routine adds things to the end of the page" is completely wrong and reveals a deep lack of understanding of the tool you are using. Please read the section about page events in the [documentation](http://pages.itextpdf.com/ebook-stackoverflow-questions.html). – Bruno Lowagie May 28 '15 at 06:13
Sorry to be a pest, I have reformatted the code and still getting issues. I have asked another question on SO. If you wouldn't mind having a look that would be great. http://stackoverflow.com/questions/30602850/getting-a-stack-overflow-exception-generating-a-pdf – Daniel Casserly Jun 02 '15 at 17:34
[Copy/Pasted reply:] I was in Singapore for the Communicasia conference the whole week. If you have an additional question, hence I can't answer your question today. – Bruno Lowagie Jun 06 '15 at 10:20
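The design suggested in the comments above (build the header once as a `PdfTemplate` and add it in `OnEndPage()`) could be sketched roughly as follows. This is an untested sketch under stated assumptions, not the code from the linked answers: the class name `TemplateHeaderEvent`, the 50-point header height, and the idea that the caller loads the logo bytes once are all assumptions, and the header is drawn as plain text plus an image rather than parsed from HTML.

```csharp
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical page event: the header is built once and stamped on every page.
public class TemplateHeaderEvent : PdfPageEventHelper
{
    private const float HeaderHeight = 50f; // assumed header height in points

    private readonly byte[] logoBytes;      // assumed: loaded once by the caller
    private readonly string title;
    private PdfTemplate header;

    public TemplateHeaderEvent(byte[] logoBytes, string title)
    {
        this.logoBytes = logoBytes;
        this.title = title;
    }

    public override void OnOpenDocument(PdfWriter writer, Document document)
    {
        // Create the template once; the logo bytes are stored a single time.
        float width = document.Right - document.Left;
        header = writer.DirectContent.CreateTemplate(width, HeaderHeight);

        Image logo = Image.GetInstance(logoBytes);
        logo.ScaleToFit(HeaderHeight, HeaderHeight);
        logo.SetAbsolutePosition(0f, 0f);
        header.AddImage(logo);

        // Draw the title to the right of the logo, inside the template.
        ColumnText.ShowTextAligned(header, Element.ALIGN_LEFT,
            new Phrase(title), HeaderHeight + 10f, 20f, 0f);
    }

    public override void OnEndPage(PdfWriter writer, Document document)
    {
        // OnEndPage runs when the page is finished, but content is placed at
        // absolute coordinates, so the header can sit at the top of the page.
        writer.DirectContent.AddTemplate(header, document.Left, document.Top);
    }
}
```

With this sketch the template occupies the top margin, so the document's top margin has to be at least as tall as the header, for example `new Document(PageSize.A4, 36f, 36f, 36f + 50f, 36f)`. The page event adds nothing through the `document` parameter; it only draws on `writer.DirectContent`.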
