I have an ASP.NET Core MVC v2.0.0 application that targets the .NET 4.7 framework. In it I'm using iTextSharp to convert HTML to PDF. If I add an image to the HTML, I get the following exception: "The page 1 was requested but the document has only 0 pages".

I have tried both a full URL and base64-encoded content:

<img src=\"http://localhost:4808/images/sig1.png\">

<img src=\"data:application/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAIAAAD8GO2jAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAAEnQAABJ0Ad5mH3gAAADsSURBVEhLtc2NbcMgAERhz9JtImXK1gtkq46RU58K9CX&#x2B;KWDpswLhdLd83dZi&#x2B;fyerx24ZEMD4cQgtcOhEfnUjj&#x2B;hEfyoHTU0opzUjvLar72oHW2gh&#x2B;5qhzL/4/v0Dd9/qB3KnOX7L7VDmVN8b6gdyuDjcZf6Wk/vqB08qXHLwUCoHWrZcTwQaoeKthwPkFM7SsuOvQFF1Q5lXm0OKAe1QxlZ6mm3ulA7lGnVgfPUDmWKnoFQO5RB50CoHcpE/0CoHcoMDYTa0QZGB0LtKK8TBkLt4GnOQKgd&#x2B;X/aQKgdMwdC7TF5IC4fiDpwW590JX1NuZQyGwAAAABJRU5ErkJggg==\" alt=\"test.png\">

Here is the helper method that does the conversion:

public static Stream GeneratePDF(string html, string css = null)
{
    MemoryStream ms = new MemoryStream();

    //HtmlRenderer.PdfSharp implementation
    //PdfSharp.Pdf.PdfDocument pdf =
    //    TheArtOfDev.HtmlRenderer.PdfSharp.PdfGenerator.GeneratePdf(html, PdfSharp.PageSize.Letter, 40);
    //pdf.Save(ms, false);

    try
    {
        //iTextSharp implementation
        //Create an iTextSharp Document which is an abstraction of a PDF but **NOT** a PDF
        using (Document doc = new Document(PageSize.LETTER))
        {
            //Create a writer that's bound to our PDF abstraction and our stream
            using (PdfWriter writer = PdfWriter.GetInstance(doc, ms))
            {
                writer.CloseStream = false;

                if (string.IsNullOrEmpty(css))
                {
                    //XMLWorker also reads from a TextReader and not directly from a string
                    using (StringReader srHtml = new StringReader(html))
                    {
                        doc.Open();
                        iTextSharp.tool.xml.XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, srHtml);
                        doc.Close();
                    }
                }
                else
                {
                    using (MemoryStream msCss = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(css)))
                    using (MemoryStream msHtml = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(html)))
                    {
                        doc.Open();
                        iTextSharp.tool.xml.XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHtml, msCss);
                        doc.Close();
                    }
                }
            }
        }
    }
    catch
    {
        //Release the stream, then rethrow; a bare "throw" preserves the original stack trace
        ms.Dispose();
        throw;
    }

    //Rewind the stream so the caller reads the PDF from the beginning
    ms.Position = 0;
    return ms;
}

Here I'm calling my helper method to generate a static PDF document.

public IActionResult TestGenerateStaticPdf()
{
    //Our sample HTML and CSS
    string example_html = "<!doctype html><head></head><body><h1>Test Report</h1><p>Printed: 2017-09-29</p><table><tbody><tr><th>User Details</th><th>Date</th><th>Image</th></tr><tr><td>John Doe</td><td>2017-09-29</td><td><img src=\"http://localhost:4808/images/sig1.png\"></td></tr></tbody></table></body>";
    string example_css = "h1 {color:red;} img {max-height:180px;width:100%;page-break-inside:avoid;} table {border-collapse:collapse;width:100%;} table, th, td {border:1px solid black;padding:5px;page-break-inside:avoid;}";

    System.IO.Stream stream = ControllerHelper.GeneratePDF(example_html, example_css);

    return File(stream, "application/pdf", "Static Pdf.pdf");
}
Swazimodo

1 Answer

It turns out that iTextSharp (XMLWorker) does not honor the doctype and instead parses the input as XHTML. In XHTML, the image tag needs to be closed. If you use a self-closing tag, i.e. <img src=\"http://localhost:4808/images/sig1.png\" />, the PDF generates without the exception. However, I was not able to get the base64-encoded content to render: there is no error in that case, but the image does not show up.
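
The closing-tag workaround can be sketched as a small pre-processing step that runs before the HTML is handed to the parser. This is only an illustration under stated assumptions, not part of iTextSharp: the class name HtmlFixups and the method CloseImgTags are hypothetical, and the regular expression assumes that no attribute value contains a literal '>' character.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of iTextSharp.
public static class HtmlFixups
{
    // Matches both <img ...> and <img ... />; group 1 captures the attributes.
    // Assumption: no attribute value contains a literal '>'.
    private static readonly Regex ImgTag =
        new Regex(@"<img\b([^>]*?)\s*/?>", RegexOptions.IgnoreCase);

    // Rewrites every img tag in self-closing form so that a strict
    // XHTML parser sees a closed element.
    public static string CloseImgTags(string html)
    {
        return ImgTag.Replace(html, "<img$1 />");
    }
}
```

With the helper method from the question, the call would then look like ControllerHelper.GeneratePDF(HtmlFixups.CloseImgTags(example_html), example_css). Other void elements such as <br> or <hr> also have to be closed in XHTML, so they would probably need the same treatment.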

Swazimodo
  • Read [the introduction to the pdfHTML tutorial](https://developers.itextpdf.com/content/itext-7-converting-html-pdf-pdfhtml). You are using old technology to convert HTML to PDF. iText 7 + the pdfHTML add-on supports tags that aren't closed, and it also supports base64 images out-of-the-box. See the FAQ entry [Can pdfHTML render Base64 images to PDF?](https://developers.itextpdf.com/content/itext-7-converting-html-pdf-pdfhtml/chapter-7-frequently-asked-questions-about-pdfhtml/can-pdfhtml-render-base64-images-pdf) Upgrade, and your problems are solved. – Bruno Lowagie Sep 30 '17 at 01:32
  • I added this through NuGet. Isn't that a supported distribution method? I would have thought it would use the latest version. – Swazimodo Oct 02 '17 at 18:57