
I have an MVC application that uploads a PDF file and renders each page as a single PNG image using Magick.NET. The conversion works in most cases, but for some files the image is blank where certain lines of text should be, while other lines of text in the same image display correctly. Does anyone know what could be causing this?

Below is the code I'm using.

public FileResult PNGPreview(Guid id, Int32 index)
{
    MagickReadSettings settings = new MagickReadSettings();
    // Read only the requested page
    settings.FrameIndex = index;
    settings.FrameCount = 1;
    // Setting the density to 300 dpi will create an image with a better quality
    settings.Density = new PointD(300, 300);
    settings.UseMonochrome = true;
    using (MagickImageCollection images = new MagickImageCollection())
    {
        // Read the requested page of the PDF file into the collection
        images.Read(CreateDocument(id), settings);

        using (MemoryStream stream = new MemoryStream())
        {

            images[0].Write(stream, MagickFormat.Png24);
            byte[] result = stream.ToArray();
            return File(result, "image/png");
        }
    }
}

private byte[] CreateDocument(Guid id)
{
    PdfReader reader = new PdfReader(Server.MapPath(String.Format("~/documenttemplates/{0}.pdf", id)));
    byte[] result = null;
    using (MemoryStream ms = new MemoryStream())
    {
        PdfStamper stamper = new PdfStamper(reader, ms, '\0', false);
        stamper.Close();
        reader.Close();
        result = ms.ToArray();
    }

    return result;
}
Sir l33tname
Steve
  • Is the problem random, or do some PDF files consistently convert to blank images? – Micke Sep 07 '15 at 11:01
  • Some PDF files consistently convert. I thought at first it may be a font issue but the PDFs have standard fonts like Helvetica, Arial etc. – Steve Sep 07 '15 at 11:07
  • I think it would be helpful if you could share one of the PDF files that consistently convert to blank images, if any. – Micke Sep 07 '15 at 11:16
  • Magick.NET uses Ghostscript to read the PDF files. This might be a bug in Ghostscript. Are you using the latest version? – dlemstra Sep 07 '15 at 12:02
  • I am using GhostScript 9.16 which is the current version. – Steve Sep 07 '15 at 12:30
  • Can you add a link to your PDF file? Feel free to contact me (I wrote Magick.NET) on CodePlex if you don't want to publicly share your pdf file. – dlemstra Sep 07 '15 at 13:30
  • @dlemstra I have sent you an email via CodePlex. Thanks. – Steve Sep 07 '15 at 13:39

1 Answer


The PDF file that caused this issue was provided to me by e-mail and I was told that this file was created with Word and then edited with Foxit Pro.

Magick.NET uses Ghostscript to convert the PDF file to an image. A command similar to the one below is executed.

"c:\Program Files (x86)\gs\gs9.16\bin\gswin32c.exe" -q -dQUIET -dSAFER -dBATCH -dNOPAUSE
-dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 "-sDEVICE=pnggray"
-dTextAlphaBits=4 -dGraphicsAlphaBits=4 "-r72x72"  "-sOutputFile=Test.%d.png" "-fTest.pdf"

Running that command by hand on the problem file produces the output below, which shows that the PDF file is corrupt.

**** Error reading a content stream. The page may be incomplete.
**** File did not complete the page properly and may be damaged.
**** Error reading a content stream. The page may be incomplete.
**** File did not complete the page properly and may be damaged.

**** This file had errors that were repaired or ignored.
**** The file was produced by:
**** >>>> Microsoft® Word 2013 <<<<
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
**** specification.

This can be solved by creating the input file with a different program.
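If you want to catch such files before rendering them, one option is to run Ghostscript yourself and look for the `****` diagnostics shown above. The following is only a rough sketch, not something Magick.NET does for you: the class name `GhostscriptCheck`, its method names, and the choice of the `nullpage` device (which interprets the file without writing an image) are my own assumptions, and the two message fragments it matches are simply the ones from the output above, so other kinds of damage may go undetected.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper, not part of Magick.NET or Ghostscript.
public static class GhostscriptCheck
{
    // Returns the trimmed lines of Ghostscript output that start with "****".
    public static string[] ExtractWarnings(string output)
    {
        return (output ?? string.Empty)
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Trim())
            .Where(line => line.StartsWith("****", StringComparison.Ordinal))
            .ToArray();
    }

    // Heuristic: treat the file as damaged if one of the messages
    // quoted in this answer appears among the diagnostics.
    public static bool LooksDamaged(string output)
    {
        return ExtractWarnings(output).Any(line =>
            line.Contains("Error reading a content stream") ||
            line.Contains("had errors that were repaired or ignored"));
    }

    // Runs Ghostscript on a PDF and returns stdout and stderr combined.
    // ghostscriptExe is the full path to gswin32c.exe (or gswin64c.exe).
    public static string Run(string ghostscriptExe, string pdfPath)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = ghostscriptExe,
            // Assumption: the nullpage device is enough to trigger the diagnostics.
            Arguments = "-dSAFER -dBATCH -dNOPAUSE -sDEVICE=nullpage \"" + pdfPath + "\"",
            UseShellExecute = false,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Read stderr asynchronously so a full pipe cannot block the process.
            var stderrTask = process.StandardError.ReadToEndAsync();
            string stdout = process.StandardOutput.ReadToEnd();
            process.WaitForExit();
            return stdout + Environment.NewLine + stderrTask.Result;
        }
    }
}
```

In the controller you could then call something like `GhostscriptCheck.LooksDamaged(GhostscriptCheck.Run(gsPath, pdfPath))` before the PDF is handed to Magick.NET, and reject or flag the upload when it returns true.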

dlemstra
  • Saving the file in Word 2013 to PDF was the cause of the problem. Using another method to convert from Word to PDF solved this issue. Thanks for your help. – Steve Sep 07 '15 at 15:15