I've created two form-fillable PDFs, one to be used as a customer order form and the other to be used in-house as a production sheet. Both PDFs have identical fields (same name and type for each field). I've written an app that (among several other things) uses iTextSharp to read all of the fields in a given customer order form, create a new production sheet, and fill in all of the data from the order form. This all works smoothly for the text and date fields (string data). However, there is one image field on each PDF, and I need to take the image from the image field on the order form and copy it to the image field on the production sheet. This is where I'm getting hung up.

I can use pr.AcroFields.GetFieldItem("imageFieldName"); to get the image as an AcroFields.Item object, but I can't seem to get iTextSharp to let me put that into an image field using something like the PdfStamper.AcroFields.SetField() method, since it will only take a string.

Is there perhaps a way to take that image data and store it as a temporary .jpg or .bmp file, then insert that into the production sheet's image field? Or am I going about this all wrong?

Derek Glissman
  • The PDF format does not have any image fields. Some PDF designers let you *emulate* them using e.g. a button plus some JavaScript. But as the field is merely *emulated*, there is no image value. Nonetheless, chances are that copying the image data is possible. Can you share example PDFs to check an implementation? – mkl Jun 26 '18 at 19:21
  • Here is a link to the 2 pdf files that I am using: https://drive.google.com/open?id=1QAbCoMcMsYoZDbqvuaxaLYZ4nrqmXnkc – Derek Glissman Jun 27 '18 at 16:07
  • I reckon it would be easier to convert the pdf to a bitmap, then you can cut specific coordinates out of the image and use them. ".net pdf to bmp" is a bit of a mire though, lots of libraries with Free* written next to them :P – Davesoft Jun 28 '18 at 09:59
  • @DerekGlissman I don't have permission to access those files. – mkl Jun 28 '18 at 12:46
  • @mkl Sorry; I had it shared internally only. It's publicly available now. – Derek Glissman Jun 28 '18 at 19:02

1 Answer

As already said in a comment, the PDF format does not have any image fields. Some PDF designers let you emulate them using e.g. a button plus some JavaScript. But as the field is merely emulated, there is no image value. This is indeed the case for your two documents.

To retrieve the image from the source form button, therefore, we cannot read the button value; instead we have to extract the image from the button appearance. We do this using the classes in the iText parser namespace (iTextSharp.text.pdf.parser) with a custom render listener class, ImageRenderListener, that collects bitmap images.

To set the image to the target form button, furthermore, we also cannot simply set the button value but have to set the button appearance. We do this using the iText AcroFields methods GetNewPushbuttonFromField and ReplacePushbuttonField.

The ImageRenderListener render listener class

All this render listener does is collect bitmap images:

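using System;
using System.Collections.Generic;
using iTextSharp.text.pdf.parser;

public class ImageRenderListener : IRenderListener
{
    // Bitmap images found while processing a content stream, in drawing order.
    public List<System.Drawing.Image> Images = new List<System.Drawing.Image>();

    // Text events are irrelevant here, so these callbacks are intentionally empty.
    public void BeginTextBlock()
    { }

    public void EndTextBlock()
    { }

    public void RenderText(TextRenderInfo renderInfo)
    { }

    // Called once for every image drawn by the content stream.
    public void RenderImage(ImageRenderInfo renderInfo)
    {
        PdfImageObject imageObject = renderInfo.GetImage();
        if (imageObject == null)
        {
            Console.WriteLine("Image {0} could not be read.", renderInfo.GetRef().Number);
        }
        else
        {
            Images.Add(imageObject.GetDrawingImage());
        }
    }
}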

A Copy method for the image

This method retrieves the first image from the source reader form element and adds it to the target stamper form element:

// Requires: using System.Linq; using iTextSharp.text; using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
void Copy(PdfReader source, string sourceButton, PdfStamper target, string targetButton)
{
    // The normal appearance of the source button is a form XObject (a content stream).
    PdfStream xObject = (PdfStream) PdfReader.GetPdfObjectRelease(source.AcroFields.GetNormalAppearance(sourceButton));

    // Run the appearance stream through the parser and collect the images it draws.
    PdfDictionary resources = xObject.GetAsDict(PdfName.RESOURCES);
    ImageRenderListener strategy = new ImageRenderListener();
    PdfContentStreamProcessor processor = new PdfContentStreamProcessor(strategy);
    processor.ProcessContent(ContentByteUtils.GetContentBytesFromContentObject(xObject), resources);
    System.Drawing.Image drawingImage = strategy.Images.First();
    Image image = Image.GetInstance(drawingImage, drawingImage.RawFormat);

    // Rebuild the target button with the image as its appearance and swap it in.
    PushbuttonField button = target.AcroFields.GetNewPushbuttonFromField(targetButton);
    button.Image = image;
    target.AcroFields.ReplacePushbuttonField(targetButton, button.Field);
}

An example

I filled an image into the source document using Adobe Acrobat Reader and saved this document as Customer Order Form-Willi.pdf.

[Screenshot: customer order form]

Then I applied the above copy method:

String source = @"Customer Order Form-Willi.pdf";
String dest = @"Production Sheet.pdf";
String target = @"Production Sheet-withImage.pdf";

using (PdfReader sourceReader = new PdfReader(source))
using (PdfReader destReader = new PdfReader(dest))
using (PdfStamper targetStamper = new PdfStamper(destReader, File.Create(target), (char)0, true))
{
    Copy(sourceReader, "proofImage", targetStamper, "proofImage");
}

The result in Production Sheet-withImage.pdf:

[Screenshot: production sheet with the copied image]

Some words of warning

The code above is very optimistic and contains no plausibility checks. For production you should definitely make it more defensive and check for null values, empty lists, etc.
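As a rough illustration of what such checks might look like, here is a minimal sketch of a more defensive variant. The names ButtonImageCopier and TryCopy are hypothetical, and the sketch reuses the ImageRenderListener class from above. It assumes, as far as I know the iTextSharp 5 API, that GetNormalAppearance and GetNewPushbuttonFromField return null when the field or its appearance is missing; treat it as a starting point, not a hardened implementation.

```csharp
using System.Linq;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper class; relies on the ImageRenderListener shown earlier.
public static class ButtonImageCopier
{
    // Returns false instead of throwing when a field, appearance, or image is missing.
    public static bool TryCopy(PdfReader source, string sourceButton,
                               PdfStamper target, string targetButton)
    {
        // Assumption: null means the field is unknown or has no normal appearance.
        PdfIndirectReference appearanceRef = source.AcroFields.GetNormalAppearance(sourceButton);
        if (appearanceRef == null) return false;

        // The appearance should be a stream (form XObject); bail out otherwise.
        PdfStream xObject = PdfReader.GetPdfObjectRelease(appearanceRef) as PdfStream;
        if (xObject == null) return false;

        // An empty button may have an appearance without any resources.
        PdfDictionary resources = xObject.GetAsDict(PdfName.RESOURCES);
        if (resources == null) return false;

        ImageRenderListener strategy = new ImageRenderListener();
        PdfContentStreamProcessor processor = new PdfContentStreamProcessor(strategy);
        processor.ProcessContent(
            ContentByteUtils.GetContentBytesFromContentObject(xObject), resources);

        // FirstOrDefault yields null when the appearance draws no readable image.
        System.Drawing.Image drawingImage = strategy.Images.FirstOrDefault();
        if (drawingImage == null) return false;

        // Assumption: null means the target field is not a pushbutton.
        PushbuttonField button = target.AcroFields.GetNewPushbuttonFromField(targetButton);
        if (button == null) return false;

        button.Image = iTextSharp.text.Image.GetInstance(drawingImage, drawingImage.RawFormat);
        return target.AcroFields.ReplacePushbuttonField(targetButton, button.Field);
    }
}
```

A caller can then decide what to do when TryCopy returns false, for example leaving the production sheet's button empty and logging the order form's file name.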

mkl
  • Fantastic! Thank you for the incredibly thorough answer. This will be of massive help and helped me understand several things about how PDFs and iTextSharp work. – Derek Glissman Jul 02 '18 at 16:00