I want to merge multiple PDF/A files into a new PDF/A file using Java. I tried OpenPDF's PdfCopy class, but the resulting document does not conform to the PDF/A-1a standard. I also tried the PDFBox and Aspose.PDF libraries, but those did not work either: they also produce a normal PDF instead of a PDF/A.

The online PDF validator (https://www.pdf-online.com/osa/validate.aspx) reports the following:

File:   mergeusing_openPDF.pdf

Compliance: pdfa-1a

Result: 
       
     Document does not conform to PDF/A.

Details:

    Validating file "mergeusing_openPDF.pdf" for conformance level pdfa-1a

    The key MarkInfo is required but missing.

    The key StructTreeRoot is required but missing.

    The document does not conform to the requested standard.

    The document doesn't provide appropriate logical structure information.

    The document does not conform to the PDF/A-1a standard.

Done.
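
To see whether the two keys named above are already missing from the inputs or only disappear during the merge, a rough local check can help. The sketch below is only a heuristic and not a PDF/A validator: it uses nothing but the Java standard library and looks for the name tokens /MarkInfo and /StructTreeRoot in the raw bytes of a file. It assumes the document catalog is stored uncompressed (PDF/A-1 is based on PDF 1.4, which predates compressed object streams) and that the names are written without # escapes. The class name TaggedKeyScan and the method scan are made up for this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: a crude byte scan, not a real PDF parser or validator.
public class TaggedKeyScan {

    // Catalog keys the validator reported as missing for PDF/A-1a.
    static final String[] REQUIRED_KEYS = {"/MarkInfo", "/StructTreeRoot"};

    /** Reports, for each key, whether its name token occurs anywhere in the raw bytes. */
    static Map<String, Boolean> scan(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary stream data decodes safely.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Map<String, Boolean> found = new LinkedHashMap<>();
        for (String key : REQUIRED_KEYS) {
            found.put(key, raw.contains(key));
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Pass the input files and the merged file as command-line arguments.
        for (String path : args) {
            System.out.println(path + " -> " + scan(Files.readAllBytes(Paths.get(path))));
        }
    }
}
```

Running it against D:/file1.pdf, D:/file2.pdf and the merged file would show whether the tokens are present in the sources but absent from the output. A hit only means the token occurs somewhere in the file, so the online validator remains the authority on conformance.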

Here is the relevant part of my OpenPDF code:

import java.awt.color.ICC_Profile;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import com.lowagie.text.Document;
import com.lowagie.text.pdf.PdfCopy;
import com.lowagie.text.pdf.PdfImportedPage;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfWriter;

public class ExampleMerge {

    public static void main(String[] args) throws IOException {
        List<String> pdfAFilesToMerge = Arrays.asList("D:/file1.pdf", "D:/file2.pdf");
        String newFilePath = "D:/merge_using_openPDF.pdf";

        // Size the output document from the first page of the first input file.
        PdfReader pdfReader = new PdfReader(pdfAFilesToMerge.get(0));
        Document document = new Document(pdfReader.getPageSizeWithRotation(1));
        PdfCopy copy = new PdfCopy(document, new FileOutputStream(newFilePath));

        // Request tagged PDF/A-1a output and XMP metadata before opening the document.
        copy.setTagged();
        copy.setPDFXConformance(PdfWriter.PDFA1A);
        copy.createXmpMetadata();
        document.open();

        // Attach an sRGB output intent; the stream is closed once the profile is loaded.
        String iccProfilePath = "C:/ICC_Profiles/sRGB_IEC61966-2-1.icc";
        try (FileInputStream iccStream = new FileInputStream(iccProfilePath)) {
            ICC_Profile icc = ICC_Profile.getInstance(iccStream);
            copy.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
        } catch (IOException e) {
            e.printStackTrace();
        }

        for (int i = 0; i < pdfAFilesToMerge.size(); i++) {
            // The first file is already open; open each later file when it is reached.
            if (i > 0) {
                pdfReader = new PdfReader(pdfAFilesToMerge.get(i));
            }
            // Copy every page of the current file into the output (pages are 1-based).
            for (int j = 1; j <= pdfReader.getNumberOfPages(); j++) {
                PdfImportedPage page = copy.getImportedPage(pdfReader, j);
                copy.addPage(page);
            }
        }

        document.close();
        System.out.println("Documents merged");
    }
}
  • Please include the error messages of the PDF/A checker. – Tilman Hausherr Sep 28 '22 at 07:55
  • @TilmanHausherr added output of online PDF/A checker in the post. – Dnyaneshwar Rathod Sep 28 '22 at 09:32
  • Ouch, this is tricky. Apparently that software (likely a very old version of iText from before it went commercial) removes the StructTreeRoot from the source documents. PDFBox does attempt to merge the structure tree but also has some flaws. It would be interesting to see what the errors are there. – Tilman Hausherr Sep 28 '22 at 10:17
