2

I'm generating documents using Puppeteer, and I'm trying to have the PDF pass the accessibility report in Acrobat Reader.

I'm getting a "Tab Order - Failed", which I can fix manually by going into Page Options and switching the Tab Order property from "Unspecified" to "Use Document Structure", as in this screenshot:

[screenshot of the Tab Order setting]

Is there a way to do this automatically? I need it for all the PDFs my clients generate. Puppeteer doesn't seem to have an option for it, but I would be happy to use another library that does this, or to understand which part of the PDF I need to change.

Thank you

Andrei

2 Answers

2

The accepted answer by @Foxlab led me to a simpler approach:

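import fs from 'node:fs/promises';
import { PDFDocument, PDFName } from 'pdf-lib';

const pdfData = await fs.readFile('your-pdf-document.pdf');
const pdfDoc = await PDFDocument.load(pdfData);

// Set /Tabs /S on every page so the tab order follows the document structure
pdfDoc.getPages().forEach((page) => {
  page.node.set(PDFName.of('Tabs'), PDFName.of('S'));
});

// Write the result back out (the output file name is only an example)
await fs.writeFile('your-pdf-document-fixed.pdf', await pdfDoc.save());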
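
If the PDF bytes come straight from Puppeteer (for example from `page.pdf()`) rather than from a file, the same idea can be wrapped in a small helper. This is only a sketch: `setStructureTabOrder` and `fixPdfBytes` are names made up for illustration, and the helper takes pdf-lib's `PDFName` class as a parameter purely so it can be exercised without loading the library. It also assumes you want to leave alone any page that already declares a tab order.

```javascript
// Sketch only: both function names are illustrative, not part of pdf-lib.
// `pdfDoc` is assumed to be a loaded pdf-lib PDFDocument and `PDFName`
// the PDFName class exported by pdf-lib.
function setStructureTabOrder(pdfDoc, PDFName) {
  const tabsKey = PDFName.of('Tabs');
  let changed = 0;
  for (const page of pdfDoc.getPages()) {
    // Leave pages alone if they already declare a tab order
    if (page.node.get(tabsKey) !== undefined) continue;
    page.node.set(tabsKey, PDFName.of('S'));
    changed += 1;
  }
  return changed; // number of pages that were modified
}

// Takes PDF bytes and resolves to the modified bytes (a Uint8Array).
async function fixPdfBytes(pdfBytes) {
  const { PDFDocument, PDFName } = await import('pdf-lib');
  const pdfDoc = await PDFDocument.load(pdfBytes);
  setStructureTabOrder(pdfDoc, PDFName);
  return pdfDoc.save();
}
```

With Puppeteer this would be called roughly as `const fixed = await fixPdfBytes(await page.pdf());`, after which `fixed` can be written to disk or sent to the client.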
Alex
  • Hi Alex, this is far better. If you find a way to solve this one, please give me a clue :) https://stackoverflow.com/questions/74867631/pupeteer-generate-non-tagged-svg-path-elements-in-pdf – Foxlab May 11 '23 at 09:51
1

Facing the same problem, I used the JS library "pdf-lib" (https://pdf-lib.js.org/docs/api/classes/pdfdocument) to edit the content of the PDF file and add the missing entry ("Tabs": "S") to pages with annotations.

  const pdfLib = require('pdf-lib');
  const fs = require('fs/promises');

  // Helper that builds a dictionary with Alt and Contents entries (not used below)
  function getNewMap(pdfDoc, str){
    return pdfDoc.context.obj(
      {
        Alt: pdfLib.PDFString.of(str),
        Contents: pdfLib.PDFString.of(str)
      }
    ).dict;
  }

  // Wrapped in an async function because top-level await is not available with require()
  async function main(){
    const pdfData = await fs.readFile('your-pdf-document.pdf');
    const pdfDoc = await pdfLib.PDFDocument.load(pdfData);

    // Assumption: the original snippet did not show how this list was built;
    // here it is every page whose dictionary has an /Annots entry
    const arrPagesWithAnnots = pdfDoc.getPages().filter(_p => _p.node.Annots() !== undefined);

    pdfDoc.context.enumerateIndirectObjects().forEach(_o => {
      const pdfRef = _o[0];
      const pdfObject = _o[1];
      if (typeof pdfObject?.lookup === "function"){
        // We look for "Page" objects
        if (pdfObject.lookup(pdfLib.PDFName.of('Type'))?.encodedName === "/Page"){
          // We look for pages with annotations
          const matchedPage = arrPagesWithAnnots.find(_p => _p.ref.objectNumber === pdfRef.objectNumber);
          if( matchedPage ){
            // We add the "Tabs" entry with the value "S", which means "structure order" (https://opensource.adobe.com/dc-acrobat-sdk-docs/standards/pdfstandards/pdf/PDF32000_2008.pdf)
            // We build a new map containing "Tabs": "S" and assign it to the "Page" object dictionary
            const newTabs = pdfDoc.context.obj({Tabs: "S"}).dict;
            const newdict = new Map([...pdfObject.dict, ...newTabs]);
            pdfObject.dict = newdict;
          }
        }
      }
    });

    // We save the file
    const pdfBytes = await pdfDoc.save();
    await fs.writeFile("print-accessible.pdf", pdfBytes);
  }

  main();
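
Before running the Acrobat checker again, it may help to confirm that the entry actually ended up on each page. The following is a rough sketch, not part of pdf-lib: `pagesMissingStructureTabs` is a hypothetical helper name, and `PDFName` is passed in as a parameter only so the function can be tried without the library installed. It assumes the `Tabs` value is stored directly in the page dictionary rather than behind an indirect reference.

```javascript
// Sketch only: `pagesMissingStructureTabs` is an illustrative name.
// `pdfDoc` is assumed to be a loaded pdf-lib PDFDocument and `PDFName`
// the PDFName class exported by pdf-lib.
function pagesMissingStructureTabs(pdfDoc, PDFName) {
  const tabsKey = PDFName.of('Tabs');
  const structureOrder = PDFName.of('S');
  const missing = [];
  pdfDoc.getPages().forEach((page, index) => {
    // PDFName.of returns a shared instance per name in pdf-lib,
    // so an identity comparison is used here
    if (page.node.get(tabsKey) !== structureOrder) {
      missing.push(index + 1); // 1-based page numbers
    }
  });
  return missing;
}

// Possible usage with pdf-lib (not run here):
// const { PDFDocument, PDFName } = await import('pdf-lib');
// const doc = await PDFDocument.load(await fs.readFile('print-accessible.pdf'));
// console.log(pagesMissingStructureTabs(doc, PDFName));
```

An empty array means every page carries the structure tab order. Note that the snippet above only touches pages with annotations, so pages without any would still be listed here.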
Foxlab