I have a web app that generates quite large PDF documents, possibly 100 pages or more.
The workflow is this:
- Generate HTML using nunjucks templates
- Open a puppeteer browser
- Create PDF front page (see code below)
- Create the remaining PDF pages
- Merge the pages into a single document and create a buffer
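Simplified, the relevant code looks roughly like this: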
import puppeteer from 'puppeteer';
import nunjucks from 'nunjucks';
import { PDFDocument } from 'pdf-lib';
const pdfHtml = await nunjucks.render(...);
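// Launch a headless Chromium instance (these args are typically used in containerised environments)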
const theBrowser = await puppeteer.launch({
args: [
'--disable-dev-shm-usage',
'--no-first-run',
'--no-sandbox',
'--no-zygote',
'--single-process',
],
headless: true
});
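// Load the rendered HTML into a new page and wait for network activity to settle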
const page = await theBrowser.newPage();
await page.setContent(pdfHtml, { waitUntil: 'networkidle0' });
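// Print page 1 separately so the cover can use its own layout (no footer)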
const frontPage: Buffer = await page.pdf({
... someOptions,
pageRanges: '1'
});
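// Print the remaining pages with the footer template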
const pdfPages: Buffer = await page.pdf({
... someOptions,
pageRanges: '2-',
footerTemplate: ...,
});
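// Merge the cover and the body pages into one document with pdf-lib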
const pdfDoc = await PDFDocument.create();
const coverDoc = await PDFDocument.load(frontPage);
const [coverPage] = await pdfDoc.copyPages(coverDoc, [0]);
pdfDoc.addPage(coverPage);
const mainDoc = await PDFDocument.load(pdfPages);
for (let i = 0; i < mainDoc.getPageCount(); i++) {
const [aMainPage] = await pdfDoc.copyPages(mainDoc, [i]);
pdfDoc.addPage(aMainPage);
}
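// Serialise the merged document into a single Buffer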
const pdfBytes = Buffer.from(await pdfDoc.save());
await theBrowser.close();
// handle the bytes here
When the PDF gets really big, this operation takes quite a while and uses a lot of memory, stalling the API until it completes. What can I do to optimize this? Or are there other tools I can use to avoid stalling the API?