I am trying to render a PDF into memory using @react-pdf/renderer's pdf method and then feed the resulting bytes into pdf-lib's PDFDocument object. I'm doing this because react-pdf doesn't allow embedding (merging) other PDFs, and the merging library we're using (pdf-merger-js) doesn't return the page numbers of the merged PDF, so we can't properly render our table of contents.

What happens is that after taking the output of @react-pdf/renderer.pdf(...), encoding it as base64, and passing it to pdf-lib.PDFDocument.load(...), the browser preview of the resulting PDF shows a bytestring like the one below. We had no issue creating this PDF preview before we introduced pdf-lib:

[Screenshot: bytestring-pdf]

The code that is doing this is as follows (toc == Table of Contents, both arguments are collections of react-pdf elements):

const mergePdfsToObjectUrl = async (toc: ReactElement, pdfPages: any[]) => {
  if (toc && pdfPages?.length > 0) {
    const pdfString = await pdf(toc);
    const string = await pdfString.toString();
    const base64string = encode(string);
    const tocPdf = await PDFDocument.load(base64string);

    // Create a new document
    const doc = await PDFDocument.create();

    // Add individual content pages
    const tocPdfPages = await doc.copyPages(tocPdf, tocPdf.getPageIndices());
    for (const page of tocPdfPages) {
      doc.addPage(page);
    }

    // Write the PDF to a file
    const pdfBytes = await doc.save();
    const url = URL.createObjectURL(new Blob([pdfBytes]));
    return url;
  }
};

I've tried everything from different codecs to different pipelines for transforming the react-pdf data into a usable bytestring, but nothing seems to work.

asdf
  • @KJ what else would you like me to include here? The issue seems to be that when `pdf-lib` is rendering the PDF from the stream outputted by `react-pdf` it is not reading it as a PDF, but rather as the string contents of that PDF. How do I get this to render correctly as a PDF? – asdf May 23 '22 at 18:27
  • Yeah that makes sense, my current thought is that the output of `pdf(toc)` is returning some primitive that gets translated into a PDF after `URL.createObjectURL(new Blob([pdfBytes]))` is called, but is not recognizable as a pdf by `pdf-lib` after using the `pdf` method from `react-pdf`. However, this is where I'm stuck as I'm not sure how this could be the case or why it is done this way. What would be the best way to test and verify this in your mind @KJ? – asdf May 24 '22 at 08:46
  • So the `pdfString` object has the ability to cast to a string, blob, and arraybuffer. We've tried each but are thinking that the binary of the pdf is incorrect somehow? Or does the PDF output here look correct to you and it's likely just a lossy encoding issue? Our team isn't that familiar with the PDF protocol so it's difficult for us to discern whether the binary itself is correct. – asdf May 25 '22 at 03:53

1 Answer

I'm having a similar issue, but I have managed to get the PDF to display.

// Put your base64string into a Buffer, then load it into a pdf-lib document
const base64string = encode(string);
const pdfBuffer = Buffer.from(base64string); // you need this line
const tocPdf = await PDFDocument.load(pdfBuffer);

Then, when saving, convert to a Uint8Array, wrap it in a Blob, and create the URL as you did:

const pdfBytes = await doc.save();
const bytes  = new Uint8Array( pdfBytes ); // you need this
const blob   = new Blob( [ bytes ], { type: "application/pdf" } ); // also add this
const url = URL.createObjectURL( blob );
return url

Now you can pass url to an iframe, <iframe src={url} title="myPdf"/>, and this should display correctly. The only issue I have run into is that images in the PDF generated by react-pdf don't show up, but everything else displays OK.
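If it helps with debugging, here is a minimal sketch of two small checks that can be run on the bytes before they go into pdf-lib and after doc.save(). It assumes an environment where Blob and TextEncoder are available as globals (browsers, or a recent Node). The names looksLikePdf and toPdfBlob are hypothetical helpers made up for this example; they are not part of react-pdf or pdf-lib. The first checks for the "%PDF-" signature that a PDF file normally starts with, which should tell a real PDF apart from a base64 or otherwise re-encoded copy. The second wraps the bytes in a Blob with an explicit MIME type, as in the snippet above.

```typescript
// Hypothetical helpers for illustration; not part of react-pdf or pdf-lib.

// ASCII codes for "%PDF-", the signature at the start of a PDF file.
const PDF_SIGNATURE = [0x25, 0x50, 0x44, 0x46, 0x2d];

// Returns true when the byte array begins with the "%PDF-" signature.
// Base64 text or a mangled string copy of a PDF would fail this check.
function looksLikePdf(bytes: Uint8Array): boolean {
  if (bytes.length < PDF_SIGNATURE.length) {
    return false;
  }
  return PDF_SIGNATURE.every((expected, i) => bytes[i] === expected);
}

// Wraps PDF bytes in a Blob that carries an explicit MIME type, so a
// consumer of the object URL does not have to guess the content type.
function toPdfBlob(bytes: Uint8Array): Blob {
  // Copy into a fresh Uint8Array, mirroring the answer's extra step.
  const copy = new Uint8Array(bytes);
  return new Blob([copy], { type: "application/pdf" });
}
```

For example, calling looksLikePdf(pdfBytes) right after doc.save() and looksLikePdf on whatever is passed to PDFDocument.load would show at which step the data stops being a PDF, and URL.createObjectURL(toPdfBlob(pdfBytes)) gives the URL for the iframe.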