Current technology stack
- img2pdf==0.4.4
- pikepdf==7.1.2
- Python 3.10
- Ubuntu 22.04
The requirement
A PDF file (let's call it static.pdf) exists on disk. Another PDF (let's call it dynamic.pdf) is generated dynamically in memory with the img2pdf library, depending on some user input parameters.
The task is to concatenate these two PDFs into a single one (static.pdf, then dynamic.pdf) and send it as an email attachment via the SMTP library.
Current Solution I am Employing
This is based on the pikepdf documentation.
- Dump dynamic.pdf to disk
- Read static.pdf from disk with pikepdf
- Read dynamic.pdf from disk with pikepdf
- Concatenate them with the list-like API provided by pikepdf; let's call the result final.pdf
- Dump final.pdf to disk with the pikepdf API
- Read it back from disk as bytes with open(file='final.pdf', mode='rb')
- Attach the bytes to the email message
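In code, the steps above look roughly like the following. This is a simplified sketch rather than the exact production code: the function names, the work_dir parameter, and dynamic_bytes (standing in for the PDF bytes that img2pdf produced in memory) are placeholders chosen for illustration.

```python
from email.message import EmailMessage
from pathlib import Path


def build_final_pdf_via_disk(static_path, dynamic_bytes, work_dir):
    """Concatenate static.pdf and the in-memory dynamic PDF through temporary files."""
    # pikepdf is third-party; it is imported here so that attach_pdf below
    # can be used even where pikepdf is not installed.
    import pikepdf

    work_dir = Path(work_dir)
    dynamic_path = work_dir / "dynamic.pdf"
    final_path = work_dir / "final.pdf"

    # Step 1: dump dynamic.pdf to disk.
    dynamic_path.write_bytes(dynamic_bytes)

    # Steps 2 and 3: read both PDFs from disk with pikepdf.
    with pikepdf.open(static_path) as static_pdf, pikepdf.open(dynamic_path) as dynamic_pdf:
        # Step 4: concatenate with the list-like pages API (static first, then dynamic).
        static_pdf.pages.extend(dynamic_pdf.pages)
        # Step 5: dump final.pdf to disk while the source PDF is still open.
        static_pdf.save(final_path)

    # Step 6: read final.pdf back from disk as bytes.
    return final_path.read_bytes()


def attach_pdf(message, pdf_bytes, filename="final.pdf"):
    """Step 7: attach the PDF bytes to an email.message.EmailMessage."""
    message.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=filename
    )
    return message
```

That is two writes and three reads on disk, of which only the read of static.pdf is really needed.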
What I want
Remove all the unnecessary disk I/O, since I already have dynamic.pdf in memory and the final result only needs to be attached to the email as bytes (no need to persist it on disk). So ideally, the only disk operation should be reading static.pdf.
But I cannot find much information on the pikepdf site about in-memory concatenation. Moreover, I am not certain whether a pikepdf.Pdf object can expose the exact same bytes as what I would get if I dumped the PDF to disk and then read it back using Python's native open function.
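The shape I am hoping for is something like the sketch below. It assumes that pikepdf.open and Pdf.save accept file-like objects such as io.BytesIO in place of file names; concat_in_memory is a made-up name for illustration, and the sketch makes no claim that the resulting bytes are identical to a disk round-trip.

```python
import io


def concat_in_memory(static_path, dynamic_bytes):
    """Return static.pdf followed by the in-memory dynamic PDF, as bytes."""
    # pikepdf is third-party; assumed to be installed.
    import pikepdf

    output = io.BytesIO()
    # The only disk access is reading static.pdf; dynamic.pdf is wrapped
    # in a BytesIO (assumption: pikepdf.open accepts a readable stream).
    with pikepdf.open(static_path) as static_pdf, \
            pikepdf.open(io.BytesIO(dynamic_bytes)) as dynamic_pdf:
        static_pdf.pages.extend(dynamic_pdf.pages)
        # Assumption: Pdf.save accepts a writable stream as well as a path.
        static_pdf.save(output)
    return output.getvalue()
```

The returned bytes could then be attached to the email message directly, without ever writing dynamic.pdf or final.pdf to disk.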
So any ideas around this would be helpful, including other libraries that offer this functionality. The constraints on other libraries would be:
- Plays well with my tech stack (Python and Ubuntu; it also needs to run on Windows)
- FOSS, and trustworthy enough