pikepdf is a Python library for reading and writing PDF files, built on the qpdf library.
Questions tagged [pikepdf]
31 questions
0
votes
1 answer
Copying the outlines from PDF1 to PDF2 using the pikepdf module
I wrote some code that reorders the pages of PDF1 and saves them to PDF2:
import pikepdf

def Main():
    with pikepdf.open("pdf1.pdf") as sourcePDF:
        sourcePages = sourcePDF.pages
        targetPDF = pikepdf.Pdf.new()
        targetPages =…
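The page-reordering step described in this question can be sketched as follows. This is a minimal illustration rather than the asker's code, which is truncated here: the names `validate_order` and `reorder_pages` and the `order` argument are assumptions made for the example. Outlines (bookmarks) are stored in the document catalog rather than on the pages, so copying pages this way does not bring them along.

```python
from typing import Sequence


def validate_order(order: Sequence[int], page_count: int) -> list[int]:
    """Return `order` as a list after checking it is a permutation of 0..page_count-1."""
    if sorted(order) != list(range(page_count)):
        raise ValueError(f"order must be a permutation of 0..{page_count - 1}")
    return list(order)


def reorder_pages(source_path: str, target_path: str, order: Sequence[int]) -> None:
    """Write the pages of source_path to target_path in the given zero-based order."""
    # Imported inside the function so validate_order can be used without pikepdf installed.
    import pikepdf

    with pikepdf.open(source_path) as source_pdf:
        checked = validate_order(order, len(source_pdf.pages))
        target_pdf = pikepdf.Pdf.new()
        for index in checked:
            # Appending a page from another Pdf copies it into the target.
            target_pdf.pages.append(source_pdf.pages[index])
        # Save while the source is still open, since the pages originate there.
        target_pdf.save(target_path)
```

For example, `reorder_pages("pdf1.pdf", "pdf2.pdf", [2, 0, 1])` would write the third page first, assuming `pdf1.pdf` has exactly three pages.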

nidias57
0
votes
1 answer
How to Concatenate PDFs via Pikepdf and Python without Unnecessary Disk Read-Write?
Current technology stack
img2pdf==0.4.4
pikepdf==7.1.2
Python 3.10
Ubuntu 22.04
The requirement
A PDF file (let's call it static.pdf) exists on disk. Another PDF (let's call it dynamic.pdf) is being generated dynamically in memory with img2pdf…
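One way to approach this without writing the in-memory PDF to disk is to open it from a `BytesIO` stream, since pikepdf can open and save file-like objects. The sketch below assumes the dynamic PDF is already available as `bytes` (for example, as produced in memory by img2pdf); the function name `concatenate_pdfs` and its parameters are hypothetical.

```python
import io


def concatenate_pdfs(static_path: str, dynamic_bytes: bytes) -> bytes:
    """Return a PDF (as bytes) holding the pages of static_path followed by dynamic_bytes."""
    # Imported inside the function so the module loads even where pikepdf is absent.
    import pikepdf

    with pikepdf.open(static_path) as static_pdf, \
            pikepdf.open(io.BytesIO(dynamic_bytes)) as dynamic_pdf:
        merged = pikepdf.Pdf.new()
        merged.pages.extend(static_pdf.pages)
        merged.pages.extend(dynamic_pdf.pages)
        buffer = io.BytesIO()
        # Save while both sources are still open.
        merged.save(buffer)
    return buffer.getvalue()
```

Only the static file is read from disk here; the merged result stays in memory until the caller decides what to do with the returned bytes.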

Della
0
votes
2 answers
Is there a way to parse the form fields of signed PDFs e.g. using Python or Java and write them to a CSV?
I would like to parse form fields from signed PDFs. By this I mean, for example, the checkboxes. I have already tried different approaches in Python, such as PyPDF2, pikepdf, and even pdfminer, but I only get the text out and not the form fields. If…
0
votes
0 answers
Batch Split A3 pdfs (with two A4 pages side by side) into A4 pages and reorder them
Can anyone adapt the code on this page
https://stackoverflow.com/questions/13345593/split-each-pdf-page-in-two
to split a document of A3 pages into A4 pages so that they come out in order? I need to do this for a whole folder of PDFs so…
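The geometric part of such a split is simple arithmetic: each landscape A3 page box is cut at its horizontal midpoint into a left and a right half. The sketch below shows only that calculation; the name `split_box_vertically` is hypothetical, and the required output order is not specified in the excerpt, so no ordering is assumed.

```python
Box = tuple[float, float, float, float]  # (x0, y0, x1, y1) in PDF points


def split_box_vertically(box: Box) -> tuple[Box, Box]:
    """Split a page box into left and right halves of equal width."""
    x0, y0, x1, y1 = box
    mid = (x0 + x1) / 2
    left = (x0, y0, mid, y1)
    right = (mid, y0, x1, y1)
    return left, right


# A landscape A3 page is roughly 1190 x 842 points; each half is roughly A4 portrait.
left_half, right_half = split_box_vertically((0, 0, 1190, 842))
```

Each half could then be applied as the page box of a separate copy of the original page, which is the general idea behind the answers on the linked page.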

B Boxall
0
votes
0 answers
Remove Black Rectangles from PDF using Python (PikePDF or PyPDF2)
Please help me surprise my wife with a useful PDF of her iMessage chain with her now deceased grandmother.
Apple Messages allows you to print conversations to PDF. You have to manually scroll to the top of the message on the Mac, over and over…

user1379634
0
votes
0 answers
How does one extract the actual text from pdf lines with an unrecognized encoding?
To set the stage, I am using pikepdf. When extracting a pdf, I have first upgraded it to PDF/A using ghostscript.
In PDF/A format, I can easily render it to see text. The PDF is also a "True" Pdf in the sense that everything is structured except for…

Chris
0
votes
0 answers
Can I decrypt a PDF stream from Azure Blob Storage that is not UTF-8 encoded with pikepdf?
I am accessing an encrypted pdf file saved in blob storage on Azure. I get the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 10: invalid continuation byte when I run the following code:
date =…
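The asker's code is truncated, so the cause cannot be confirmed, but this error message is what Python produces when raw PDF bytes are decoded as text. Many PDFs begin with a `%PDF-1.x` line followed by a comment containing high-bit bytes, which are not valid UTF-8. The sketch below reproduces that with an illustrative header (the bytes in `pdf_header` are an assumption, not taken from the asker's file) and shows the bytes being wrapped in a binary stream instead.

```python
import io

# Illustrative first bytes of a PDF: a version line, then a binary marker comment.
pdf_header = b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n"

decode_error = None
try:
    # Treating PDF data as UTF-8 text fails on the binary marker bytes.
    pdf_header.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print(exc)

# Keeping the data as bytes and wrapping it in BytesIO avoids any decoding step.
stream = io.BytesIO(pdf_header)
```

A stream like this, holding the complete downloaded blob, can be passed to `pikepdf.open` together with the `password` argument, so the data never needs to be converted to `str`.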

Eliot Kim
0
votes
0 answers
Transform text contents of a PDF
I have a PDF with multiple text blocks which are misaligned. I am trying to generate a new PDF with aligned text as per my transformation matrix (known). I can use PyMuPDF (fitz) to extract the text information from the source PDF and insert the…

asymptote
0
votes
0 answers
Installing pikepdf on Elastic Beanstalk
I first tried adding pikepdf to requirements.txt, but the build failed with:
src/qpdf/annotation.cpp:14:10: fatal error: qpdf/QPDFAnnotationObjectHelper.hh: No such file or directory
I think it was missing a dependency. So I tried installing qpdf using yum install…

bones225
0
votes
3 answers
Encrypt a PDF file so that copying content or editing is not allowed
How can I encrypt a document so that editing its text and copying its content are not allowed?
I tried setting different user and admin passwords, but I was still able to edit the text in a PDF editor.
import pikepdf
from pikepdf…
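For reference, pikepdf expresses these restrictions through `pikepdf.Permissions`, passed to `pikepdf.Encryption` via its `allow` argument. The sketch below is one possible arrangement, not the asker's code (which is truncated); the function name `save_restricted` and its parameters are hypothetical. Permission flags are requests that PDF software is expected to honor, so an editor that ignores them can still modify the file.

```python
def save_restricted(source_path: str, target_path: str,
                    owner_password: str, user_password: str = "") -> None:
    """Save a copy of source_path with copying and editing permissions switched off."""
    # Imported inside the function so the module loads even where pikepdf is absent.
    import pikepdf

    permissions = pikepdf.Permissions(
        extract=False,            # disallow copying text and images
        modify_annotation=False,  # disallow adding or changing annotations
        modify_assembly=False,    # disallow inserting, deleting or rotating pages
        modify_form=False,        # disallow filling in form fields
        modify_other=False,       # disallow other edits to the document
    )
    encryption = pikepdf.Encryption(
        owner=owner_password,
        user=user_password,  # an empty user password lets anyone open the file
        allow=permissions,
    )
    with pikepdf.open(source_path) as pdf:
        pdf.save(target_path, encryption=encryption)
```

The owner password should differ from the user password; otherwise anyone who can open the file also holds owner rights.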

Kris
0
votes
1 answer
Not able to import pikepdf after having successfully installed it in my virtual environment
I installed pikepdf in my virtual environment in my anaconda prompt. However, when I try to import it in my Jupyter notebook, it says "no module named 'pikepdf'".
I tried upgrading pip and closing and reopening Jupyter, but nothing seems to work.
Is…
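A common cause of this symptom is that the notebook kernel runs a different Python interpreter from the one where the package was installed. The check below, run in a notebook cell, shows which interpreter is active and whether it can see a given module; the helper name `describe_environment` is hypothetical.

```python
import importlib.util
import sys


def describe_environment(module_name: str) -> dict:
    """Report the running interpreter and whether it can find module_name."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,            # path of the Python running this code
        "module_found": spec is not None,         # True if the module is importable here
        "location": getattr(spec, "origin", None),
    }


print(describe_environment("pikepdf"))
```

If the `interpreter` path is not inside the virtual environment where pip ran, the package was installed for a different Python. Running `%pip install pikepdf` in a notebook cell installs into the kernel's own environment.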

Ren
0
votes
2 answers
How to open multiple encrypted PDFs and save them without a password in Python
I am a newbie who has just started with Python as my first language.
I am trying to write code to open multiple encrypted PDF files and save them without a password.
All files are in a folder, I have a csv file filePassword.csv with columns filename and…
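A minimal sketch of this task follows: read the filename-to-password mapping with the `csv` module, then open each file with its password and save it again. The second column name is cut off in the excerpt, so `password` is a placeholder assumption, as are the function names and the separate output folder.

```python
import csv
from pathlib import Path


def load_passwords(csv_path, filename_column="filename",
                   password_column="password") -> dict[str, str]:
    """Read a CSV with a header row into a {filename: password} mapping."""
    # "password" is an assumed column name; adjust it to match the real file.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {row[filename_column]: row[password_column]
                for row in csv.DictReader(handle)}


def decrypt_folder(folder, csv_path, output_folder) -> None:
    """Save an unencrypted copy of every PDF listed in the CSV."""
    # Imported inside the function so load_passwords works without pikepdf installed.
    import pikepdf

    out_dir = Path(output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    for filename, password in load_passwords(csv_path).items():
        with pikepdf.open(Path(folder) / filename, password=password) as pdf:
            # Saving without an encryption argument writes the file unencrypted.
            pdf.save(out_dir / filename)
```

Writing to a separate output folder keeps the encrypted originals intact in case a password in the CSV turns out to be wrong.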

Baldev
0
votes
0 answers
cannot import name 'etree' from 'lxml' in Homebrew-installed package but fine in Python shell
I'm trying to run ocrmypdf which was installed via homebrew but am having issues with my local version of lxml (version 4.2.4):
Traceback (most recent call last):
  File "/usr/local/bin/ocrmypdf", line 5, in <module>
    from ocrmypdf.__main__…

Drivebyluna
-1
votes
1 answer
PikePdf installation failure only when using pyinstaller
I am trying to use pyinstaller to turn all my scripts into an exe file. I am just using auto-py-to-exe, which I believe uses pyinstaller under the hood.
However, I've come across this issue:
Traceback (most recent call last):
File…

danielliucs
-1
votes
1 answer
gcc 9.3.0 preprocessor under Cygwin: cmdline -Dname but name seems to be undefined
I'm trying to build OCRmyPDF under Cygwin and have run into a brick wall. While I've been a developer my entire career, I've worked mostly in Java and have little knowledge of Python internals and C++. The problem might be obvious to an expert in…

Jim Garrison