Questions tagged [pikepdf]

pikepdf is a Python library for reading and writing PDF files, built on the qpdf C++ library.

31 questions
0 votes, 1 answer

Copying the outlines from PDF1 to PDF2 using the pikepdf module

I wrote some code that reorders the pages of PDF1 and saves them to PDF2: import pikepdf def Main(): with pikepdf.open("pdf1.pdf") as sourcePDF: sourcePages = sourcePDF.pages targetPDF = pikepdf.Pdf.new() targetPages =…
nidias57
0 votes, 1 answer

How to Concatenate PDFs via Pikepdf and Python without Unnecessary Disk Read-Write?

Current technology stack: img2pdf==0.4.4, pikepdf==7.1.2, Python 3.10, Ubuntu 22.04. The requirement: a PDF file (let's call it static.pdf) exists on disk. Another PDF (let's call it dynamic.pdf) is being generated dynamically in memory with img2pdf…
Della
0 votes, 2 answers

Is there a way to parse the form fields of signed PDFs e.g. using Python or Java and write them to a CSV?

I would like to parse form fields from signed PDFs. By this I mean, for example, the checkboxes. I have already tried different approaches in Python, such as PyPDF2, pikepdf and even pdfminer, but I only get the letters out and not the form fields. If…
0 votes, 0 answers

Batch Split A3 pdfs (with two A4 pages side by side) into A4 pages and reorder them

Can anyone change any of the code on this page, https://stackoverflow.com/questions/13345593/split-each-pdf-page-in-two, to split a document of A3 pages into A4 pages so that they come out in order? I need to do this to a whole folder of PDFs so…
B Boxall
0 votes, 0 answers

Remove Black Rectangles from PDF using Python (PikePDF or PyPDF2)

Please help me surprise my wife with a useful PDF of her iMessage chain with her now-deceased grandmother. Apple Messages allows you to print conversations to PDF. You have to manually scroll to the top of the conversation on the Mac, over and over…
0 votes, 0 answers

How does one extract the actual text from pdf lines with an unrecognized encoding?

To set the stage, I am using pikepdf. Before extracting from a PDF, I first upgrade it to PDF/A using Ghostscript. In PDF/A format, I can easily render it to see the text. The PDF is also a "true" PDF in the sense that everything is structured except for…
Chris
0 votes, 0 answers

Can I decrypt a PDF stream from Azure Blob Storage that uses an encoding other than UTF-8 with pikepdf?

I am accessing an encrypted PDF file saved in Blob Storage on Azure. I get the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 10: invalid continuation byte when I run the following code: date =…
Eliot Kim
0 votes, 0 answers

Transform text contents of a PDF

I have a PDF with multiple misaligned text blocks. I am trying to generate a new PDF with the text aligned according to a known transformation matrix. I can use PyMuPDF (fitz) to extract the text information from the source PDF and insert the…
asymptote
0 votes, 0 answers

Installing pikepdf on Elastic Beanstalk

I first tried adding pikepdf to requirements.txt, which failed with: src/qpdf/annotation.cpp:14:10: fatal error: qpdf/QPDFAnnotationObjectHelper.hh: No such file or directory. I think it was missing a dependency, so I tried installing qpdf using yum install…
bones225
0 votes, 3 answers

Encrypt a PDF file so that copying content or editing is not allowed

How can I encrypt a document so that editing its text and copying its content are not allowed? I tried setting different user and admin passwords, but I was still able to edit the text in a PDF editor. import pikepdf from pikepdf…
Kris
0 votes, 1 answer

Not able to import pikepdf after having successfully installed it in my virtual environment

I installed pikepdf in my virtual environment from the Anaconda Prompt. However, when I try to import it in my Jupyter notebook, it says "no module named 'pikepdf'". I tried upgrading pip and closing and reopening Jupyter, but nothing seems to work. Is…
Ren
0 votes, 2 answers

How to open multiple encrypted PDFs and save them without a password in Python

I am a newbie who has just started learning Python as my first language. I am trying to write code to open multiple encrypted PDF files and save them without a password. All files are in a folder, and I have a CSV file filePassword.csv with columns filename and…
Baldev
0 votes, 0 answers

cannot import name 'etree' from 'lxml' in Homebrew-installed package but fine in Python shell

I'm trying to run ocrmypdf, which was installed via Homebrew, but am having issues with my local version of lxml (version 4.2.4): Traceback (most recent call last): File "/usr/local/bin/ocrmypdf", line 5, in <module> from ocrmypdf.__main__…
Drivebyluna
-1 votes, 1 answer

pikepdf installation failure only when using PyInstaller

I am trying to use PyInstaller to turn all my scripts into an exe file. I am just using auto-py-to-exe, which I believe uses PyInstaller under the hood. However, I've come across this issue: Traceback (most recent call last): File…
-1 votes, 1 answer

gcc 9.3.0 preprocessor under Cygwin: cmdline -Dname but name seems to be undefined

I'm trying to build OCRmyPDF under Cygwin and have run into a brick wall. While I've been a developer my entire career, I've worked mostly in Java and have little knowledge of Python internals and C++. The problem might be obvious to an expert in…
Jim Garrison