Questions tagged [pymupdf]

PyMuPDF is a Python binding for MuPDF – “a lightweight PDF and XPS viewer”. MuPDF can access files in PDF, XPS, OpenXPS, CBZ (comic book archive), FB2 and EPUB (e-book) formats. NOTE: It is imported in Python as fitz.

Supported files have the extensions .pdf, .xps, .oxps, .cbz, .fb2 or .epub, so PyMuPDF can also be used to develop e-book viewers in Python.

PyMuPDF provides access to many important functions of MuPDF from within a Python environment.

Note on the Name fitz:

The standard Python import statement for this library is import fitz. The name is historical: Fitz is the graphics library that became MuPDF's rendering engine.

257 questions

votes

2 answers

Crop PDF content with Python, not just the cropbox

I am trying to create a script that crops parts of a PDF, merges them into a single page, and saves the result to another PDF file. The problem is that when I change the crop box and merge the page, it keeps the cropped data and just hides it. This…

python crop pypdf pymupdf

asked Dec 30 '22 at 16:40

Igor Micadei

votes

1 answer

How to make an inserted text visible in pdf using pyMuPdf

I have inserted text into an existing PDF document using the page.insert_text function of pyMuPdf. However, after saving the document, the inserted text is not visible on the page at that location. There is an image that appears in the foreground and the…

python pdf pymupdf

asked Dec 28 '22 at 08:35

wndev1

votes

1 answer

Convert Rect location from pymupdf to a page number

I search for certain text like "exam" and get its rectangle location. I then highlight the text in the PDFs at that location. I now want to delete all other pages that do not have…

python pdf pymupdf

asked Dec 22 '22 at 22:04

GCIreland
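
A minimal sketch of one way to approach the question above: the page number does not have to be derived from the Rect, because every search runs on a known page. The helper name pages_with_hits is hypothetical. It only assumes page objects that offer search_for(text) and return a list of rectangles, as PyMuPDF's Page.search_for does. The stub class is an assumption that stands in for a real document so the example runs without a PDF.

```python
def pages_with_hits(pages, needle):
    """Return the 0-based indexes of pages on which `needle` is found."""
    keep = []
    for pno, page in enumerate(pages):
        if page.search_for(needle):  # an empty list means no hit on this page
            keep.append(pno)
    return keep


class StubPage:
    """Stand-in for a PDF page (assumption, for illustration only)."""

    def __init__(self, text):
        self.text = text

    def search_for(self, needle):
        # A real page returns rectangles; one placeholder tuple is enough here.
        return [(0, 0, 1, 1)] if needle in self.text else []


pages = [StubPage("intro"), StubPage("final exam"), StubPage("appendix")]
keep = pages_with_hits(pages, "exam")
print(keep)
```

With a real document, passing the resulting list to doc.select(keep) should retain only those pages, assuming doc is an open PyMuPDF document.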

votes

1 answer

Extract Text in Natural reading order using pymupdf (fitz)

I am trying to extract the text using pymupdf (fitz) by following this tutorial: https://towardsdatascience.com/extracting-headers-and-paragraphs-from-pdf-using-pymupdf-676e8421c467 Instead of blocks = page.getText("dict")["blocks"] I wrote blocks =…

python pdf text-extraction pymupdf

asked Dec 20 '22 at 02:33

user116936

votes

2 answers

How to close a pdf opened with fitz if I've replaced its variable name?

This is a simple issue. I use Jupyter Notebook for Python and usually deal with PDFs using pymupdf. I usually define pdf = fitz.open('dir/to/file.pdf') but sometimes I forget to close the file before I redefine pdf =…

python pdf pymupdf

asked Dec 10 '22 at 20:51

José Chamorro

votes

1 answer

How to make Python Fitz detect hyphens when using search_for?

I'm new to the Fitz library and am working on a project where I need to find a string in a PDF page. I'm running into a case where the text on the page that I'm searching on is hyphenated. I am aware of the TEXT_DEHYPHENATE flag that I can use in…

python pymupdf python-pdfkit python-pdfreader

asked Dec 01 '22 at 20:03

Kevin Wu
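
As a rough illustration of the hyphenation problem in the question above, the sketch below joins words split by a hyphen at a line end before searching. It is plain string processing on already extracted text, not what the TEXT_DEHYPHENATE flag does internally, and the function name dehyphenate and the sample string are assumptions.

```python
import re


def dehyphenate(text):
    """Join words that were split by a hyphen at the end of a line."""
    # Remove "-\n" only when it sits between two word characters,
    # so hyphens inside a line (as in "well-known") are kept.
    return re.sub(r"(?<=\w)-\n(?=\w)", "", text)


sample = "Sensitive infor-\nmation in a well-known file"
joined = dehyphenate(sample)
print("information" in joined)
```

A search on the joined string finds the whole word, but it yields no page coordinates, so this only helps when the text itself is what matters.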

votes

1 answer

Problem with the 'deflate' parameter of Pymupdf and Acrobat Reader

My program is redacting sensitive information from PDF files. While saving the redacted PDF, I'm passing a few parameters to avoid exporting oversized files: doc.save( file_path, permissions=fitz.PDF_PERM_PRINT, owner_pw="owner", …

python pdf pdf-generation acrobat pymupdf

asked Nov 29 '22 at 10:41

junsuzuki

votes

0 answers

Can't get the text from pdf

When I try to parse the PDF, I can't get its content; I get random symbols and characters instead. What is the reason behind it? This should give the proper text. I have also tried PyPDF2 and still cannot get the text. filename =…

pdf text extract text-extraction pymupdf

asked Nov 28 '22 at 13:07

Hemil Parmar

votes

2 answers

PyMuPDF - How to extract data from unstructured PDFs using PyMuPDF in Python?

I am following this guide on how to extract data from Unstructured PDFs using PyMuPDF. https://www.analyticsvidhya.com/blog/2021/06/data-extraction-from-unstructured-pdfs/ I am getting an AttributeError: 'NoneType' object has no attribute 'rect'…

python csv pdf pypdf pymupdf

asked Oct 31 '22 at 21:54

Mech_Saran

votes

1 answer

PyMuPDF: skipping bad link / annot item 0

I use PyMuPDF's insert_link to add links to a PDF. But when I do it, I sometimes get the warning skipping bad link / annot item 0. When I highlight the same rect with add_highlight_annot the area is highlighted. There is just no link. This happens…

python pymupdf

asked Oct 20 '22 at 17:24

Mazze

votes

2 answers

Is there any way to find text using dimensions using pymupdf?

import fitz doc = fitz.open("") for page in doc: print(page.search_for("Bank Account")) This program gets the dimensions (rectangles) of a given text. I want to do the reverse: find text using its dimensions.

python pymupdf pdf-extraction

asked Oct 12 '22 at 06:16

chintan bhimani
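
The reverse lookup asked about above can be sketched as a filter over word boxes. The sample tuples are hypothetical data laid out like the items returned by page.get_text("words"), that is (x0, y0, x1, y1, word, block_no, line_no, word_no). The helper name words_in_rect is an assumption.

```python
def words_in_rect(words, rect):
    """Return the words whose boxes lie completely inside `rect`."""
    rx0, ry0, rx1, ry1 = rect
    return [
        w[4]
        for w in words
        if w[0] >= rx0 and w[1] >= ry0 and w[2] <= rx1 and w[3] <= ry1
    ]


# Hypothetical sample data; a real page would supply these tuples.
words = [
    (72.0, 100.0, 98.0, 112.0, "Bank", 0, 0, 0),
    (101.0, 100.0, 145.0, 112.0, "Account", 0, 0, 1),
    (72.0, 300.0, 110.0, 312.0, "Footer", 1, 0, 0),
]
found = " ".join(words_in_rect(words, (70, 95, 150, 115)))
print(found)
```

With a real page, page.get_text("words") should supply the list, and page.get_textbox(rect) is a built-in alternative worth checking in the PyMuPDF documentation.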

votes

2 answers

Python - Go through only 5 pages at one time in PyMuPdf Fitz

I want to iterate through the last 5 pages of a PDF in PyMuPdf, and ask the user whether they want to iterate through 5 more pages. I came across the reversed method in PyMuPdf, but it doesn't take a parameter limiting it to only 5 pages. Example,…

python pymupdf

asked Sep 26 '22 at 09:24

donny
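
One way to page backwards in groups of five, as the question above describes, is to generate the page numbers first and load pages only when needed. This is a sketch in plain Python; the generator name last_page_batches is hypothetical and the page count of 12 is an arbitrary example value.

```python
def last_page_batches(page_count, size=5):
    """Yield lists of 0-based page numbers, starting from the end."""
    stop = page_count
    while stop > 0:
        start = max(stop - size, 0)
        # Count down from the last page of this batch to its first.
        yield list(range(stop - 1, start - 1, -1))
        stop = start


batches = list(last_page_batches(12))
print(batches)
```

Because it is a generator, a loop can ask the user for confirmation between batches and simply stop iterating on a "no". Each number can then be used to load a page, for example with doc[pno] on an open document.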

votes

0 answers

form fields are not showing values when filling form with pymupdf

I have a template PDF https://www.irs.gov/pub/irs-pdf/f2848.pdf whose fields I want to fill with CSV data. My script is: template = '..\\..\\02. Inputs\\f2848.pdf' doc=fitz.open(template) df = pd.read_csv('..\\..\\02. Inputs\\ 2848…

python pdf pymupdf

asked Sep 16 '22 at 14:29

katy

votes

0 answers

Running setup.py install for pymupdf did not run successfully

I am attempting to install PyMuPDF on my Mac in a Jupyter Notebook, and when I run the command pip install PyMuPDF I receive back the following error: Running setup.py install for pymupdf did not run successfully. note: This is an issue with the…

python pdf pymupdf pdf2image

asked Sep 14 '22 at 21:42

AScientist1096

votes

2 answers

Python AttributeError: 'Page' object has no attribute 'insertImage'

I'm trying to add a PNG sign to a PDF using Python. I am using PyMuPDF (the fitz library), and the code I am running is: import fitz input_file = "example.pdf" output_file = "example-with-sign.pdf" barcode_file =…

python pdf compiler-errors pymupdf

asked Sep 07 '22 at 09:46

Gökçe Yılmaz
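
Errors like the one above typically appear when code written for the older camelCase method names runs against a PyMuPDF release that only provides the snake_case names (insert_image instead of insertImage). The usual fix is to rename the call. The sketch below shows a generic fallback for code that must run on both. The helper call_first_available, the stub class and the file name sign.png are assumptions for illustration.

```python
def call_first_available(obj, names, *args, **kwargs):
    """Call the first method in `names` that exists on `obj`."""
    for name in names:
        method = getattr(obj, name, None)
        if callable(method):
            return method(*args, **kwargs)
    raise AttributeError(f"{type(obj).__name__} has none of {names}")


class NewStylePage:
    """Stand-in for a page that only has the snake_case method."""

    def insert_image(self, rect, filename=None):
        return ("inserted", rect, filename)


result = call_first_available(
    NewStylePage(),
    ("insert_image", "insertImage"),  # preferred name first
    (0, 0, 100, 50),
    filename="sign.png",
)
print(result[0])
```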
