Questions tagged [pymupdf]

PyMuPDF is a Python binding for MuPDF – “a lightweight PDF and XPS viewer”.

MuPDF can access files in PDF, XPS, OpenXPS, CBZ (comic book archive), FB2 and EPUB (e-book) formats.

These are files with extensions .pdf, .xps, .oxps, .cbz, .fb2 or .epub (so you can develop e-book viewers in Python).

PyMuPDF provides access to many important functions of MuPDF from within a Python environment.

Note on the Name fitz:

The standard Python import statement for this library is import fitz. The name is historical: Fitz is the name of the rendering engine that powers MuPDF.

257 questions

1 answer

Find and mark words in a PDF EXCEPT some words python

I got this part of code: kwfile = fitz.open(filedialog.askopenfilename()) # the keywords PDF # the following extracts kwfile content as plain text across all pages: text = " ".join([page.get_text() for page in kwfile]) keywords =…

python pdf exception mupdf pymupdf

asked Feb 09 '23 at 08:26

Furk276

1 answer

How to extract only a Rect object in PyMuPDF

I tried the solution from this thread here: Read specific region from PDF Sadly the following example from the thread by user Zach Young doesn't work for me. import os.path import fitz from fitz import Document, Page, Rect # For visualizing the…

python extract text-extraction pymupdf

asked Feb 04 '23 at 15:13

von spotz

1 answer

Python: How to sort a list of Rect objects?

I made a PDF reader that searches for a specific value and makes a list. I use PyMuPDF, which is incredible. So now I have this list and I would like to sort it with the following logic: the first Rect is the top, left-most Rect; each following Rect is…

python sorting pymupdf

asked Feb 03 '23 at 18:10

Dat_guy_who_hangs_out
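
One possible reading of the sorting question above is "top to bottom, then left to right". The sketch below assumes that ordering (the excerpt is truncated, so the asker's full rule is unknown) and works on plain (x0, y0, x1, y1) tuples; PyMuPDF Rect objects can be indexed the same way. The function name and the tolerance parameter are hypothetical.

```python
from typing import List, Sequence

def sort_reading_order(rects: Sequence[Sequence[float]],
                       tolerance: float = 3.0) -> List[Sequence[float]]:
    """Sort rectangles top-to-bottom, then left-to-right.

    Each rect is (x0, y0, x1, y1) with y growing downwards.
    Rects whose top edges differ by at most `tolerance` from the first
    rect of a row are treated as belonging to that row (an assumption).
    """
    # First pass: order by top edge, then by left edge.
    ordered = sorted(rects, key=lambda r: (r[1], r[0]))

    # Second pass: group into rows so small vertical jitter does not
    # break the left-to-right order within a row.
    rows: List[List[Sequence[float]]] = []
    for r in ordered:
        if rows and abs(r[1] - rows[-1][0][1]) <= tolerance:
            rows[-1].append(r)
        else:
            rows.append([r])

    # Flatten, sorting each row by its left edge.
    return [r for row in rows for r in sorted(row, key=lambda r: r[0])]
```

With search results this might be called as sort_reading_order(page.search_for("value")), assuming the hits are on a single page.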

1 answer

Opening PDF within a zip folder fitz.open()

I have a function that opens a zip file, finds a pdf with a given filename, then reads the first page of the pdf to get some specific text. My issue is that after I locate the correct file, I can't open it to read it. I have tried to use a relative…

python python-zipfile pymupdf

asked Feb 02 '23 at 17:35

Ryan
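
A common approach to the zip question above is to read the archive member into memory and hand the bytes to PyMuPDF instead of a file path. The sketch below assumes the member name is already known; both function names are hypothetical, and fitz is imported lazily so the zip-reading part runs without PyMuPDF installed.

```python
import zipfile

def read_pdf_bytes_from_zip(zip_path: str, member_name: str) -> bytes:
    """Return the raw bytes of one member of a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.read(member_name)

def open_pdf_from_zip(zip_path: str, member_name: str):
    """Open a PDF stored inside a zip archive without extracting it to disk."""
    import fitz  # PyMuPDF; imported here so the helper above has no dependency on it

    data = read_pdf_bytes_from_zip(zip_path, member_name)
    # Opening from memory: the filetype hint tells PyMuPDF how to parse the bytes.
    return fitz.open(stream=data, filetype="pdf")
```

The first page's text could then be read with open_pdf_from_zip(path, name)[0].get_text().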

0 answers

Pyside6 shearing PDF file on window resize

I'm using Qt (PySide) to view PDFs (using the PyMuPDF library), but when I resize the window I get a shearing artifact. Here is a minimal example: import sys import fitz from PySide6.QtWidgets import QApplication, QLabel, QMainWindow,…

qt pyside pyside6 pymupdf

asked Feb 02 '23 at 16:40

Matt Harrison

0 answers

Obtaining margin sizes of a pdf using PyMuPDF

Using PyMuPDF, is there any way to get the page margins? I mean the distance from the edge of the page to the nearest horizontal/vertical element, depending on whether it is left/right or top/bottom margin. Looking at the documentation I don't see…

python pdf-generation pymupdf

asked Jan 29 '23 at 00:46

Kikolo
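
One way to approximate the margins described above is to take the bounding box of everything drawn on the page and measure its distance to the page edges. The sketch below is a pure-geometry helper with a hypothetical name; it assumes the boxes have been collected beforehand, for example from the first four values of each entry of page.get_text("blocks") and from the "rect" of each entry of page.get_drawings(), with page.rect as the page rectangle.

```python
from typing import Optional, Sequence, Tuple

def content_margins(page_rect: Sequence[float],
                    boxes: Sequence[Sequence[float]]
                    ) -> Optional[Tuple[float, float, float, float]]:
    """Return (left, top, right, bottom) margins of the content on a page.

    page_rect and every box are (x0, y0, x1, y1).
    Returns None when the page has no content boxes.
    """
    if not boxes:
        return None
    px0, py0, px1, py1 = page_rect[:4]
    # Bounding box of all content elements.
    x0 = min(b[0] for b in boxes)
    y0 = min(b[1] for b in boxes)
    x1 = max(b[2] for b in boxes)
    y1 = max(b[3] for b in boxes)
    return (x0 - px0, y0 - py0, px1 - x1, py1 - y1)
```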

1 answer

Using bezier curves to draw a rectangle with rounded corners in PyMuPDF

I would like to use PyMuPDF to draw a rectangle with rounded corners in a pdf. Apparently, there are no particular methods for rounded rectangles. But I was wondering if Shape.draw_bezier() or Shape.draw_curve() could be used for that purpose,…

python pdf-generation pymupdf

asked Jan 27 '23 at 19:10

Kikolo
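
A quarter circle can be approximated closely by a cubic Bézier curve whose control points sit at a distance of about 0.5523 times the radius from the arc's end points. The sketch below only computes the geometry for a rounded rectangle as a list of line and Bézier segments; the function name and the segment format are hypothetical, and the y axis is assumed to grow downwards.

```python
import math

# Standard constant for approximating a quarter circle with one cubic Bezier.
KAPPA = 4.0 / 3.0 * (math.sqrt(2.0) - 1.0)

def rounded_rect_segments(rect, radius):
    """Return the outline of a rounded rectangle as drawing segments.

    rect is (x0, y0, x1, y1). Each segment is either
    ("line", p1, p2) or ("bezier", p1, p2, p3, p4), listed clockwise
    starting on the top edge. The radius is clamped to half the shorter side.
    """
    x0, y0, x1, y1 = rect
    r = min(radius, (x1 - x0) / 2, (y1 - y0) / 2)
    k = KAPPA * r  # distance from an arc end point to its control point

    # Points where the straight edges meet the corner arcs.
    top_l, top_r = (x0 + r, y0), (x1 - r, y0)
    right_t, right_b = (x1, y0 + r), (x1, y1 - r)
    bot_r, bot_l = (x1 - r, y1), (x0 + r, y1)
    left_b, left_t = (x0, y1 - r), (x0, y0 + r)

    return [
        ("line", top_l, top_r),
        ("bezier", top_r, (x1 - r + k, y0), (x1, y0 + r - k), right_t),
        ("line", right_t, right_b),
        ("bezier", right_b, (x1, y1 - r + k), (x1 - r + k, y1), bot_r),
        ("line", bot_r, bot_l),
        ("bezier", bot_l, (x0 + r - k, y1), (x0, y1 - r + k), left_b),
        ("line", left_b, left_t),
        ("bezier", left_t, (x0, y0 + r - k), (x0 + r - k, y0), top_l),
    ]
```

With PyMuPDF, each "line" segment could be passed to Shape.draw_line() and each "bezier" segment to Shape.draw_bezier() on a shape created with page.new_shape(), followed by finish() and commit().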

1 answer

How can I disentangle seemingly different imported Python modules under the same version number?

I recently updated PyMuPDF/fitz and so updated my code that uses it to update my use of fitz methods to match the updated naming convention (see PyMuPDF > Deprecated Names). Problem: when I call a function I wrote to use fitz's Page.get_text() it…

python debugging pymupdf

asked Jan 27 '23 at 05:30

danielsgriffin

1 answer

Creating and then modifying pdf file in python

I am writing some code that merges some PDFs from their file paths and then writes some text on each page of the merged document. My problem is this: I can do both things separately - merge PDFs and write text to a PDF - I just can't seem to do it…

python pypdf xlwings pymupdf

asked Jan 25 '23 at 22:40

Jon Percival

1 answer

PyMuPDF get optimal font size given a rectangle

I am making an algorithm that performs certain edits to a PDF using the fitz module of PyMuPDF, more precisely inside widgets. The font size 0 has a weird behaviour, not fitting in the widget, so I thought of calculating the distance myself. But…

python pymupdf

asked Jan 20 '23 at 21:32

Clement Genninasca

1 answer

PyMuPDF detects two paragraphs with close text block coordinates as one

I face a problem when I use fitz to detect PDF layout: two paragraphs are detected as one text block if the line spacing between the two blocks is small. For example, I want to detect the text and the isolated formula as two text blocks, but for now fitz…

textblock pymupdf

asked Jan 19 '23 at 07:27

CAO RUI

2 answers

Why does extracting file data in PyMuPDF give me empty lists?

I am new to programming (just do it for fun sometimes) and I am having trouble using PyMuPDF. In VS Code, it returns no errors but the output is always just an empty list. Here is the code: > import fitz file_path =…

python pymupdf

asked Jan 16 '23 at 19:38

john da bon

1 answer

python fitz page.add_highlight_annot(start=pointa, stop=pointb) not working

I'm trying to highlight text in a PDF from a start word "pointa" to an end word "pointb", but it won't work: it marks all the text on the page. Maybe someone could help me (please) and figure out what I'm doing wrong. import fitz import…

python pymupdf

asked Jan 09 '23 at 13:40

kalimero00

2 answers

Is there an efficient way to execute a program on files with similar names using Python in the terminal?

I'm trying to process PDFs using PyMuPDF and I'm running this python file called process_pdf.py in the terminal. > import sys, fitz > fname = sys.argv[1] # get document filename > doc = fitz.open(fname) # open document > out = open(fname + ".txt",…

python linux terminal pymupdf

asked Jan 08 '23 at 01:32

Bryant Tan

1 answer

In PyMuPDF what does the string of letters at the start of a Font name represent?

As can be seen in the PyMuPDF documentation for get_page_fonts, the returned fonts have names like FNUUTH+Calibri-Bold or DOKBTG+Calibri. What do the string prefixes (FNUUTH+, DOKBTG+) represent?

fonts pymupdf

asked Jan 03 '23 at 18:56

Tolure
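
In the PDF format, a font name that starts with six uppercase letters and a plus sign marks an embedded font subset; the letters are an arbitrary tag that keeps different subsets of the same font apart. The sketch below separates such a tag from the base font name. The helper name is hypothetical and it works on plain strings, so it does not need PyMuPDF.

```python
import re
from typing import Optional, Tuple

# Six uppercase letters followed by "+" and then the base font name.
SUBSET_TAG = re.compile(r"^([A-Z]{6})\+(.+)$")

def split_subset_tag(font_name: str) -> Tuple[Optional[str], str]:
    """Split a font name into (subset_tag, base_name).

    Returns (None, font_name) when the name carries no subset tag.
    """
    match = SUBSET_TAG.match(font_name)
    if match:
        return match.group(1), match.group(2)
    return None, font_name
```

For the names quoted in the question, split_subset_tag("FNUUTH+Calibri-Bold") separates the tag FNUUTH from the base name Calibri-Bold.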
