Questions tagged [pymupdf]

PyMuPDF is a Python binding for MuPDF – “a lightweight PDF and XPS viewer”. MuPDF can access files in PDF, XPS, OpenXPS, CBZ (comic book archive), FB2 and EPUB (e-book) formats. NOTE: It is imported in Python as fitz.

These are files with extensions .pdf, .xps, .oxps, .cbz, .fb2 or .epub (so you can develop e-book viewers in Python).

PyMuPDF provides access to many important functions of MuPDF from within a Python environment.

Note on the Name fitz:

The standard Python import statement for this library is import fitz. The name has historical reasons.

257 questions

3 answers

How do I merge items from a list avoiding repeated content inside the items?

Edit 4: Simpler example of what I want to do: I have a list like this: sentences = ['Hello, how are','how are you','you doing?'] And I want to turn it into a string like this: sentence = 'Hello, how are you doing?' Any help is appreciated! Original…

python python-docx pymupdf

asked Aug 14 '21 at 13:28

JupiterJones
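
For the overlap-merging question above, here is a minimal sketch of one possible approach, not taken from the thread's answers. It assumes each item may begin with text that the merged string so far already ends with, and it trims the longest such overlap before appending. The function name merge_overlapping is hypothetical.

```python
def merge_overlapping(parts):
    """Join strings, dropping text that repeats across neighbouring items."""
    merged = ""
    for part in parts:
        overlap = 0
        # Try the longest possible overlap first, then shrink.
        for size in range(min(len(merged), len(part)), 0, -1):
            if merged.endswith(part[:size]):
                overlap = size
                break
        merged += part[overlap:]
    return merged


sentences = ['Hello, how are', 'how are you', 'you doing?']
sentence = merge_overlapping(sentences)
print(sentence)  # Hello, how are you doing?
```

Items that share no overlap are simply concatenated without a separator, so real extracted text may need extra whitespace handling.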

1 answer

PyMuPDF ModuleNotFoundError

I successfully ran the command: pip install pymupdf Successfully installed pymupdf-1.18.15 However, both import fitz and import pymupdf raise a ModuleNotFoundError. Why is Python giving me a ModuleNotFoundError?

python pymupdf

asked Aug 04 '21 at 00:05

Doing and Learning
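
One common cause of this symptom (an assumption here, not confirmed as the asker's actual problem) is that pip installed the package into a different interpreter than the one running the script. A small standard-library sketch to check which interpreter is running and whether it can locate the module; the helper name find_modules is hypothetical.

```python
import importlib.util
import sys


def find_modules(names=("fitz",)):
    """Report which module names the running interpreter can locate."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    for name, found in find_modules().items():
        print(f"{name}: {'found' if found else 'not found'}")
```

If fitz is reported as not found, installing with that same interpreter (python -m pip install pymupdf) targets the environment that actually runs the script.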

0 answers

How to use lxml to parse xml extract of pymupdf?

So I read each page of a pdf and appended every xml extract to a string variable, using Page.get_text("xml"). The text output consisted of many units of \n

python xml xml-parsing lxml pymupdf

asked Aug 03 '21 at 19:08

Vishak Arudhra

1 answer

PyMuPDF - Scale a Quad from center in all directions

I'm searching for text in a pdf and extracting a quad and adding a polygon_annot around it. But I would like to scale the polygon_annot. How can I do that? Below is my code: for inst in text_instances: inst = inst.transform(fitz.Matrix(2, 2)) …

python mupdf pymupdf

asked Jul 28 '21 at 05:51

Gangula
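
For the quad-scaling question above: a pure scale matrix such as fitz.Matrix(2, 2) scales about the page origin, so the quad also moves. A rough sketch of scaling about the quad's own center follows. It uses plain (x, y) tuples instead of fitz objects so that it runs without PyMuPDF, and the function name is hypothetical. Mapping the result back to a fitz.Quad is left out.

```python
def scale_quad_about_center(quad, factor):
    """Scale four corner points about their centroid by the given factor."""
    cx = sum(x for x, _ in quad) / len(quad)
    cy = sum(y for _, y in quad) / len(quad)
    # Move each corner away from (or toward) the center by the factor.
    return [(cx + (x - cx) * factor, cy + (y - cy) * factor) for x, y in quad]


# Corners in the order upper-left, upper-right, lower-left, lower-right.
quad = [(10.0, 10.0), (30.0, 10.0), (10.0, 20.0), (30.0, 20.0)]
scaled = scale_quad_about_center(quad, 2.0)
print(scaled)  # [(0.0, 5.0), (40.0, 5.0), (0.0, 25.0), (40.0, 25.0)]
```

The center stays at (20, 15) while the width and height double, which is the "in all directions" behaviour the question asks about.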

2 answers

Why does PyMuPDF Document show the error, no attribute 'new_page', when it is a PDF?

I'm working on annotating a PDF and I want to change its color. I was guided to this helpful link: https://pymupdf.readthedocs.io/en/latest/faq.html#how-to-add-and-modify-annotations I used the code in the link: # -*- coding: utf-8…

python annotations pymupdf

asked Jun 30 '21 at 15:12

Katie Melosto

1 answer

Comparing keywords with PDF files

Here is the program that reads the files by folder name and extracts data. Now I want to compare the data with the keywords that I used in the program below. But it gives me: pdfReader = pdfFileObj.loadPage(0) AttributeError:…

python pdf pymupdf python-pdfreader

asked Jun 24 '21 at 07:42

Abrar Hussain

1 answer

How to read pdf files with pymupdf in PyQt5?

I want to open a PDF file through the pilihfile push button, then display its name in textEdit and its contents in textEdit_2 using pymupdf. But I got an error saying: cannot open ('D:/Kuliah/KRIP.pdf', 'PDF Files (*.pdf)'): Invalid…

python pyqt pyqt5 pymupdf

asked Jun 22 '21 at 09:39

Henry

1 answer

Image replacement using PyMuPDF

I'm using PyMuPDF to replace images. But when I have a dictionary of images mapped to their bbox coordinates, only the image on the first page gets replaced. How can I get all the images in the dictionary to be replaced? Here's my code: 'bbval' is…

python image image-processing pymupdf

asked Jun 21 '21 at 08:20

vbadwaj

0 answers

pyqt multithreading: why the worker thread blocks the main thread

When I try to load a .pdf that is larger than 10 MB or has more than 300 pages, the worker thread blocks the main thread. I don't know how to use QThread correctly. I want a signal to be emitted to the main thread each time pixmap_page_load runs. Here is…

python pyqt pymupdf

asked Jun 15 '21 at 12:54

nevermind_15

1 answer

How can I avoid extracting small image elements from PDF file in python?

I am trying to extract all the images from this PDF file:…

python extract pymupdf

asked Jun 10 '21 at 05:29

Suraj Kadam

0 answers

Extract SVG and text from PDF in Python

I need to get the text and SVGs embedded in a PDF using Python. I tried PyPDF2, PyPDF4 and tika, but they did not work. I tried using pymupdf but I am getting the error below. Can someone help me with it? I am using Python 3.8 and PyCharm. All modules required for pymupdf are…

python pdf pymupdf

asked Jun 03 '21 at 14:22

Rohit T

0 answers

My Python exe file is not working on a shared disk but works in Jupyter Notebook

I wrote a Python script that reads the PDF files in the current folder (on a shared disk), looks for a specific number, and then searches for that number in another folder (on the same shared disk). If there is a match, I merge both files into a new file with PyMuPDF. After that,…

python pyinstaller pymupdf

asked Apr 29 '21 at 22:42

Facundo Arroyo

1 answer

Slicing with pymupdf

I'd like to mark several keywords in a pdf document using Python and pymupdf. The code looks as follows (source: original code): import fitz doc = fitz.open("test.pdf") page = doc[0] text = "result" text_instances = page.searchFor(text) for…

python pdf pymupdf

asked Apr 15 '21 at 19:10

danik

1 answer

How do I delete line breaks in PDF text extraction in Python?

I used PyMuPDF to get the text in the PDF, here is my code import fitz pdf_document = "KRIP.pdf" doc = fitz.open(pdf_document) page1 = doc.loadPage(0) page1text = page1.get_text() print("Text from PDF: ", page1text) the output should…

python pymupdf

asked Mar 23 '21 at 08:29

brianK

0 answers

Replace text in a pdf file in Python using Fitz

Has anyone tried to replace text in a PDF file using Fitz from the PyMuPDF library? I have tried the code below and I am not sure if I am close to the result or if it is impossible with this library: import fitz file_name =…

python pdf pymupdf

asked Mar 16 '21 at 19:37

REDA DRISSI
