Questions tagged [pymupdf]

PyMuPDF is a Python binding for MuPDF – “a lightweight PDF and XPS viewer”. MuPDF can access files in PDF, XPS, OpenXPS, CBZ (comic book archive), FB2 and EPUB (e-book) formats. NOTE: It is imported in Python as fitz.

These are files with extensions .pdf, .xps, .oxps, .cbz, .fb2 or .epub (so you can develop e-book viewers in Python).
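As a rough illustration of the extension list above, the following sketch checks whether a file name carries one of those extensions before you try to open it. The names SUPPORTED_EXTENSIONS and has_supported_extension are hypothetical helpers written for this example; they are not part of PyMuPDF, and the check looks only at the file name, not at the file contents.

```python
from pathlib import Path

# Extensions listed in the tag description above (hypothetical constant,
# not something PyMuPDF exports).
SUPPORTED_EXTENSIONS = {".pdf", ".xps", ".oxps", ".cbz", ".fb2", ".epub"}


def has_supported_extension(path):
    """Return True if the file name ends in one of the listed extensions."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("report.PDF", "comic.cbz", "notes.txt"):
        print(name, has_supported_extension(name))
```

A check like this can be used to filter a folder listing before handing each file to the library.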

PyMuPDF provides access to many important functions of MuPDF from within a Python environment.

Note on the Name fitz:

The standard Python import statement for this library is import fitz. This has a historical reason.
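A minimal sketch of what this means in practice: the package is installed under the name PyMuPDF, but the module you import is fitz, so `import PyMuPDF` fails with ModuleNotFoundError. The helper load_pymupdf below is a hypothetical name for this example. It also tries the module name pymupdf, on the assumption that your installed release is recent enough to provide it; older releases only provide fitz.

```python
import importlib


def load_pymupdf():
    """Return the PyMuPDF module, or None if it cannot be imported.

    Tries "pymupdf" first (assumed to exist only in newer releases),
    then falls back to the historical module name "fitz".
    """
    for name in ("pymupdf", "fitz"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


fitz = load_pymupdf()
if fitz is None:
    print("PyMuPDF is not installed; try: pip install PyMuPDF")
else:
    print("PyMuPDF loaded from module:", fitz.__name__)
```

If the module loads, the rest of a script can keep using the name fitz regardless of which import succeeded.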

257 questions

0 answers

Python PDF parsing script fails: mupdf: malloc of 51301 bytes failed

I'm attempting to parse data from around 53k pdfs stored on disk. The script I have iterates through a dataframe of filenames of pdfs and has a function which returns bounding boxes for each pdf and for each bbox parses the text data within that…

python pandas parsing pdf pymupdf

asked May 25 '22 at 09:57

furbaw

2 answers

Can't install PyMuPDF although the Python library already has PyMuPDF

I tried to install PyMuPDF on Python 3.9. First I installed it with pip install PyMuPDF and checked with pip list. But when I imported PyMuPDF, I got: ModuleNotFoundError: No module named 'PyMuPDF'. Next, I tried to install PyMuPDF from the docs, it…

python-3.x pyinstaller pymupdf

asked Mar 26 '22 at 14:49

Duchero Nguyên

0 answers

Add library/module to server

I am pretty new to python and would like to use the PyMuPDF library on a web server in order to modify PDFs. The problem is, I am unable to add/install any modules or libraries to/on the server. Is there a way to install all libraries and modules in…

python server module pip pymupdf

asked Mar 09 '22 at 16:14

jonsken

0 answers

Number of Entries in Xref Table

Is there any Java library with which I can get the number of entries in the xref table of a PDF? PyMuPDF has Document.xref_length() for this, but I want it in Java.

pdf pdfbox pymupdf

asked Feb 08 '22 at 07:36

maester

0 answers

Zoom and crop a pdf document using PyMuPDF fitz and saving as pdf

I am trying to crop a PDF within a lambda and save the file. Ideally I just want to zoom in, as otherwise the OCR package does not recognize some of the fonts. The rectangle I am using just seems to shift the margins versus actually cropping or…

python pdf lambda pymupdf

asked Feb 05 '22 at 14:04

megv

1 answer

AttributeError: 'Document' object has no attribute 'searchFor'

I want to write a simple program that asks the user to open a PDF file from any location, add image A to any page that contains the keywords "Orange County", and add image B to any page that contains the keywords "Hillsborough county", then save the…

python file pdf syntax pymupdf

asked Jan 29 '22 at 19:03

Zac

2 answers

I'm trying to read PDFs one by one and then convert them into a dataframe

I've used 'fitz' from the PyMuPDF module to extract data and then pandas to convert the extracted data to a dataframe. #Code to read multiple pdfs from the folder: from pathlib import Path # returns all file paths that has .pdf as extension in the…

python dataframe pdf pathlib pymupdf

asked Jan 25 '22 at 13:49

User1011

0 answers

PyMuPDF (fitz) not properly closing files, resulting in PermissionError [WinError 32]

I can't figure out why I'm getting a PermissionError when trying to clean up some temporary PDF files that are no longer needed. My script downloads a bunch of single-page PDFs into a /temp folder, then uses PyMuPDF to merge them into a single PDF.…

python pymupdf

asked Jan 04 '22 at 02:55

theflyingbelgian

1 answer

Why does pymupdf have an origin that is not in the top left corner?

I don't seem to be able to figure out why the pymupdf tools for placing objects on PDF documents have the origin set at a seemingly random location. Notice that (0,0,100,100), which is x0 y1 x2 y2 (where y starts from top) starts from the middle of the…

python drawing pymupdf

asked Dec 21 '21 at 12:59

negfrequency

1 answer

How to add a background image to a PDF using the PyMuPDF module in Python

I am trying to add a background image to a PDF using PyMuPDF, but it is creating a layer between the PDF and the image, as you can see in the output. How can I bypass (remove) the layer between the PDF and the background image? Please help me regarding this. This is how I…

python pdf pymupdf

asked Nov 18 '21 at 08:44

Prabhat

0 answers

Extracting html structure from PDF

I have a test PDF file with just a 3x3 table that is marked up properly with table headings and the like. What I want to do is extract the format of the table. Like so: left center right One Two Three If that table was in the pdf, I want…

python pdf pymupdf

asked Nov 03 '21 at 17:24

Mat

0 answers

draw_rect method of Pymupdf is not working on certain pages of pdf

I'm using draw_rect method of Pymupdf. It's not working on certain pages of the pdf. Following is the code for drawing rectangles. I tried the same rect values to plot on other pages and it plotted correctly. doc = fitz.open(filepath ) x0,y0,x2,y2 =…

python-3.x pdf pymupdf

asked Oct 20 '21 at 14:07

Nayana Madhu

1 answer

Python: mupdf: invalid key in dict

I am writing the code below to remove annotations from a PDF file and then save it to a new PDF. However, I am getting RuntimeError: invalid key in dict. Below is the Code: import fitz import re doc = fitz.open("test.pdf") for i in range(doc.pageCount): …

python pdf pymupdf

asked Oct 11 '21 at 08:48

Sundaram

1 answer

How can I transfer annotations between PDFs (e.g. using pymupdf)

I have been looking through the pymupdf documentation, and while there is a lot there and I can see how to identify annotations (Annot class), I can't work out how to copy an annotation that I have found in one document into another.…

pymupdf pdf-annotations

asked Sep 30 '21 at 05:43

Diomedea

0 answers

How to attach images using pymupdf

I have a PDF where 2 pages have a total of 6 attachment boxes: clicking one lets you choose an image file, which is then inserted into the PDF. I want to do this using Python. I have tried pymupdf and after checking it…

python pypdf pdfminer pymupdf

asked Sep 02 '21 at 20:06

Mr Anonymous
