Questions tagged [pymupdf]

PyMuPDF is a Python binding for MuPDF – “a lightweight PDF and XPS viewer”. MuPDF can access files in PDF, XPS, OpenXPS, CBZ (comic book archive), FB2 and EPUB (e-book) formats. NOTE: It is imported in Python as fitz.

The supported formats correspond to files with the extensions .pdf, .xps, .oxps, .cbz, .fb2 or .epub, so you can develop e-book viewers in Python.

PyMuPDF provides access to many important functions of MuPDF from within a Python environment.

Note on the Name fitz:

The standard Python import statement for this library is import fitz. This has a historical reason.
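
As a rough illustration of the import name and the supported extensions described above, here is a minimal sketch. The helper names is_supported and count_pages are hypothetical and not part of PyMuPDF; the sketch assumes PyMuPDF is installed and relies only on fitz.open, len() of the document, and close().

```python
from pathlib import Path

# Extensions taken from the tag description above.
SUPPORTED_EXTENSIONS = {".pdf", ".xps", ".oxps", ".cbz", ".fb2", ".epub"}


def is_supported(path):
    """Return True if the file extension is one of the listed formats.

    Hypothetical helper: it only inspects the file name, not the content.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def count_pages(path):
    """Open a document with PyMuPDF and return its number of pages.

    Hypothetical helper built on fitz.open().
    """
    # The package is installed as PyMuPDF but imported as fitz.
    # Imported lazily so the extension check above works without it.
    import fitz

    doc = fitz.open(path)
    try:
        return len(doc)  # number of pages in the document
    finally:
        doc.close()
```

Calling count_pages("example.pdf") would return the page count of that file, where example.pdf is a placeholder for a document on disk.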

257 questions
0
votes
0 answers

Strange requests coming to my Flask app when idling at night

I left my Flask/Flask-Login application running on my PC overnight. In the morning, the logs suddenly burst with the following: Of course, there are no such URLs on my server. The previous day, I was attempting to install some packages: PyMuPDF, fitz…
TEH EMPRAH
  • 1,828
  • 16
  • 32
0
votes
1 answer

PyMuPDF - missing addPage(page) method

I used PyPDF2 before and wrote this class: class pdfWriter: fh = None pdf_obj = None def __init__(self, path): if(not path.endswith('.pdf')): path += ".pdf" self.fh = open(path, 'wb') self.pdf_obj = PdfFileWriter() def…
simone viozzi
  • 440
  • 5
  • 18
0
votes
2 answers

PyMuPDF | inserted image is in the wrong place on a PDF page

I need to insert an image into some pages of a PDF, and I use insertImage. Following the example, I provide fitz.Rect(0, 0, 50, 50) because I want to place the image in the top-left corner of the page. This works perfectly for all PDFs but one - a scanned…
Mihail Panayotov
  • 320
  • 2
  • 11
0
votes
0 answers

Need help saving in PyMuPDF

This is a basic script that should insert a watermark image on the first page of a PDF and save it under a new name. I could do the same with the same files in pdfrw, but I'm stuck with PyMuPDF (which I would prefer to use...). The py file is in…
Krisztian
  • 21
  • 2
  • 4
0
votes
1 answer

Unable to install PyMuPDF on Mac 10.14.5

After running pip install pymupdf in my conda environment, I get an error when trying to import fitz: ModuleNotFoundError: No module named 'fitz'. In my terminal I ran pip list | grep PyMuPDF to verify the installation, and it returns PyMuPDF 1.14.17,…
Steve
  • 135
  • 1
  • 10
0
votes
2 answers

Anti aliasing rendered PDFs using wxPython + pymupdf

I'm new to wxPython and pymupdf, and have had a look at the samples for wxPython + pymupdf. They work, however the quality of the pdf page (rendered) is pretty poor. I'm certain this can be improved. Basically I'm looking for an anti-aliasing…
Don
  • 6,632
  • 3
  • 26
  • 34
0
votes
0 answers

Find a word with an apostrophe in PDF using pymupdf

I am using PyMuPDF from the fitz package to search and highlight words in a PDF. How do I find a word with an apostrophe in it? In my example code, text_instances will be empty. If you search for 'her' or "'", then text_instances will not be empty.…
bb_
  • 1
  • 1
0
votes
0 answers

Unable to install PyMuPDF with MuPDF - python on Mac 10.13.4

Morning. To start out, let me just say I'm a Python novice, so I hope this question isn't going to be stupid. I'm running Mac 10.13.4 (Beta) and am trying to get PyMuPDF working. As per https://github.com/rk700/PyMuPDF/: I've downloaded both…
-1
votes
1 answer

Python PDF certificate generation

I am creating a PDF certificate using fitz in Python. It contains a paragraph, and in the middle of the paragraph I have a name, an age and other details that I need to make bold. How? My code: import fitz def add_paragraph_to_pdf(input_pdf_path, output_pdf_path,…
-1
votes
1 answer

PyMuPDF (Fitz) QuadPoints for re-use in the Adobe Embed API

I am trying to extract annotations from a PDF and then use that data to 'Cherry Pick' the annotations we require to display them in a clean version of the PDF using the Adobe Embed API. We are getting data fine from the PDF using PyMuPDF however…
Justin Erswell
  • 688
  • 7
  • 42
  • 87
-1
votes
1 answer

I am trying to use fitz to extract data from a PDF that contains text in a very unstructured format, but it returns None at the first step

Here's the code I have been trying with the output: import fitz import pandas as pd doc = fitz.open('xyz.pdf') page1 = doc[0] words = page1.get_text("words") first_annots=[] rec=page1.first_annot.rect rec Output: the output I am expecting is all…
Dev_T
  • 1
  • 2
-1
votes
1 answer

PyMuPDF "draw_rect" function works inconsistently

I'm blacking out some information in several PDFs, but in some of them the rectangles made by the "draw_rect" function aren't being drawn correctly. I have checked the rectangles and they look right, and I'm also using the…
-1
votes
1 answer

How to flip a pdf page upside down using python?

I'm trying to flip PDF pages upside down using Python. I have tried multiple libraries like PyPDF2, PyMuPDF and pdfminer. There is documentation on how to rotate a page, but that is not what I'm looking for. The closest solution I found was on one…
Ajay Alex
  • 21
  • 3
-1
votes
1 answer

What is wrong with this PDF when trying to get a word count

I am trying to write a python app to give me a word count for PDFs. I've run into something odd with this PDF though. When I extract the text from the PDF, it shows up as some sort of binary/symbol garbage. I have tried PyPDF2 and PyMuPDF libs with…
tynick
  • 33
  • 6
-2
votes
1 answer

'str' object has no attribute 'getNumPages'

I am writing a little program that allows the user to open a PDF file, then the program adds image 1 to pages that contain text 1, image 2 to pages that contain text 2, and saves the PDF file. But I kept getting this error "'str' object has no…
Zac
  • 13
  • 1
  • 5