Python Text annotation with PyMuPDF

Question

I'm using PyMuPDF to annotate some text in a .pdf document with the following code:

import fitz
import re


def data_(text):
    annotation_text = r"(amet)"
    for line in text:
        # Search once per line and reuse the match object
        search = re.search(annotation_text, line, re.IGNORECASE)
        if search:
            yield search.group(1)


def includeannotation(path_included):
    document = fitz.open(path_included)

    for page in document:
        page.wrap_contents()
        obs = data_(page.getText("text").split("\n"))
        # print (obs)
        for data in obs:
            catchs = page.searchFor(data)
            # Add one redaction annotation per matched rectangle
            for catch in catchs:
                page.addRedactAnnot(catch, fontsize=11, fill=(0, 0, 0))
        page.apply_redactions()
    document.save("annotation.pdf")
    print("end - created")


path_included = "/content/document.pdf"

save_document = includeannotation(path_included)

The source .pdf document contains the text:

By applying the code above, I can add the annotation for the text "amet" and obtain the following result:

The result seems to be in line with expectations, but you can see that the library, besides adding the black annotation over "amet", has also deleted the word on the line below it, without drawing a black annotation there. It looks like a restyling problem.

How can I avoid this problem?
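One possible explanation, offered as a guess rather than a confirmed diagnosis: when redactions are applied, text whose bounding box overlaps a redaction rectangle gets removed, and the rectangles returned by the search can be tall enough to touch the neighbouring line. A minimal sketch of a workaround is to shrink each rectangle vertically before adding the redaction. The helper names `shrink_rect` and `redact_word` and the `ratio` value are hypothetical, and the snake_case method names assume a PyMuPDF version that provides them (the code in the question uses the older camelCase spellings).

```python
def shrink_rect(rect, ratio=0.2):
    """Return (x0, y0, x1, y1) with top and bottom pulled in by ratio * height.

    `rect` is any 4-item sequence of coordinates, for example a search hit.
    The default ratio is an assumption; tune it for the fonts in your PDF.
    """
    x0, y0, x1, y1 = rect
    dy = (y1 - y0) * ratio
    return (x0, y0 + dy, x1, y1 - dy)


def redact_word(src_path, dst_path, word, ratio=0.2):
    """Redact every occurrence of `word`, using vertically shrunk rectangles."""
    import fitz  # PyMuPDF; imported here so shrink_rect is usable without it

    document = fitz.open(src_path)
    for page in document:
        for hit in page.search_for(word):
            # Shrinking keeps the redaction area away from adjacent lines
            smaller = fitz.Rect(*shrink_rect(hit, ratio))
            page.add_redact_annot(smaller, fill=(0, 0, 0))
        page.apply_redactions()
    document.save(dst_path)
```

It could be called as `redact_word("/content/document.pdf", "annotation.pdf", "amet")`. If the neighbouring word still disappears, a larger `ratio` may help, at the cost of a thinner black bar.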
