PyMuPDF: how do I remove annotations?

Question

I am using PyMuPDF to loop through a list of strings, highlighting each one and taking an image of it before moving on to the next string.

The code below does what I need, but the annotations remain after each iteration, and I would like to remove each one once its image has been taken.
In the example image below, the word "command" is highlighted, but the previous strings "Images" and "filename" are still highlighted as well. Since hundreds of these images will be compiled into a report, I would like only the current string to stand out.

Is there something like page.remove(highlight)?

for pi in range(doc.pageCount):
    page = doc[pi]
    for tag in text_list:

        text = tag
        text_instances = page.searchFor(text)

        five_percent_height = (page.rect.br.y - page.rect.tl.y)*0.05
        five_percent_width = (page.rect.br.x - page.rect.tl.x)*0.05

        for inst in text_instances:
            inst_counter += 1
            highlight = page.addSquigglyAnnot(inst)            

            tl_pt = fitz.Point(max(page.rect.tl.x, inst.tl.x - five_percent_width), max(page.rect.tl.y, inst.tl.y - five_percent_height))
            br_pt = fitz.Point(min(page.rect.br.x, inst.br.x + five_percent_width), min(page.rect.br.y, inst.br.y + five_percent_height))

            hl_clip = fitz.Rect(tl_pt, br_pt)

            zoom_mat = fitz.Matrix(4, 4)
            pix = page.getPixmap(matrix=zoom_mat, clip = hl_clip)
            # I want to remove the annotation here

score 2 · Answer 1 · edited Apr 13 '22 at 01:18

Do this:

annot = page.first_annot  # page.firstAnnot in older PyMuPDF releases
while annot:
    annot = page.delete_annot(annot)

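delete_annot returns the annotation that follows the deleted one (or None if there is none), so the loop ends once every annotation on the page has been removed.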

edited Apr 13 '22 at 01:18

pppery

answered Jun 11 '20 at 13:46

Jorj McKie

score 1 · Answer 2 · answered Sep 30 '21 at 07:38

Jorj's approach is good. However, from the documentation there are other options:

https://pymupdf.readthedocs.io/en/latest/faq.html#how-to-read-and-update-pdf-objects

This method can also be used to remove a key from the xref dictionary by setting its value to null. The following will remove the rotation specification from the page: doc.xref_set_key(page.xref, "Rotate", "null"). Similarly, to remove all links, annotations and fields from a page, use doc.xref_set_key(page.xref, "Annots", "null"). Because Annots by definition is an array, setting an empty array with the statement doc.xref_set_key(page.xref, "Annots", "[]") would do the same job in this case.

Thanks for sharing the documentation. Internally, there's a high chance that the deleteAnnot method (now deprecated and replaced by delete_annot) uses the approach you shared from the docs. If not, is the option shared here better? - qwertynik, Apr 12 '22 at 05:40

score 0 · Accepted Answer · edited May 25 '20 at 11:05

I found an acceptable solution was to just set the opacity to 0% after taking the screenshot.

pix = page.getPixmap(matrix=zoom_mat, clip = hl_clip)
highlight.setOpacity(0)
highlight.update()

edited May 25 '20 at 11:05

Wai Ha Lee

answered May 25 '20 at 02:54

ajcnzd
