
I have a PDF document and, for simplicity, I want to make two (or more) differently edited versions of the same PDF.

For example, in one version I want every "and" in the PDF to be highlighted, and in the second I want every "the" to be highlighted.

I tried doing it like this using PyMuPDF:

import fitz  # PyMuPDF

pdf = "mypdf.pdf"
doc = fitz.open(pdf)

# First pass: highlight every "and" on every page, then save
text = ["and"]
for page in doc:
    for j in text:
        for inst in page.searchFor(j):
            highlight = page.addHighlightAnnot(inst)
doc.save("output_and.pdf", garbage=4, deflate=True, clean=True)

# Second pass on the same doc object: highlight every "the", then save
text = ["the"]
for page in doc:
    for j in text:
        for inst in page.searchFor(j):
            highlight = page.addHighlightAnnot(inst)
doc.save("output_the.pdf", garbage=4, deflate=True, clean=True)

Here the first file (output_and.pdf) has the expected contents, but the second file (output_the.pdf) has both "and" and "the" highlighted. Is there a way to unhighlight "and" before saving again, or to save the file in such a way that it does not affect the next time I save the PDF?
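One direction that might work, shown only as a sketch: since the highlights seem to stay on the open document object, each version could start from a freshly opened copy of the source file. The helper names `output_path` and `save_highlighted_copy` are made up for illustration, and the PyMuPDF calls are the same ones used in the attempt above (newer PyMuPDF releases spell them `search_for` and `add_highlight_annot`).

```python
def output_path(word, prefix="output"):
    """Build the output file name for one word (hypothetical helper)."""
    return f"{prefix}_{word}.pdf"


def save_highlighted_copy(src, word, dst):
    """Highlight every occurrence of `word` in a fresh copy of `src`, save to `dst`."""
    import fitz  # PyMuPDF, imported here so the helper above works without it

    # Reopening the source means no annotations carry over from earlier versions.
    doc = fitz.open(src)
    try:
        for page in doc:
            # Same camelCase method names as in the question; newer PyMuPDF
            # releases name them search_for / add_highlight_annot.
            for rect in page.searchFor(word):
                page.addHighlightAnnot(rect)
        doc.save(dst, garbage=4, deflate=True, clean=True)
    finally:
        doc.close()
```

It would be called once per word, for example `save_highlighted_copy("mypdf.pdf", "and", output_path("and"))` and then the same with "the". Reopening the file costs a little extra time per version, but it avoids having to track and delete the earlier highlights.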

yoyo

0 Answers