I'm doing some basic keyword highlighting, but I'm running into a strange issue. When I set a stroke color with floating point RGB values (as shown below), the highlights come out in several different colors. In this case, I want the highlights to be orange, but some words are orange while others are more red. Any idea what's going on here? If I change the RGB values to integers, the highlight colors are all the same. Please let me know if there's any additional code you need to see.

annot = page.add_highlight_annot(word)
annot.set_colors(stroke=(1, 0.5, 0))
annot.update()

With floating point - (1, 0.5, 0)

With integers - (1, 1, 0)

almosthavoc
  • PDF colors are always **_floats_** in range 0 <= f <= 1. Otherwise this indeed looks strange. I would need an example file to reproduce the behavior. – Jorj McKie Jun 22 '23 at 07:47
  • @JorjMcKie I am using the J&J 2022 10-K. The specific snippet I took was from the first paragraph on page 4. https://johnsonandjohnson.gcs-web.com/static-files/9b012500-471a-4df9-93fc-6cee2b420678 – almosthavoc Jun 22 '23 at 17:47
  • Think I found the problems: (1) your choice of color and (2) your program logic. You must sometimes be highlighting the same text multiple times. With this color choice, each additional highlight darkens the background wherever highlights overlap (blend mode "Multiply"!). You need additional logic that guarantees every rectangle to be highlighted is only highlighted once. See the image below, where neighbouring single-word highlights overlap! – Jorj McKie Jun 23 '23 at 18:30
  • @JorjMcKie Thank you so much! That was exactly it. I was regex matching the words and then doing a page.search_for() on the word and highlighting those words. Since my logic didn't account for the fact that there could be multiple occurrences of the same words returned by the match, it was going through the search and highlighting words that had already been highlighted. Awesome call out on the blend multiplication as well, makes total sense why I was only noticing my bug with floats. – almosthavoc Jun 23 '23 at 21:29
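
The fix described in the comments above (highlight every rectangle only once) could look roughly like the sketch below. It assumes `page` is a PyMuPDF page and `words` is the list of regex matches, which may contain repeats. The helper name `highlight_once` and the rounding of coordinates to two decimals are illustrative assumptions, not part of the original code.

```python
def highlight_once(page, words, stroke=(1, 0.5, 0)):
    """Highlight every search hit on `page` at most once.

    `highlight_once` is a hypothetical helper; `page` is assumed to be
    a PyMuPDF page and `words` the (possibly repeated) regex matches.
    Returns the number of highlight annotations added.
    """
    seen = set()  # rectangles that already carry a highlight
    added = 0
    # dict.fromkeys drops repeated words while keeping their order
    for word in dict.fromkeys(words):
        for rect in page.search_for(word):
            # Rounded coordinates serve as a hashable identity for the hit
            # (the two-decimal precision is an assumption).
            key = tuple(round(c, 2) for c in rect)
            if key in seen:
                continue  # already highlighted, skip to avoid darkening
            seen.add(key)
            annot = page.add_highlight_annot(rect)
            annot.set_colors(stroke=stroke)
            annot.update()
            added += 1
    return added
```

This only skips exact repeats of the same rectangle. It does not handle partially overlapping rectangles, such as the small overlaps between neighbouring words.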

1 Answer

Here is an image where all words are highlighted. Neighbouring words show a small color overlap that is significantly darker than the rest.
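
To see why overlaps turn darker with orange but not with yellow, here is a small arithmetic sketch of the "Multiply" blend mode mentioned in the comments. It is a simplified model that assumes a white page and a plain per-channel product, ignoring opacity and viewer-specific rendering. The function name `multiply` is illustrative only.

```python
def multiply(backdrop, source):
    """Per-channel 'Multiply' blend of two RGB colors in the range 0..1."""
    return tuple(b * s for b, s in zip(backdrop, source))

white = (1.0, 1.0, 1.0)   # assumed page background
orange = (1.0, 0.5, 0.0)
yellow = (1.0, 1.0, 0.0)

# One orange highlight on white gives plain orange.
once_orange = multiply(white, orange)          # (1.0, 0.5, 0.0)
# A second orange highlight halves the green channel again: more red.
twice_orange = multiply(once_orange, orange)   # (1.0, 0.25, 0.0)

# Yellow uses only 0 and 1, so stacking it changes nothing.
once_yellow = multiply(white, yellow)          # (1.0, 1.0, 0.0)
twice_yellow = multiply(once_yellow, yellow)   # (1.0, 1.0, 0.0)

print(once_orange, twice_orange)
print(once_yellow, twice_yellow)
```

Under this model, any channel value strictly between 0 and 1 shrinks with each extra layer, while channels that are exactly 0 or 1 stay put. That matches the observation that the bug was only visible with floats such as 0.5.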

Jorj McKie