page.getTextBlocks()
Output
[(42.5, 86.45002746582031, 523.260009765625, 100.22002410888672, TEXT, 0, 0),
(65.75, 103.4000244140625, 266.780029296875, 159.59010314941406, TEXT, 1, 0),
(48.5, 86.123456, 438.292048492, 100.92920404974, TEXT, 0, 0)]
(x0, y0, x1, y1, "lines in block", block_type, block_no)
My main aim is:
to search for a text in a PDF and highlight it
The text to be searched can occur any number of times on a page. Using tp.search(text, hit_max=1)
limits the maximum number of occurrences, but that doesn't solve the problem: it selects the first occurrence of the text, whereas for me the second or third occurrence may be the important one.
My Idea is:
getTextBlocks extracts the text as shown above. Using this information, specifically the block_no, I want to run the page.searchFor
function on that particular block only. Logically it should be possible, but in practice I need help with how to do it.
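To make the idea concrete, here is a rough sketch of what I have in mind, not a tested solution. It searches the whole page and then keeps only the hits whose center falls inside the chosen block's rectangle. The helper names hits_in_block and highlight_in_block and the block_index parameter are made up for illustration. To avoid relying on the position of block_no inside the tuple, the block is picked by its position in the returned list, which is an assumption on my part. The method names are the older camelCase ones used above; newer PyMuPDF releases rename them to get_text("blocks"), search_for and add_highlight_annot, and there search_for appears to accept a clip rectangle that may make the filtering unnecessary.

```python
def hits_in_block(hits, block_bbox):
    """Keep only the hit rectangles whose center lies inside block_bbox.

    hits: iterable of rectangles that unpack to (x0, y0, x1, y1).
    block_bbox: (x0, y0, x1, y1) taken from one getTextBlocks() entry.
    """
    bx0, by0, bx1, by1 = block_bbox
    selected = []
    for hit in hits:
        x0, y0, x1, y1 = hit  # plain tuples work; fitz.Rect should unpack too
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        if bx0 <= cx <= bx1 and by0 <= cy <= by1:
            selected.append(hit)
    return selected


def highlight_in_block(page, text, block_index):
    """Highlight occurrences of text that fall inside one text block.

    page is assumed to be a PyMuPDF page object; block_index is the
    position of the block in the list returned by getTextBlocks().
    """
    blocks = page.getTextBlocks()
    block_bbox = blocks[block_index][:4]  # first four entries: x0, y0, x1, y1
    hits = hits_in_block(page.searchFor(text), block_bbox)
    for rect in hits:
        page.addHighlightAnnot(rect)
    return hits
```

With a real document this would be called as highlight_in_block(page, "some text", 1) and followed by saving the document. The center test is only there to tolerate small rounding differences between the block rectangle and the hit rectangles.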
I would appreciate any input on achieving the main aim.
Thanks