Using the following call, I can get the bounding boxes of the tables in my electronic PDF:
import tabula
tables = tabula.read_pdf(file, stream=True, guess=True, lattice=False, multiple_tables=True, output_format="json", pages=pg_num)
However, I want to draw the detected bounding boxes on an image of the page. I noticed that the x, y, w, h values of the tabula bounding boxes do not match the pixel coordinates of the image converted from the PDF with this script:
import numpy as np
from pdf2image import convert_from_path
pages = convert_from_path(file)  # one PIL image per page
open_cv_image = np.array(pages[pg_num - 1])  # pg_num is 1-based; the array is in RGB order
Any thoughts on how to map the tabula bounding-box coordinates onto the image exported from the PDF?
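One possible explanation, offered as a guess rather than a confirmed diagnosis: tabula appears to report boxes in PDF points (1/72 inch, measured from the top-left of the page), while `convert_from_path` renders at 200 DPI unless a `dpi` argument is given. If that is the cause, scaling each value by `dpi / 72` should line the boxes up. The sketch below assumes each table entry in the JSON output is a dict with `left`, `top`, `width` and `height` keys; `table_box_to_pixels` is a hypothetical helper name, and rotated or cropped pages may need extra handling.

```python
PDF_POINTS_PER_INCH = 72.0  # PDF user-space unit: 1 point = 1/72 inch


def table_box_to_pixels(table, dpi=200):
    """Scale one table entry from PDF points to pixel coordinates.

    Assumption: `table` has "left", "top", "width", "height" in points,
    with the origin at the top-left corner of the page.
    `dpi` must match the DPI used when rendering the page image.
    """
    scale = dpi / PDF_POINTS_PER_INCH
    x = int(round(table["left"] * scale))
    y = int(round(table["top"] * scale))
    w = int(round(table["width"] * scale))
    h = int(round(table["height"] * scale))
    return x, y, w, h


# Hypothetical usage with the variables from the question:
#   pages = convert_from_path(file, dpi=200)
#   for t in tables:
#       x, y, w, h = table_box_to_pixels(t, dpi=200)
#       cv2.rectangle(open_cv_image, (x, y), (x + w, y + h), (0, 255, 0), 2)
```

Passing the same `dpi` value explicitly to both `convert_from_path` and the helper avoids relying on the library default.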