I'm running pdftotext -bbox file.pdf, which produces word-level output. Is there a way to output coordinates at the character, phrase, line, or block level?

I'm interested in knowing if either the poppler or xpdf version of pdftotext can do this.

Uwe Keim

1 Answer

Sure, just use pdftotext -bbox-layout (the poppler version). In addition to words, it outputs bounding boxes for lines and blocks, grouped into flows.
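To work with that output programmatically, something like the following might help. It is only a sketch: it assumes the XHTML that poppler's pdftotext -bbox-layout writes nests page, flow, block, line and word elements, with xMin/yMin/xMax/yMax attributes on block, line and word. The helper names (run_pdftotext, parse_bbox_layout) are made up for illustration. As far as I know, this flag does not go down to character level.

```python
import subprocess
import xml.etree.ElementTree as ET

# Element names assumed from poppler's -bbox-layout XHTML output.
LEVELS = ("flow", "block", "line", "word")


def local(tag):
    """Strip an XML namespace prefix such as '{http://www.w3.org/1999/xhtml}'."""
    return tag.rsplit("}", 1)[-1]


def run_pdftotext(pdf_path):
    """Run poppler's pdftotext and return the XHTML it writes to stdout ('-')."""
    result = subprocess.run(
        ["pdftotext", "-bbox-layout", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    return result.stdout


def parse_bbox_layout(xhtml):
    """Return one dict per element that carries a bounding box, for every page."""
    root = ET.fromstring(xhtml)
    boxes = []
    page_no = 0
    for page in (el for el in root.iter() if local(el.tag) == "page"):
        page_no += 1
        for el in page.iter():
            level = local(el.tag)
            # Skip the page itself and any element without coordinates.
            if level not in LEVELS or "xMin" not in el.attrib:
                continue
            bbox = tuple(
                float(el.attrib[k]) for k in ("xMin", "yMin", "xMax", "yMax")
            )
            # Collapse the whitespace between nested word elements.
            text = " ".join("".join(el.itertext()).split())
            boxes.append(
                {"page": page_no, "level": level, "bbox": bbox, "text": text}
            )
    return boxes
```

Calling parse_bbox_layout(run_pdftotext("file.pdf")) and filtering on the "level" key should then give you the line-level or block-level boxes, each tagged with its page number.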

geisterfurz007
DaviRod
    But even if the PDF has multiple pages, 'pdftotext -bbox-layout' gives the layout for the 1st page only. Is there any way to get it for all the pages? – Jaydeep Feb 05 '21 at 04:20