I am using Tesseract for text recognition.

Is there a simple way to detect the padding between blocks of text and then create a file, e.g. a PDF or .doc, with the same padding?

Let's say the source page contains 3 columns of text (like a newspaper). How can I recognize this text while preserving the padding and margins between the columns and between the text and the page edges?

Can you suggest an example, a library that does this, or just an algorithm?
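To make concrete what I mean by "padding", here is a minimal sketch of the measurement I have in mind, not a definitive solution. It assumes the word bounding boxes are already available in pixels (Tesseract's hOCR and TSV outputs include such coordinates). The names `Box`, `page_margins`, and `column_gaps` are hypothetical helpers made up for illustration, and the sample coordinates are invented.

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """A word bounding box in pixels (hypothetical helper type)."""
    left: int
    top: int
    width: int
    height: int


def page_margins(boxes: List[Box], page_w: int, page_h: int) -> Tuple[int, int, int, int]:
    """Return the (left, top, right, bottom) whitespace around all boxes."""
    left = min(b.left for b in boxes)
    top = min(b.top for b in boxes)
    right = page_w - max(b.left + b.width for b in boxes)
    bottom = page_h - max(b.top + b.height for b in boxes)
    return left, top, right, bottom


def column_gaps(boxes: List[Box], min_gap: int) -> List[Tuple[int, int]]:
    """Return horizontal ranges (x_start, x_end) that no box covers
    and that are at least min_gap pixels wide."""
    spans = sorted((b.left, b.left + b.width) for b in boxes)
    gaps = []
    covered_until = spans[0][1]
    for start, end in spans[1:]:
        if start - covered_until >= min_gap:
            gaps.append((covered_until, start))
        covered_until = max(covered_until, end)
    return gaps


# Invented boxes for a 1000 x 1400 px page with three columns.
words = [
    Box(50, 100, 250, 20), Box(50, 130, 240, 20),     # column 1
    Box(350, 100, 250, 20),                           # column 2
    Box(650, 100, 300, 20), Box(650, 1300, 280, 20),  # column 3
]
margins = page_margins(words, page_w=1000, page_h=1400)
gaps = column_gaps(words, min_gap=30)
print(margins)  # (50, 100, 50, 80)
print(gaps)     # [(300, 350), (600, 650)]
```

This projection only works on a clean column layout: a headline spanning several columns would cover the gaps, so boxes would probably need to be grouped by block first. It also does nothing to write the measured spacing into a PDF or .doc file, which is the part I am missing.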

JRulle
Matrosov Oleksandr

0 Answers