iOS: recognize text and the separators between text blocks

Asked Feb 06 '15 at 13:31

Active Feb 06 '15 at 14:34

Viewed 105 times

I am using Tesseract for text recognition.

How can I detect the spacing between blocks of text and create, for example, a PDF or .doc file that preserves the same spacing?

Say the source page contains three columns of text, like a newspaper. How can I recognize the text while keeping the correct padding and margins between the columns and relative to the page?

Can you suggest an example, a library that does this, or just an algorithm?
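To make the column case concrete, here is a rough sketch of one possible approach, not a known library routine: project the word bounding boxes onto the horizontal axis and treat wide uncovered strips as page margins or column gutters. The names WordBox, Gap, findVerticalGaps and the minGap threshold are hypothetical and introduced only for illustration. The boxes are assumed to come from the OCR engine in image pixels.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical word bounding box in image pixels (assumed to come from OCR output).
struct WordBox { int left, top, right, bottom; };

// A vertical strip [start, end) of the page that no word box touches.
struct Gap { int start, end; };

// Returns every uncovered horizontal strip at least minGap pixels wide.
// Strips touching x = 0 or x = pageWidth are the page margins;
// the ones in between are candidate column gutters.
std::vector<Gap> findVerticalGaps(const std::vector<WordBox>& words,
                                  int pageWidth, int minGap) {
    std::vector<bool> covered(static_cast<std::size_t>(std::max(0, pageWidth)), false);
    for (const WordBox& w : words) {
        int from = std::max(0, w.left);
        int to = std::min(pageWidth, w.right);
        for (int x = from; x < to; ++x) covered[x] = true;
    }

    std::vector<Gap> gaps;
    int runStart = -1;  // start of the current uncovered run, or -1 if none
    for (int x = 0; x <= pageWidth; ++x) {
        bool empty = x < pageWidth && !covered[x];
        if (empty && runStart < 0) runStart = x;
        if (!empty && runStart >= 0) {
            if (x - runStart >= minGap) gaps.push_back({runStart, x});
            runStart = -1;
        }
    }
    return gaps;
}
```

The gap widths in pixels can then be scaled to the output page size to reproduce the margins. Skewed scans or columns of uneven height would need more than this simple projection.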

edited Feb 06 '15 at 14:34

JRulle

asked Feb 06 '15 at 13:31

Matrosov Oleksandr

Did you try using Tesseract's hOCR output? – tobltobs Feb 12 '15 at 12:49
@tobltobs No, I will try it right now. – Matrosov Oleksandr Feb 12 '15 at 13:32
@tobltobs So the idea is to use char *boxtext = _tesseract->GetBoxText(0);? – Matrosov Oleksandr Feb 12 '15 at 13:33
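
For reference, a minimal sketch of how the string from GetBoxText might be turned into coordinates. It assumes the usual Tesseract box-file layout of one symbol per line, "symbol left bottom right top page", with the origin at the bottom-left corner of the image; verify this against the Tesseract version in use. CharBox, parseBoxText and topFromPageTop are hypothetical helper names, not part of the Tesseract API.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One recognized symbol with its box, in the assumed bottom-left-origin coordinates.
struct CharBox {
    std::string symbol;
    int left, bottom, right, top, page;
};

// Parses box text line by line; lines that do not match the assumed
// "symbol left bottom right top page" layout are skipped.
std::vector<CharBox> parseBoxText(const std::string& text) {
    std::vector<CharBox> boxes;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        CharBox b;
        if (fields >> b.symbol >> b.left >> b.bottom >> b.right >> b.top >> b.page) {
            boxes.push_back(b);
        }
    }
    return boxes;
}

// Distance from the top edge of the page to the top of the box, which is
// what a top-left-origin layout (such as a PDF drawing context) would need.
int topFromPageTop(const CharBox& b, int pageHeight) {
    return pageHeight - b.top;
}
```

Character boxes give positions only. Grouping them into words, lines and columns is a separate step, which is why the hOCR output, with its per-word and per-line boxes, may be the easier starting point.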

0 Answers