
I'm working with Tesseract on Android, and I have the following code to extract the string and the boxes read from an image:

TessBaseAPI baseApi = new TessBaseAPI();
baseApi.init(tess_path, "eng");
baseApi.setImage(bitmap);
// Full recognized text, including spaces and line breaks.
String recognizedText = baseApi.getUTF8Text();
// Character components and their bounding boxes.
Pixa pixa = baseApi.getCharacters();
ArrayList<Rect> boxes = pixa.getBoxRects();
baseApi.end();

This gives me the text and a box for each character, but sometimes the text has a different length than the boxes array, so I cannot match each box to the character that was read.

Is there any way to obtain each character together with its exact box?

martijno
user2021731

1 Answer


Use a ResultIterator instead of getCharacters():

// Iterate through the results.
final ResultIterator iterator = baseApi.getResultIterator();
String lastUTF8Text;
float lastConfidence;
int count = 0;
iterator.begin();
do {
    lastUTF8Text = iterator.getUTF8Text(PageIteratorLevel.RIL_WORD);
    lastConfidence = iterator.confidence(PageIteratorLevel.RIL_WORD);
    count++;
} while (iterator.next(PageIteratorLevel.RIL_WORD));
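The loop above reads text and confidence per word. To keep each character attached to its own box, the text and the box can be read in the same iteration step. The following is only a minimal sketch of that pairing pattern: `SymbolSource`, `SymbolBox`, `pair` and `fake` are hypothetical names introduced here so the example runs without an OCR engine. With tess-two, the corresponding calls would presumably be `iterator.getUTF8Text(PageIteratorLevel.RIL_SYMBOL)`, `iterator.getBoundingBox(PageIteratorLevel.RIL_SYMBOL)` and `iterator.next(PageIteratorLevel.RIL_SYMBOL)`.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: pair each recognized symbol with the box read in the same step. */
class BoxPairing {

    /** Hypothetical stand-in for the iterator calls used here (not a tess-two type). */
    interface SymbolSource {
        void begin();     // rewind to the first symbol
        String text();    // text of the current symbol, or null if there is none
        int[] box();      // {left, top, right, bottom} of the current symbol
        boolean next();   // advance; returns false when no symbols remain
    }

    /** One symbol together with its bounding box. */
    static final class SymbolBox {
        final String text;
        final int[] box;

        SymbolBox(String text, int[] box) {
            this.text = text;
            this.box = box;
        }
    }

    /** Walks the source once, so text and box can never get out of step. */
    static List<SymbolBox> pair(SymbolSource source) {
        List<SymbolBox> result = new ArrayList<>();
        source.begin();
        do {
            String text = source.text();
            if (text != null) {
                result.add(new SymbolBox(text, source.box()));
            }
        } while (source.next());
        return result;
    }

    /** In-memory source, used only to exercise pair() without running OCR. */
    static SymbolSource fake(String[] texts, int[][] boxes) {
        return new SymbolSource() {
            private int i = 0;

            public void begin() { i = 0; }
            public String text() { return i < texts.length ? texts[i] : null; }
            public int[] box() { return boxes[i]; }
            public boolean next() { return ++i < texts.length; }
        };
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration, not real OCR output.
        SymbolSource source = fake(
                new String[] {"H", "i"},
                new int[][] {{0, 0, 10, 20}, {12, 0, 18, 20}});
        for (SymbolBox sb : pair(source)) {
            System.out.println(sb.text + " left=" + sb.box[0] + " top=" + sb.box[1]
                    + " right=" + sb.box[2] + " bottom=" + sb.box[3]);
        }
    }
}
```

Because each entry is built from a single iterator position, the list of characters and the list of boxes always have the same length, unlike comparing the full text string against a separately fetched boxes array.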
rmtheis
  • But the iterator doesn't provide boxes (a `Pixa` on which one can call `getBoxRects()`). I noticed `getCharacters` was deprecated in tess-two, but it is the only way to get character-level box info... – martijno Nov 07 '13 at 20:34
  • You can use `getBoundingBox` or `getBoundingRect` on the iterator. – rmtheis Jun 25 '15 at 00:54
  • Hi, I am using this project https://github.com/rmtheis/android-ocr and need to read small text accurately from an image (larger text scans fine). I am stuck here. The image may contain tables or spaces, or it may be a bill such as a bank-generated slip. I need to get the small text from bills. Please help me ASAP. Thanks in advance. – Naveen Aug 03 '16 at 07:28
  • @Naveen If you have a specific question, please create a new question on StackOverflow and include all relevant details and sample images. – rmtheis Aug 03 '16 at 19:18
  • Hi, I cannot ask a question and I don't know why. I need to scan small text from a hard copy (such as current bills) using this project: https://github.com/rmtheis/android-ocr. How can I scan smaller text? Please help me ASAP. Thanks in advance. – Naveen Aug 08 '16 at 05:41