From the following image, I want to extract the number below the text "Arzt-Nr" (654321161).
I've used an OCR reader, but it returns the text in no particular sequence, which makes it difficult to write logic that picks out the number below "Arzt-Nr".

I've used the following code, but the text does not come out in sequence.
Is there any way to achieve this?

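    // detectedItems holds the TextBlocks returned by the text recognizer.
    // A StringBuilder avoids creating a new String on every concatenation.
    StringBuilder text = new StringBuilder();
    for (int i = 0; i < detectedItems.size(); i++) {
        TextBlock item = detectedItems.valueAt(i);
        List<Line> lines = (List<Line>) item.getComponents();
        for (Line line : lines) {
            List<Element> elements = (List<Element>) line.getComponents();
            for (Element element : elements) {
                // Append each recognized word, separated by a space
                text.append(' ').append(element.getValue());
            }
            // One output line per recognized line of text
            text.append('\n');
        }
    }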


Ragini

2 Answers

Try looking for a token of fixed length after the position of "Arzt-Nr", and also check the pattern of the word that was found, for example whether it consists only of digits.
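
As a rough sketch of that idea, the snippet below searches the OCR text for the first run of exactly nine digits that appears after the keyword. The nine-digit length is an assumption taken from the example value 654321161, and the class and method names are made up for illustration. It only helps when the number really does come after the keyword in the OCR output.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArztNrExtractor {
    // Assumption: the wanted number is exactly 9 digits long (like 654321161).
    // The lookarounds reject 9-digit slices of longer digit runs.
    private static final Pattern NINE_DIGITS =
            Pattern.compile("(?<!\\d)\\d{9}(?!\\d)");

    /** Returns the first 9-digit token found after the keyword, if any. */
    public static Optional<String> extractAfterKeyword(String ocrText, String keyword) {
        int start = ocrText.indexOf(keyword);
        if (start < 0) {
            return Optional.empty(); // keyword was not recognized at all
        }
        Matcher matcher = NINE_DIGITS.matcher(ocrText);
        // Begin searching right after the keyword
        if (matcher.find(start + keyword.length())) {
            return Optional.of(matcher.group());
        }
        return Optional.empty();
    }
}
```

With the loop from the question this could be called as `ArztNrExtractor.extractAfterKeyword(text.toString(), "Arzt-Nr")`. If another nine-digit field is read between the keyword and the wanted value, this picks the wrong one, so a position-based lookup may be more robust.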

Extract the TSV output for the image using Tesseract and find the nearest text below the location of the keyword. Also have a look at Tesseract's page segmentation modes.

Link to Generating tsv Link to use page segmentation
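
A minimal sketch of the "nearest text below" lookup, assuming the TSV has already been read into a string. It relies on Tesseract's TSV column order (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text), where level 5 rows are words. The class names and the overlap rule are illustrative choices, not part of Tesseract.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class NearestBelow {
    /** One recognized word with its bounding box in pixels. */
    static final class Word {
        final String text; final int left, top, width, height;
        Word(String text, int left, int top, int width, int height) {
            this.text = text; this.left = left; this.top = top;
            this.width = width; this.height = height;
        }
        int right() { return left + width; }
    }

    /** Keeps only word-level rows (level == 5) with non-empty text. */
    public static List<Word> parseTsv(String tsv) {
        List<Word> words = new ArrayList<>();
        for (String row : tsv.split("\\R")) {
            String[] c = row.split("\t", -1);
            // The header row and non-word rows fail the level check
            if (c.length < 12 || !c[0].equals("5") || c[11].trim().isEmpty()) {
                continue;
            }
            words.add(new Word(c[11].trim(), Integer.parseInt(c[6]),
                    Integer.parseInt(c[7]), Integer.parseInt(c[8]),
                    Integer.parseInt(c[9])));
        }
        return words;
    }

    /** Topmost word that starts lower than the keyword and overlaps it horizontally. */
    public static Optional<String> nearestBelow(List<Word> words, String keyword) {
        Optional<Word> found = words.stream()
                .filter(w -> w.text.startsWith(keyword))
                .findFirst();
        if (!found.isPresent()) {
            return Optional.empty();
        }
        Word key = found.get();
        return words.stream()
                .filter(w -> w.top > key.top)                                // starts lower on the page
                .filter(w -> w.left < key.right() && w.right() > key.left)   // shares a column
                .min(Comparator.comparingInt((Word w) -> w.top))             // closest one
                .map(w -> w.text);
    }
}
```

Calling `NearestBelow.nearestBelow(NearestBelow.parseTsv(tsv), "Arzt-Nr")` returns the word directly under the label, whatever order the OCR engine emitted the words in. On skewed scans the horizontal-overlap test may need some tolerance.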