iOS - How to recognise text as blocks using Tesseract OCR

Asked Nov 12 '18 at 12:07

Active Nov 12 '18 at 12:07

Viewed 625 times

I am using Google's Tesseract OCR (https://github.com/gali8/Tesseract-OCR-iOS) to convert images to text in my iOS app.

I can scan an image and get the recognized string with the following code:

let tesseract: G8Tesseract = G8Tesseract(language: "eng")
tesseract.delegate = self
tesseract.image = imageTaken // image taken from camera
tesseract.engineMode = .tesseractCubeCombined
tesseract.recognize()
print(tesseract.recognizedText)

It reads the text line by line across the whole image, so lines from different paragraphs get mixed together, like this:

Image 1

How can I get the text as separate blocks and read the lines of each block on their own, like this?

Image 2

Things I tried:

print(tesseract.recognizedBlocks(by: .block))
print(tesseract.recognizedBlocks(by: .paragraph))

Both calls still mix lines from different paragraphs and treat text from separate blocks as a single line, as shown in Image 1.

Any help will be appreciated. Thanks in advance.

Tester
