I am using Google's Tesseract OCR through Tesseract-OCR-iOS (https://github.com/gali8/Tesseract-OCR-iOS) to convert images to text in my iOS app.
I'm able to scan an image and get the recognized string using the following code:
// Create the OCR engine for English.
let tesseract: G8Tesseract = G8Tesseract(language: "eng")
tesseract.delegate = self
tesseract.image = imageTaken // image taken from the camera
tesseract.engineMode = .tesseractCubeCombined
tesseract.recognize() // runs recognition synchronously
print(tesseract.recognizedText)
It scans and returns the text line by line across the whole image, so lines from different paragraphs get mixed together, like this:
Image 1
How can I get the text as separate blocks and read the lines of each block separately, like this?
Image 2
Things I tried:
- print(tesseract.recognizedBlocks(by: .block))
- print(tesseract.recognizedBlocks(by: .paragraph))
It still mixes lines from different paragraphs and treats the text as a single line, as shown in Image 1.
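For reference, here is a minimal sketch of how I inspect the per-block results rather than printing the raw array. It assumes the elements returned by `recognizedBlocks(by:)` are `G8RecognizedBlock` objects exposing `text` and `boundingBox` (as I understand the library's header), and that the text of a block contains newline separators between its lines. The helper names `nonEmptyLines(in:)` and `dumpBlocks(of:level:)` are my own, not part of the library.

```swift
import UIKit
import TesseractOCR

// Hypothetical helper: splits a block's text into its non-empty lines.
func nonEmptyLines(in text: String) -> [String] {
    return text.components(separatedBy: .newlines).filter { !$0.isEmpty }
}

// Hypothetical helper: prints every recognized block with its bounding box,
// followed by the lines found inside that block.
func dumpBlocks(of tesseract: G8Tesseract, level: G8PageIteratorLevel) {
    // The result is bridged from an untyped NSArray, so cast it defensively.
    let blocks = (tesseract.recognizedBlocks(by: level) as? [G8RecognizedBlock]) ?? []
    for (index, block) in blocks.enumerated() {
        print("Block \(index) at \(block.boundingBox)")
        // Assumption: `text` may be bridged as an optional, hence the fallback.
        for line in nonEmptyLines(in: block.text ?? "") {
            print("  \(line)")
        }
    }
}
```

Calling `dumpBlocks(of: tesseract, level: .block)` after `tesseract.recognize()` makes it easy to see from the bounding boxes whether the two paragraphs are being detected as one wide block.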
Any help will be appreciated. Thanks in advance.