I'm building a parsing algorithm for images. Tesseract is not giving me the accuracy I need, so I'm thinking of building a CNN+LSTM-based model for image-to-text conversion. Is my approach the right one? Can I extract only the required string directly from the CNN+LSTM model instead of using NLP? Or do you see any other ways to improve Tesseract's accuracy?
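For reference, the kind of CNN+LSTM (CRNN-style) model I have in mind looks roughly like this. This is only a minimal PyTorch sketch: the layer sizes, the 37-symbol alphabet, and the fixed 32-pixel input height are placeholder assumptions, and the CTC loss/decoding needed for training is omitted.

```python
import torch
import torch.nn as nn

class CRNN(nn.Module):
    """Sketch of a CNN feature extractor feeding a bidirectional LSTM.

    Assumes grayscale input images of height 32; width is treated as the
    sequence (time) axis after the convolutions downsample it by 4x.
    """
    def __init__(self, num_classes=37):  # placeholder alphabet size
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # After two 2x poolings, height 32 -> 8, so each time step carries
        # 64 channels * 8 rows = 512 features.
        self.lstm = nn.LSTM(input_size=64 * 8, hidden_size=128,
                            bidirectional=True, batch_first=True)
        self.fc = nn.Linear(256, num_classes + 1)  # +1 for the CTC blank

    def forward(self, x):                 # x: (B, 1, 32, W)
        f = self.cnn(x)                   # (B, 64, 8, W/4)
        b, c, h, w = f.shape
        f = f.permute(0, 3, 1, 2).reshape(b, w, c * h)  # (B, W/4, 512)
        out, _ = self.lstm(f)             # (B, W/4, 256)
        return self.fc(out)               # per-timestep class logits

model = CRNN()
logits = model(torch.rand(2, 1, 32, 128))
print(logits.shape)  # torch.Size([2, 32, 38])
```

The idea is that the CNN reads the image left to right and the LSTM turns the column features into a character sequence; training would use `nn.CTCLoss` against the target strings.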
NLP is used to let a network try to "understand" text. I think what you want here is to detect whether (and where) a picture contains text. For that, NLP is not required, since you are not asking the network to analyze or understand the text; this is closer to an object detection problem.
There are many models that do object detection. Some off the top of my head are YOLO, R-CNN, and Mask R-CNN.

asong24