-2

I am working on a program to extract text from images. I tried the Tesseract and OCRopus libraries, and I am able to convert simple plain text (black on white, in a simple font) from an image to a text string. For example:

Example of image 1

But I am not able to extract text from a complex image. Take this image, for example:

Example of image 2

Does anyone know how to achieve this? Is there any library available for extracting text from complex images (with unpredictable, varying backgrounds)? I would prefer Python, but the language is not a constraint.

Moinuddin Quadri
  • Have you heard of captcha codes? The reason why they are used is that it is arbitrarily hard for computers to detect the text if it is distorted like in the logo you want to read. But I would start by trying to convert the logo into at least something similar to black on white text and see what goes from there ;) – meetaig Aug 31 '16 at 20:45
  • When you have the text as binary image you can try extracting the skeleton of the letters. This should work for this image anyway. But for all logos - no way. For example negative space can be really hard to teach to a computer – meetaig Aug 31 '16 at 20:46
  • Is there a way to make my program recognize only the text and remove the noise (background)? For example, remove everything from the image except the `Sprite` text. After that it will be easier to convert it to a string. – Moinuddin Quadri Aug 31 '16 at 20:49
  • try segmenting by color ;) your text is white everything else is not – meetaig Aug 31 '16 at 20:49
  • @meetaig: But that will only work for this special case. The tough part is that I want my program to read text from any kind of image, i.e. any text present within the image – Moinuddin Quadri Aug 31 '16 at 20:51
  • Have you read my other comments? As I said to have a general solution for ALL logos is very hard if not impossible – meetaig Aug 31 '16 at 20:52
  • You can probably try to train a recurrent neural network, though you would need a big training dataset and the success will greatly depend on how diverse the logos are. – Eli Korvigo Aug 31 '16 at 20:55
  • @meetaig: I know that, which is why I do not want to write my own algorithm for extracting text. I just wanted to find out whether others know of any library or tool to achieve this. It doesn't matter which language it is in, as long as it meets my requirement. – Moinuddin Quadri Aug 31 '16 at 20:55
  • @EliKorvigo: You mean to store some set of images as dataset, and match the new logo based on my available data sets? – Moinuddin Quadri Aug 31 '16 at 20:57
  • Probably not. As far as I understand, what you describe is something similar to k-nearest neighbours. That is not learning. An RNN actually learns to read new logos, that it hasn't seen before, but the method has its limits for sure. – Eli Korvigo Aug 31 '16 at 21:07
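
The color-segmentation suggestion from the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the image is a nested list of RGB tuples, the text is near-white, and the helper name `segment_near_white` and its `threshold` default are hypothetical. As noted in the comments, it only helps for images like the Sprite example and is not a general solution.

```python
def segment_near_white(pixels, threshold=200):
    """Turn near-white pixels into black (0) and everything else into white (255).

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    The threshold of 200 is an assumed value, not a tuned one.
    """
    binary = []
    for row in pixels:
        # A pixel counts as "text" only if all three channels are bright.
        binary.append([0 if min(r, g, b) >= threshold else 255 for (r, g, b) in row])
    return binary


# Hypothetical 2x3 image: green background with a white vertical stroke.
GREEN = (0, 150, 60)
WHITE = (250, 250, 250)
image = [
    [GREEN, WHITE, GREEN],
    [GREEN, WHITE, GREEN],
]
binary = segment_near_white(image)
```

The result is black text on a white background, which is the kind of input the question says the OCR libraries already handle.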

1 Answer

3

The way all this word recognition works is that a machine learning algorithm is fed a lot of images together with the text they are already known to contain. From these it learns to recognize letters in the different fonts and appearances it has been given.

However, logos are made with very specific fonts, and almost no two logos use a similar one. That makes it very hard, if not impossible, to build a training dataset that lets an algorithm read what is written in an arbitrary logo.

It is possible to train an algorithm to recognize the Sprite trademark wherever it sees it. For that, you'd need to use OpenCV and train it on Sprite logos of different qualities: pictures of the Sprite logo on stores, bottles, etc. You will also need a dataset of non-Sprite images, such as the Coke logo or something completely irrelevant, like a cat. That way, it will be able to detect this particular logo.
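
To make the positive/negative dataset idea concrete, here is a minimal sketch. It does not use OpenCV's training tools; it swaps in a plain perceptron written from scratch, and it assumes each picture has already been reduced to two made-up features (fraction of green pixels, fraction of white pixels). The feature values, the labels, and the helper names `train_perceptron` and `is_logo` are all hypothetical.

```python
def train_perceptron(samples, labels, max_epochs=1000):
    """Fit a linear classifier; labels are +1 (logo) or -1 (not the logo)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified: nudge the boundary toward x
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                errors += 1
        if errors == 0:  # every training sample is now classified correctly
            break
    return weights, bias


def is_logo(weights, bias, x):
    """Return True if the feature vector x falls on the 'logo' side."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0


# Assumed features: (fraction of green pixels, fraction of white pixels).
positives = [(0.7, 0.2), (0.6, 0.3), (0.8, 0.15)]   # pictures of the logo
negatives = [(0.1, 0.1), (0.2, 0.05), (0.3, 0.6)]   # other logo, cat, etc.
samples = positives + negatives
labels = [1] * len(positives) + [-1] * len(negatives)

weights, bias = train_perceptron(samples, labels)
```

Without the negative examples the classifier would have nothing to separate the logo from, which is why the non-Sprite pictures are needed. A real detector would work on far richer features than these two numbers.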

The reason humans, unlike computers, can learn to recognize these things is that the human brain is so much more powerful than any neural network you can currently build on a computer for this kind of task. When computers become as powerful as humans in terms of computational capacity, re-ask this question and you will receive an automatic answer from a human-like machine.

Dmitry Torba