I need some advice on segmenting connected characters (see the image below).

As you can see, C and U, as well as 4, 9 and 9, are connected, so when I try to draw contours they are merged into a single block. Unfortunately, there are plenty of such problematic images, so I need a solution that works in general.

I have tried different morphological transforms (erosion, dilation, opening), but none of them solved the problem.

Thanks in advance for any recommendations.

enter image description here

Don Draper

1 Answer

I think the best solution is to improve the preprocessing, if that is possible.

Otherwise, you can try machine learning techniques. You may get inspiration from Viola-Jones or Histograms of Oriented Gradients + SVM (these algorithms solve a different problem from optical character recognition, but I got plenty of insights from them). In other words, try sliding a window of a predefined aspect ratio horizontally along the text and recognizing the character at each position. The drawback is that you will need to train a model, which may require a lot of data.
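To make the sliding-window idea concrete, here is a minimal sketch in plain Python. It is not a full recognizer: the image is assumed to be a list of pixel rows covering one text line, and `classify` is a hypothetical callback standing in for whatever trained model you end up with. The names `sliding_windows` and `recognize_line`, as well as the default `aspect`, `step` and `min_score` values, are illustrative assumptions.

```python
from typing import Callable, List, Optional, Tuple

# Assumed representation: one text line as a list of pixel rows.
Image = List[List[int]]
# Hypothetical classifier: returns (label, score) for a crop, or None.
Classifier = Callable[[Image], Optional[Tuple[str, float]]]


def sliding_windows(width: int, height: int, aspect: float, step: int):
    """Yield (x0, x1) column ranges of full-height windows.

    The window width is derived from the line height and a fixed
    width/height aspect ratio.
    """
    win_w = max(1, round(height * aspect))
    for x0 in range(0, max(1, width - win_w + 1), step):
        yield x0, min(width, x0 + win_w)


def recognize_line(image: Image, classify: Classifier,
                   aspect: float = 0.6, step: int = 2,
                   min_score: float = 0.5):
    """Run the classifier on every window and keep confident hits."""
    height, width = len(image), len(image[0])
    hits = []
    for x0, x1 in sliding_windows(width, height, aspect, step):
        crop = [row[x0:x1] for row in image]
        result = classify(crop)
        if result is not None and result[1] >= min_score:
            hits.append((x0, x1, result[0], result[1]))
    return hits
```

Neighbouring windows will usually fire on the same character, so in practice the hits would still need to be merged, for example by keeping only the best-scoring window among overlapping ones.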

As I said earlier, it may be a good idea to reconsider the image preprocessing step. By the way, for "C" and "U" specifically, erosion may help.
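As a rough illustration of why erosion can separate two characters joined by a thin bridge, here is a small self-contained sketch on a toy binary grid (1 = ink, 0 = background). It uses a 3x3 square structuring element and 4-connectivity, both of which are assumptions chosen for the example; in OpenCV the corresponding calls would be `cv2.erode` and `cv2.connectedComponents`.

```python
from typing import List

Binary = List[List[int]]  # 1 = foreground (ink), 0 = background


def erode(img: Binary) -> Binary:
    """Binary erosion with a 3x3 square structuring element.

    A pixel stays foreground only if it and all 8 neighbours are
    foreground; pixels outside the image count as background.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(img[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out


def count_components(img: Binary) -> int:
    """Count 4-connected foreground blobs with an iterative flood fill."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if not img[sy][sx] or seen[sy][sx]:
                continue
            count += 1
            seen[sy][sx] = True
            stack = [(sy, sx)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w \
                            and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return count
```

Comparing `count_components(img)` with `count_components(erode(img))` shows whether one pass was enough to split a blob. Keep in mind that erosion thins every stroke, so it can also break or erase thin parts of the characters themselves.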

Good luck!:)

Bolat Tleubayev
  • Thanks for answering. The question is what kind of preprocessing to use to make the characters thinner. Overall, Otsu thresholding works great, except that all the characters (as well as the noise in between) get larger, which merges several characters into one blob. I know that different combinations of opening and closing might help here, but I haven't managed to separate the characters. Another option is adaptive thresholding, but in that case it is impossible to choose a threshold value that works on the majority of images. – Don Draper Apr 24 '19 at 13:30
  • @DonDraper You are more than welcome. Could you please edit your question to add the original image, and then let me know here in the comments? If it is dark text on a bright surface or vice versa, I think a simple threshold with some morphological operations should work, but again, it would be better if you provided the original image. – Bolat Tleubayev Apr 25 '19 at 03:08