
The goal is to make an app which can recognize egg markings, for example 0-DE-134461. I tried both Tesseract and the Google Vision API on the following images. The results from both OCR engines are disastrous.

German Egg Spanish Egg

0-DE-46042

Tesseract → ""
Google Vision API → " 2 "

3-ES08234-25591

Tesseract → ""
Google Vision API → " Es1234-2SS ) R SHAH That is part "

Cropped

I manually cropped the images with Photoshop.

German Egg - Cropped Spanish Egg - Cropped

0-DE-46042

Tesseract → ""
Google Vision API → ""

3-ES08234-25591

Tesseract → "3—E503‘234-gg"
Google Vision API → " -ESOT23-2559 ) "

Thresholded

I color-selected the text on both eggs manually with Photoshop and removed the background.

German Egg - Thresholded Spanish Egg - Thresholded

0-DE-46042

Tesseract → "O—DE—46042"
Google Vision API → " O-DE-46042 "

3-ES08234-25591

Tesseract → ""
Google Vision API → " 3-ESO8234-9 "

Removing the circular warp?

I assume that the last preprocessing step should be removing the circular warp, but I don't know how to do that manually in Photoshop, let alone how to automate it.


My questions

  • Am I heading in the right direction?
  • Are my preprocessing steps correct?
  • What would be the approach to automate these steps in, say, OpenCV?

Extra info

The command I used to get the tesseract OCR results:

λ tesseract {egg_picture}.jpg --psm 7 stdout

The tesseract version:

λ tesseract --version
tesseract 4.0.0-alpha.20170804
 leptonica-1.74.4
  libgif 4.1.6(?) : libjpeg 8d (libjpeg-turbo 1.5.0) : libpng 1.6.20 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.4.3 : libopenjp2 2.1.

Platform: Windows 10


Edit 1

I applied adaptive thresholding on some egg marking images with OpenCV. These are the results so far:

IMG A (edit 1) IMG C (edit 1) IMG B (edit 1) IMG H (edit 1) IMG D (edit 1) IMG E (edit 1) IMG G (edit 1) IMG I (edit 1)

However, there's still lots of noise. I'm struggling to adjust the parameters so that it works well across different images.


Tomasito665
  • Yes, I think you'll have to correct the circular warp. For color thresholding I would try the HSV color space, but I'm not sure whether the red (font) and the orange (egg) are far enough apart... – Micka Sep 01 '17 at 09:54
  • To correct the circular warp you can apply a homography. You just have to map points lying on a curved line to a line parallel to the top-left point (for the upper surface; you can do the same for the bottom surface). Detecting the points lying on a curved surface from a binary image should not be a difficult task. – Optimus 1072 Sep 01 '17 at 14:07
  • https://developers.googleblog.com/2017/09/how-machine-learning-with-tensorflow.html – rmtheis Oct 28 '17 at 13:15

1 Answer


I have a suggestion.

I tried applying local histogram equalization for all the three channels in the BGR color space and then merged them.

Result:

[Two images showing the equalized results]

With the details in the image enhanced, you can move on to further preprocessing of these images.

I also tried globally equalizing the histograms of the three channels separately. The images, although clearer than the originals, lacked depth of detail.

Jeru Luke
  • Thanks for the histogram equalization trick on the RGB channels. The egg code definitely stands out more. However, I've not been able to successfully binarize the text. Any idea? – Tomasito665 Sep 05 '17 at 11:16
  • I have not tried anything yet. Try performing a difference of Gaussians with different kernel sizes, or adaptive thresholding. – Jeru Luke Sep 05 '17 at 11:58
  • I've tried adaptive thresholding, but there's lots of binary noise. I've added images on the question of the results. – Tomasito665 Sep 07 '17 at 14:08