
I need to OCR small images that contain a price, using Python 2.7.

[image]

As you can see, the image is very small and contains some values.

My goal is to decode it to: 654.10

I tried Tesseract, but I had no luck.

    from PIL import Image
    import pytesseract
    # --psm 10 treats the image as a single character; --oem 3 selects the default OCR engine
    print(pytesseract.image_to_string(Image.open('example.png'), lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=€0123456789'))

I get:

€553 1

I tried an online converter (https://convertio.co/it/ocr/) and it worked like a charm, so I think it should be possible.

Does anyone have a better idea?

Thanks

(Sorry for my bad English.)

UPDATE:

I tried thresholding the image, again without any luck:

    import cv2
    import pytesseract
    from PIL import Image

    img = cv2.imread('cropped.png')
    grayscaled = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # With THRESH_OTSU the threshold is chosen automatically, so the 125 passed here is ignored
    retval, threshold2 = cv2.threshold(grayscaled, 125, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite('threshold.jpeg', threshold2)
    print(pytesseract.image_to_string(Image.open('threshold.jpeg'), lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789'))

Output: 553 0

Image output: [image]

PS: I cropped the original image to remove the € sign, but I still got the wrong result.

Thanks

ilmetu
  • Pre-processing the image to make the text clearer may help. Something similar to 'threshold' in photoshop. – Paandittya Jan 17 '18 at 23:07
  • @Paandittya Unfortunately it doesn't work. Any other ideas? Thanks – ilmetu Jan 18 '18 at 22:19
  • I was thinking along the lines of resizing the image to 2x-3x its size and then trying to extract the text. In the process I came across this thread: [Link](https://stackoverflow.com/a/4945131/7636315). If you have not tried this yet, give it a go; it looks like a similar problem to the one you are facing. Hope this does the trick. – Paandittya Jan 19 '18 at 10:12
  • @Paandittya It works, thanks a lot. – ilmetu Jan 20 '18 at 01:14
  • You are welcome mate :) – Paandittya Jan 20 '18 at 07:59

1 Answer


I solved my issue by resizing the image and then applying the threshold.

This code increases the dimensions of the image:

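    from PIL import Image

    basewidth = 300  # target width in pixels
    img = Image.open(saved_location)  # saved_location: path to the small image
    # Scale the height by the same factor as the width to keep the aspect ratio
    wpercent = basewidth / float(img.size[0])
    hsize = int(float(img.size[1]) * wpercent)
    # Image.LANCZOS is the current name for the filter formerly called Image.ANTIALIAS
    img = img.resize((basewidth, hsize), Image.LANCZOS)
    img.save(saved_location)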
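The snippet above only covers the resize step. A rough sketch of the whole "resize, then threshold, then OCR" sequence might look like the following. The function names (`scaled_size`, `preprocess`, `read_price`), the fixed cutoff of 128, and the `--psm 7` single-line mode are assumptions for illustration, not what was actually run; the question used Otsu thresholding through OpenCV, and a plain Pillow cutoff is swapped in here to keep the example short. Some Tesseract versions may also ignore the character whitelist.

```python
from PIL import Image


def scaled_size(width, height, basewidth=300):
    """Return (basewidth, new_height), preserving the aspect ratio."""
    wpercent = basewidth / float(width)
    return basewidth, int(round(height * wpercent))


def preprocess(img, basewidth=300, cutoff=128):
    """Upscale first, then binarize (the order reported to work above)."""
    # Image.LANCZOS is the same filter as the older Image.ANTIALIAS name.
    new_size = scaled_size(img.size[0], img.size[1], basewidth)
    big = img.resize(new_size, Image.LANCZOS)
    gray = big.convert('L')
    # Fixed cutoff (an assumption); Otsu would pick the value automatically.
    return gray.point(lambda p: 255 if p > cutoff else 0)


def read_price(path):
    """OCR one line of digits; needs the Tesseract binary installed."""
    import pytesseract  # imported lazily so preprocess() works without it
    # --psm 7 (single text line) is an assumption; the question used --psm 10.
    config = '--psm 7 -c tessedit_char_whitelist=0123456789.'
    cleaned = preprocess(Image.open(path))
    return pytesseract.image_to_string(cleaned, config=config).strip()
```

Calling `read_price('example.png')` would then return whatever text Tesseract recognizes. If the result is still wrong, `basewidth` and `cutoff` are the two values worth experimenting with.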

Thanks to the user who posted the link in the comments.

ilmetu