I have a Django app deployed on Heroku. I'm trying to read text from an image using pytesseract. The app runs on localhost without problems, but on Heroku it shows this error: Error opening data file /app/vendor/tesseract-ocr/tessdata/eng.traineddata. This happens even after I added the pytesseract buildpacks as mentioned here.

import os
import pytesseract
from django.conf import settings
from PIL import Image

def ocr(serializer):
    # ImageModel is the app's own model, imported from its models module
    imgObject = ImageModel.objects.get(id=serializer.data['id'])
    imgPath = (os.path.join(settings.MEDIA_ROOT, imgObject.image.name))
    InputFile = str(imgPath).replace("\\", "/")
    pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
    return pytesseract.image_to_string(Image.open(InputFile))

1 Answer

It looks like this line:

pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'

tells pytesseract to look for the Tesseract binary at a Windows path. That path won't exist on Heroku. The buildpack may already handle this part of the configuration. Have you tried commenting out this line to see whether it works?
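If you want to keep the same code working both locally and on Heroku, one option is to set the path only on Windows and otherwise use whatever tesseract executable is on PATH. This is only a rough sketch: the helper names resolve_tesseract_cmd and configure_tesseract are made up for illustration, and it assumes the buildpack puts a tesseract executable on PATH, which I haven't verified. It also doesn't touch the tessdata location from your error message. Tesseract reads that from the TESSDATA_PREFIX environment variable, so that may need checking separately.

```python
import platform
import shutil

# Path copied from the question; only meaningful on a Windows dev machine.
WINDOWS_TESSERACT = "C:/Program Files (x86)/Tesseract-OCR/tesseract"


def resolve_tesseract_cmd(system=None):
    """Return a tesseract command for this platform, or None if none is found.

    Hypothetical helper: `system` can be passed explicitly, otherwise it
    is detected with platform.system().
    """
    system = system or platform.system()
    if system == "Windows":
        return WINDOWS_TESSERACT
    # Elsewhere (e.g. Linux), look the executable up on PATH.
    # Assumption: the buildpack has put `tesseract` there.
    return shutil.which("tesseract")


def configure_tesseract(pytesseract_module, system=None):
    """Set tesseract_cmd only when a command was resolved.

    If nothing is resolved, pytesseract's own default is left untouched.
    """
    cmd = resolve_tesseract_cmd(system)
    if cmd:
        pytesseract_module.pytesseract.tesseract_cmd = cmd
    return cmd
```

You would call configure_tesseract(pytesseract) once, in place of the hard-coded assignment inside ocr().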

Nathan Loyer