This is my first question here, so although I'll try my best to ask it correctly, please have patience with me. I'm trying to run OCR with Tesseract from Django on a hosted server (PythonAnywhere, if that matters), but I keep getting this error:
pytesseract.pytesseract.TesseractError: (1, 'Tesseract Open Source OCR Engine v3.04.01 with Leptonica
Error opening data file /usr/share/tesseract-ocr/tessdata/heb.traineddata Please make sure the
TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed
loading language \'heb\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
So, at first, I thought I could just move the correct "tessdata" file (which exists on my server) into /usr/share/bin... but I couldn't do that without root access. No matter what I tried in the Bash shell, I can't get access to the root user (although I was never asked to set one up). I cannot use the "sudo" command that I see so often; I guess it's because it's not a valid command in my Bash shell (or Unix, I'm not sure how to refer to it). I guess I have a root user named "Orikle", but I couldn't find a working password for it (I tried my PythonAnywhere account password and the Django superuser password; yeah, I know that was wishful thinking).
After giving up on that method, I saw that the error mentions that the TESSDATA_PREFIX environment variable can be set. So I searched the web, found out how to create shell and environment variables, and did create them, but to no avail. When I open the console and type printenv
I can see
TESSDATA_PREFIX=/home/Orikle/.virtualenvs/myenv/bin/Tesseract-OCR
That led me to believe that I had really managed to make it work, but alas, I keep getting the same error as before.
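Since printenv only shows what the console session sees, a check from inside the Python process may be more telling. The following is a minimal sketch, not something from my actual project: it assumes, going by the error message, that Tesseract looks for the language file at TESSDATA_PREFIX/tessdata/heb.traineddata. The helper name expected_traineddata_path is hypothetical.

```python
import os


def expected_traineddata_path(lang, env=None):
    """Build the path where the language file should be, or None if the variable is unset.

    Assumption (taken from the error message): TESSDATA_PREFIX points at the
    parent directory of "tessdata", so the file is <prefix>/tessdata/<lang>.traineddata.
    """
    env = os.environ if env is None else env
    prefix = env.get("TESSDATA_PREFIX")
    if not prefix:
        return None
    return os.path.join(prefix, "tessdata", lang + ".traineddata")


if __name__ == "__main__":
    # Run this from the same process that calls pytesseract (e.g. a Django view
    # or the Django shell), not only from the Bash console.
    path = expected_traineddata_path("heb")
    if path is None:
        print("TESSDATA_PREFIX is not set in this process")
    elif os.path.isfile(path):
        print(path, "exists")
    else:
        print(path, "is missing")
```

If this reports that the variable is not set when run from the web app, the value exported in the console is probably not reaching the process that runs Django.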
Just to be clear, I tried the parent directory, the exact directory, and maybe every other directory out there. Any help would be appreciated. Thanks.