easyocr UnicodeEncodeError: 'ascii' codec can't encode character

Question

I wrote a script that uses easyocr to extract text from images, and I exposed the script through an API. When I make an API request to it, an error is raised while the easyocr instance is being created.

The code that raises the error:

import easyocr
ocr = easyocr.Reader(['en'])

Error

Traceback (most recent call last):
File "./gift_card_ocr/gift_card_ocr_controller.py", line 52, in gift_card_ocr_master
  detector = OCR()
File "./gift_card_ocr/gift_card_ocr.py", line 84, in __init__
  self.reader = easyocr.Reader(['en'])
File "/opt/miniconda3/envs/lambda_api/lib/python3.8/site-packages/easyocr/easyocr.py", line 90, in __init__
  download_and_unzip(detection_models[detector_model]['url'], detection_models[detector_model]['filename'], self.model_storage_directory, verbose)
File "/opt/miniconda3/envs/lambda_api/lib/python3.8/site-packages/easyocr/utils.py", line 586, in download_and_unzip
  urlretrieve(url, zip_path, reporthook=reporthook)
File "/opt/miniconda3/envs/lambda_api/lib/python3.8/urllib/request.py", line 283, in urlretrieve
  reporthook(blocknum, bs, size)
File "/opt/miniconda3/envs/lambda_api/lib/python3.8/site-packages/easyocr/utils.py", line 686, in progress_hook
  print(f'\r{prefix} |{bar}| {percent}% {suffix}', end='')
UnicodeEncodeError: 'ascii' codec can't encode character '\u2588' in position 12: ordinal not in range(128)

But when I ran the same script without an API request, in the same environment, it worked fine.

I researched the issue and tried

$ export PYTHONIOENCODING=utf8

but it didn't work.
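One possible reason the `export` had no effect: a variable exported in an interactive shell is only inherited by processes started from that shell, so a Flask process launched by a service manager would not see it. A minimal diagnostic sketch, to be called from inside the request handler, is shown here. The function name `describe_io_encoding` is hypothetical and only for illustration; everything it calls is from the standard library.

```python
import locale
import os
import sys


def describe_io_encoding(stream=None):
    """Collect the settings that decide how print() encodes text."""
    stream = sys.stdout if stream is None else stream
    return {
        # Encoding and error handler of the stream print() writes to.
        "stream_encoding": getattr(stream, "encoding", None),
        "stream_errors": getattr(stream, "errors", None),
        # Locale-derived default encoding for text streams.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Environment variables as seen by THIS process, not by your shell.
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
    }


if __name__ == "__main__":
    for key, value in describe_io_encoding().items():
        # ascii() keeps the report itself safe on an ASCII-only stream.
        sys.stderr.write(f"{key} = {ascii(value)}\n")
```

If the report from inside the web app shows `stream_encoding` as an ASCII variant and `PYTHONIOENCODING` as `None`, while the standalone script shows UTF-8, that would explain why the same code behaves differently in the two settings.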

FYI
For the API I used Flask with nginx. I don't think the error has anything to do with these. Also, I am using easyocr version 1.5.0 and Python 3.8.

`ascii` refers to the 7-bit US-ASCII character set and can't even encode all English words. The error says so explicitly. Python 3 strings are Unicode already. `export PYTHONIOENCODING=utf8` shouldn't be needed — Panagiotis Kanavos, Aug 09 '22 at 13:55
The call stack shows that the error occurred as EasyOCR tried to download and unpack *its own models*. Have you tried creating a simple two-line script to run the same code? — Panagiotis Kanavos, Aug 09 '22 at 13:59
@PanagiotisKanavos yes, I tried running that two-line code separately in the same environment and it didn't give that error — Darkstar Dream, Aug 09 '22 at 14:01
The call stack starts at `detector = OCR()`. What does *that* do? As for the environment, are you sure something else isn't hard-coding `ascii` somewhere? Some other environment variable or Flask setting? If the code runs in a script but not a web app, there's definitely a difference in the environment. Running inside a web app is a big difference already — Panagiotis Kanavos, Aug 09 '22 at 14:09
Try creating a minimal application and start adding code little by little until you can reproduce the problem — Panagiotis Kanavos, Aug 09 '22 at 14:12
BTW without looking at the EasyOCR code, just `easyocr/utils.py", line 686, in progress_hook` and `print(f'\r{prefix} |{bar}| {percent}% {suffix}', end='')` followed by `UnicodeEncodeError: 'ascii' codec can't encode character '\u2588'` look like an attempt to print a progress bar using block characters to the terminal. — Panagiotis Kanavos, Aug 09 '22 at 14:14
I created a class for handling images that also loads `easyocr`, and that class is instantiated by `detector = OCR()`. As for the environment, I don't know how it is being manipulated internally. It is using nginx; could nginx affect easyocr? — Darkstar Dream, Aug 09 '22 at 14:15
Are you running on Windows or Linux? `print` *is* affected by `PYTHONIOENCODING`. On Windows you need to use `SET` instead of `export` to set environment variables. On the other hand, printing to the terminal from a web app is useless and causes delays — Panagiotis Kanavos, Aug 09 '22 at 14:20
If you check the [download_and_unzip](https://github.com/JaidedAI/EasyOCR/blob/ea2db54101f83bc96637382228611120654f1dbd/easyocr/utils.py#L583) signature, printing is enabled only when `verbose` is true - which is the default. You need to *disable* verbose logging to both avoid the error and improve performance — Panagiotis Kanavos, Aug 09 '22 at 14:22
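The failure mode described in these comments can be reproduced without EasyOCR. The sketch below uses only the standard library: it builds an ASCII-encoded text stream as a stand-in for the web app's stdout, writes a block-character progress bar to it, and then shows that `TextIOWrapper.reconfigure` (Python 3.7+) lets the same write succeed. The helper names are hypothetical, and the bar format only approximates the one in the traceback.

```python
import io


def make_ascii_stream():
    """Build a text stream that behaves like stdout under an ASCII locale."""
    return io.TextIOWrapper(io.BytesIO(), encoding="ascii")


def write_progress_bar(stream, filled=3, width=10):
    """Write a bar made of U+2588 FULL BLOCK characters, like a download bar."""
    bar = "\u2588" * filled + "-" * (width - filled)
    stream.write(f"\rProgress: |{bar}|")
    stream.flush()


def fails_on_ascii():
    """Return True if writing the bar to an ASCII stream raises the error."""
    stream = make_ascii_stream()
    try:
        write_progress_bar(stream)
    except UnicodeEncodeError:
        return True
    return False


def succeeds_after_reconfigure():
    """Switch the stream to UTF-8, write the bar, and return what was written."""
    stream = make_ascii_stream()
    # errors="replace" is a safety net in case the encoding is changed again.
    stream.reconfigure(encoding="utf-8", errors="replace")
    write_progress_bar(stream)
    return stream.buffer.getvalue().decode("utf-8")
```

In the application, the suggestion from the last comment would presumably look like `easyocr.Reader(['en'], verbose=False)`. The traceback shows `Reader.__init__` passing `verbose` on to `download_and_unzip`, but check the `Reader` signature of your installed version before relying on the keyword. If the progress output should be kept, calling `sys.stdout.reconfigure(encoding="utf-8", errors="replace")` before creating the reader is an alternative, assuming `sys.stdout` in the server process is a regular `TextIOWrapper`.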

