I'm using Tesseract in a project that runs with docker-compose. I don't know how to restrict it to a single processor core directly from my Python file. I want to do this because Tesseract becomes slow and consumes too much CPU when several instances run in parallel.
I found many similar topics, but they only explain how to set OMP_THREAD_LIMIT on the command line. Here is how Tesseract is configured in my Python code:
__tesseract_config_without_dir = "--psm 3 --oem 1 --dpi 300"
TESSERACT_DATA = os.environ.get(
    "TESSDATA_PREFIX", "/usr/share/tesseract-ocr/4.00/tessdata/"
)
__tesseract_config = (
    __tesseract_config_without_dir
    + ' --tessdata-dir "{}"'.format(config.TESSERACT_DATA)
)
So I would like to add an option like 'OMP_THREAD_LIMIT=1' to my __tesseract_config, but I don't know how to write it. The Tesseract documentation only gives this information:
"ENVIRONMENT VARIABLES
OMP_THREAD_LIMIT
If the tesseract executable was built with multithreading support, it will normally use four CPU cores for the OCR process. While this can be faster for a single image, it gives bad performance if the host computer provides less than four CPU cores or if OCR is made for many images. Only a single CPU core is used with OMP_THREAD_LIMIT=1."
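
Since the documentation lists OMP_THREAD_LIMIT under environment variables rather than command-line options, one possible approach is to set it in the environment from Python instead of adding it to the config string. The following is a minimal sketch under that assumption. The helper names `single_core_env` and `value_seen_by_child` are hypothetical, and the check uses a child Python process instead of Tesseract so that it runs anywhere. It also assumes the OCR wrapper in use starts Tesseract as a child process that inherits the environment.

```python
import os
import subprocess
import sys


def single_core_env(threads=1):
    """Return a copy of the current environment with OMP_THREAD_LIMIT set.

    Hypothetical helper for illustration, not part of any library.
    """
    env = os.environ.copy()
    env["OMP_THREAD_LIMIT"] = str(threads)  # environment values must be strings
    return env


def value_seen_by_child(name, env):
    """Start a child Python process with `env` and return its value of `name`."""
    result = subprocess.run(
        [
            sys.executable,
            "-c",
            "import os, sys; print(os.environ.get(sys.argv[1], ''))",
            name,
        ],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Option 1: set it for the whole Python process, before the first OCR call.
# Child processes started afterwards inherit it by default.
os.environ["OMP_THREAD_LIMIT"] = "1"

# Option 2: pass it only to the Tesseract child process (file names are
# placeholders):
# subprocess.run(["tesseract", "page.png", "out", "--psm", "3"],
#                env=single_core_env())
```

Calling `value_seen_by_child("OMP_THREAD_LIMIT", single_core_env())` shows what a child process receives. Because the project runs with docker-compose, setting the variable under the service's `environment:` key would be another way to get the same effect without touching the Python code.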