Using the Document AI processor to extract text from PDFs (English, German, Italian) works quite well, but the OCR sometimes misreads characters. This happens especially when the "word" is not a dictionary word, for example part numbers that mix letters and digits (mostly O, 0, L, 1, 5 and S). Is there a way to tell Document AI to use the text that is already embedded in the PDF (as text)? To my knowledge, Document AI uses an image of each PDF page and runs OCR on it.
Are there any flags to make Document AI use the embedded text instead, or any other ideas? I need to use Document AI because I want the structure of the text to be extracted correctly.
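
To illustrate the kind of mismatch I mean, here is a minimal Python sketch. It does not call Document AI. It only checks whether an OCR result and the text embedded in the PDF differ solely by the look-alike characters listed above. The pairing (O with 0, L with 1, S with 5), the helper names and the sample part number are my own assumptions for illustration.

```python
# Assumed look-alike pairs: O/0, L/1, S/5 (my reading of the characters above).
CONFUSABLE = str.maketrans({"O": "0", "L": "1", "S": "5"})


def canonical(token: str) -> str:
    """Map each look-alike letter to its digit so both variants compare equal."""
    return token.translate(CONFUSABLE)


def is_ocr_confusion(ocr_token: str, pdf_token: str) -> bool:
    """Return True if the tokens differ, but only by look-alike characters."""
    return ocr_token != pdf_token and canonical(ocr_token) == canonical(pdf_token)


# Hypothetical part number: the OCR output versus the text stored in the PDF.
print(is_ocr_confusion("A5O1-L0S", "A501-105"))
```

A check like this would only flag suspicious tokens after the fact. What I am really looking for is a way to avoid the OCR step for text the PDF already contains.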