I'm using pytesseract to read text from images, but the text is rotated and a light source creates shadows. The code rotates the image half a degree at a time, expecting a match, but for some of the images I provide (all of them are quite similar) Tesseract doesn't output any text at all.
So far I've desaturated the image (color factor set to 0), sharpened it and cropped it. I don't know if that last step helps: sometimes I read the non-cropped image and sometimes the cropped one, so I'll leave the cropping in the code just in case. The code I'm currently using is this:
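from PIL import Image, ImageEnhance, ImageFilter
import pytesseract

# Path to the Tesseract executable (Windows install)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
image = Image.open("Image.png")
# A color factor of 0 removes all color, leaving a black-and-white image
converter = ImageEnhance.Color(image)
imagedesat = converter.enhance(0)
imagedesat = imagedesat.filter(ImageFilter.SHARPEN)
cropped = imagedesat.crop((0,80,211,186))
text = pytesseract.image_to_string(cropped, lang='eng')
cropped.save('Result.png')
# Keep rotating by half a degree until Tesseract returns some text
rotated = cropped.rotate(0)
while text == "":
    rotated = rotated.rotate(-0.5)
    text = pytesseract.image_to_string(rotated, lang='eng')
print(text)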
The output is this one:
For the last cleaning step, I've tried this method, but it either deletes parts of the text or merges it with shadows from the book.
I've also tried the top response to this question, but it fails with an "out of range" error, so I have been unable to replicate it:
from PIL import Image
from PIL import ImageFilter
im = Image.open(r'c:\temp\temp.png')
white = im.filter(ImageFilter.BLUR).filter(ImageFilter.MaxFilter(15))
grey = im.convert('L')
width,height = im.size
impix = im.load()
whitepix = white.load()
greypix = grey.load()
for y in range(height):
    for x in range(width):
        greypix[x,y] = min(255, max(255 * impix[x,y][0] / whitepix[x,y][0], 255 * impix[x,y][2] / whitepix[x,y][3], 255 * impix[x,y][4] / whitepix[x,y][5]))
When I replace all the variables accordingly, I get the "out of range" error on the last line.
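A possible cause, offered as a sketch rather than a confirmed diagnosis: a pixel tuple from an RGB image only has indices 0 to 2 (0 to 3 for RGBA), so indexing `[4]` or `[5]` as in the last line would raise "tuple index out of range". The helper below, `shadow_free_grey` (a hypothetical name, not taken from the linked answer), applies the same ratio to the three color channels only. It also uses integer division and guards against a zero divisor, which are my own assumptions about what the loop needs in Python 3.

```python
def shadow_free_grey(pixel, white_pixel):
    """Return a 0-255 grey level for one pixel.

    `pixel` is the original (R, G, B[, A]) tuple and `white_pixel` is the
    matching tuple from the blurred, max-filtered "local white" image.
    """
    ratios = []
    for channel in range(3):  # R, G, B only; any alpha channel is ignored
        # Assumption: clamp the divisor to 1 to avoid dividing by zero
        divisor = max(white_pixel[channel], 1)
        # Integer division keeps the result an int, as an 'L' image expects
        ratios.append(255 * pixel[channel] // divisor)
    # Take the brightest channel ratio and cap it at pure white
    return min(255, max(ratios))
```

Assuming the image is converted with `im = im.convert('RGB')` right after it is opened, so that every pixel is a tuple, the last line of the loop would become `greypix[x,y] = shadow_free_grey(impix[x,y], whitepix[x,y])`.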
Edit: The loop was not working because it always read the unrotated image, so I've fixed it. Despite that, it still outputs an incomprehensible block of text.
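For reference, here is a hedged sketch of a bounded version of that loop. It rests on two assumptions: that each call to `rotate` resamples the image, so rotating an already rotated copy lets the degradation accumulate, and that Tesseract may return only whitespace rather than a truly empty string. The name `find_readable_rotation` and its `step` and `max_steps` parameters are hypothetical, chosen only for illustration.

```python
def find_readable_rotation(image, ocr, step=-0.5, max_steps=40):
    """Try successive rotations of `image` until `ocr` returns non-blank text.

    `image` only needs a `rotate(angle)` method (e.g. a PIL image) and `ocr`
    is any callable mapping an image to a string.
    Returns (angle, text), or (None, "") if no rotation produced text.
    """
    for i in range(max_steps + 1):
        angle = i * step
        # Rotate the original each time so resampling does not accumulate
        candidate = image.rotate(angle) if angle else image
        # strip() treats whitespace-only output as "no text"
        text = ocr(candidate).strip()
        if text:
            return angle, text
    return None, ""
```

With the code above it could be called as `find_readable_rotation(cropped, lambda im: pytesseract.image_to_string(im, lang='eng'))`. The bound on the number of steps also prevents an endless loop when no angle yields text.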
Edit 2: I've included the code that gave the error.