I have a string base64 image that need to convert so then I can read it as image to analyze with pytesseract:
import base64
import io
from PIL import Image
import pytesseract
import sys
base64_string = "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAUDBAQEAwUEBAQFBQUGBwwIBwcHBw8LCwkMEQ8SEhEPERETFh....."
img_data = base64.b64decode(base64_string)
img = Image.open(io.BytesIO(img_data)) # <== ERROR LINE
text = pytesseract.image_to_string(img, config='--psm 6')
print(text)
gives the error:
Traceback (most recent call last):
File "D:\aa\xampp\htdocs\xbanca\aa.py", line 14, in <module>
img = Image.open(io.BytesIO(img_data))
File "D:\python3.10.10\lib\site-packages\PIL\Image.py", line 3283, in open
raise UnidentifiedImageError(msg)
PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x000001A076F673D0>
I tried using numpy and request libraries but all have the same result.. and the base64 example image is working ok in any another converter.