I have an image file which Python reads and converts to hexadecimal. The problem is that even if I give it a blank image, it still outputs hexadecimal numbers. I need Python to process only the letters in the image, convert them to hexadecimal, and give that as output.

Here is the program I tried:

import binascii

filename = 'a.png'
with open(filename, 'rb') as f:
    content = f.read()

print(binascii.hexlify(content))
    Your program will give you hex codes of the image file. If you see an image file that is 100000 bytes in size, you will get 200000 hexadecimal digits (two per byte). It has nothing to do with what is shown on the image. The only way you would get no output is if the file was empty (0-length), and such a file cannot be said to be an image file. On the other hand if you want to _read_ the letters shown on the image, you need to use an OCR library (or code up an OCR from a machine learning library), and `binascii.hexlify` is an entirely wrong tool for the job. – Amadan Aug 13 '18 at 10:54
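To illustrate the comment's point that the output depends only on the file's bytes, here is a minimal sketch. The `content` value is an assumption standing in for the file contents (it is the 8-byte signature that begins every PNG file), so no image file is needed to run it.

```python
import binascii

# Stand-in for f.read(): the 8-byte signature at the start of every PNG file.
content = b"\x89PNG\r\n\x1a\n"

# hexlify encodes the raw bytes, regardless of what the image shows.
hex_digits = binascii.hexlify(content)

print(hex_digits)
print(len(content), len(hex_digits))  # two hexadecimal digits per byte
```

Even a completely blank PNG starts with these bytes, which is why a blank image still produces hexadecimal output.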

1 Answer

This is an OCR (Optical Character Recognition) problem, which has been discussed many times on Stack Overflow.

pytesseract, a Python wrapper for the Tesseract OCR engine, does this with ease.

Usage:

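import pytesseract
from PIL import Image

filename = 'a.png'  # same image file as in the question

# Get the text in the image (needs the Tesseract OCR engine installed)
text = pytesseract.image_to_string(Image.open(filename))

# Convert the string to hexadecimal (Python 3; str.encode("hex") only works in Python 2)
hex_text = text.encode("utf-8").hex()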
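The hexadecimal conversion step can be tried on its own, without OCR. Below is a minimal sketch for Python 3, where the string is first encoded to bytes and then turned into hex digits. The value of `text` is a hypothetical stand-in for the OCR result, and UTF-8 is an assumed encoding.

```python
# Hypothetical stand-in for the string returned by the OCR step.
text = "Hello"

# Encode the string to bytes (UTF-8 assumed), then to hexadecimal digits.
hex_text = text.encode("utf-8").hex()
print(hex_text)  # 48656c6c6f

# The conversion is reversible: decode the hex digits back to the string.
restored = bytes.fromhex(hex_text).decode("utf-8")
print(restored)  # Hello
```

Note that in Python 3, `text.encode("hex")` raises a `LookupError`, because `"hex"` is no longer accepted as a text encoding.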