
Is it possible to get font size from an image using pyocr or Tesseract? Below is my code.

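import io

import pyocr
import pyocr.builders
from PIL import Image

# req_image holds the raw image bytes; lang is the OCR language code
tools = pyocr.get_available_tools()
tool = tools[0]
txt = tool.image_to_string(
    Image.open(io.BytesIO(req_image)),
    lang=lang,
    builder=pyocr.builders.TextBuilder()
)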

Here I get the text from the image using image_to_string. My question is whether I can also get the font size (as a number) of that text.

Witcher

1 Answer


Using tesserocr, call Recognize on your image and then GetIterator to obtain a ResultIterator. Its WordFontAttributes method returns the font information you need, including the point size, for the word the iterator currently points at. Read the method's documentation for more info.

import io
import tesserocr
from PIL import Image

with tesserocr.PyTessBaseAPI() as api:
    image = Image.open(io.BytesIO(req_image))
    api.SetImage(image)
    api.Recognize()  # required to get result from the next line
    iterator = api.GetIterator()
    print(iterator.WordFontAttributes())  # attributes of the first word

Example output:

{'bold': False,
 'font_id': 283,
 'font_name': u'Times_New_Roman',
 'italic': False,
 'monospace': False,
 'pointsize': 9,
 'serif': True,
 'smallcaps': False,
 'underlined': False}
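
The snippet above only reports the first word. To get a size for the whole text, here is a minimal sketch that walks every word with tesserocr's iterate_level and picks the most frequent pointsize. The helper names dominant_pointsize and word_font_attributes are my own, not part of tesserocr. As far as I know, font attributes come from Tesseract's legacy engine, so WordFontAttributes may return None with LSTM-only models, and pointsize is derived from the image resolution, so treat the number as an estimate.

```python
import io
from collections import Counter


def dominant_pointsize(word_attrs):
    """Return the most frequent 'pointsize' among WordFontAttributes() dicts.

    Entries that are None or have no usable 'pointsize' are skipped;
    returns None when no size is available at all.
    """
    sizes = [a["pointsize"] for a in word_attrs if a and a.get("pointsize")]
    return Counter(sizes).most_common(1)[0][0] if sizes else None


def word_font_attributes(image_bytes, lang="eng"):
    """Yield (text, attrs) for every word recognized in the image bytes."""
    # Imported lazily so dominant_pointsize works without tesserocr installed.
    from PIL import Image
    from tesserocr import PyTessBaseAPI, RIL, iterate_level

    with PyTessBaseAPI(lang=lang) as api:
        api.SetImage(Image.open(io.BytesIO(image_bytes)))
        api.Recognize()  # must run before asking for the iterator
        for word in iterate_level(api.GetIterator(), RIL.WORD):
            # attrs may be None if no font information is available
            yield word.GetUTF8Text(RIL.WORD), word.WordFontAttributes()
```

With the question's req_image bytes, dominant_pointsize(attrs for _, attrs in word_font_attributes(req_image)) would give a single size for the page, while the per-word pairs let you spot headings or footnotes set in a different size.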
sirfz