
I am trying to work out how to determine the font size of text in an image, using Tesseract OCR or some other tool from Python.

My current image is here: Image 1. I know for certain that the text at the top of the image is font size 6 and the text at the bottom is font size 7.

I am starting a side project that scans an image and checks whether its text meets a minimum legal font size requirement (which is font size 7).

How can I determine whether all text within the image is at least font size 7?

Here is what I'm thinking of doing:

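legal_font = 7
detected_font = 6  # placeholder: the smallest font size found in the image

# Text is illegal when the detected size falls below the legal minimum.
if detected_font < legal_font:
    print("Illegal")
else:
    print("Legal")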

The number 6 is the value that will vary, because there is a lot of text spread around the image.

Ammar
Oct2020
  • Possible duplicate of [get Font Size in Python with Tesseract and Pyocr](https://stackoverflow.com/questions/39324626/get-font-size-in-python-with-tesseract-and-pyocr) – Phoenix Oct 10 '19 at 07:43
  • I have looked into it, but according to the comments on the answer, it no longer works. – Oct2020 Oct 10 '19 at 07:52
  • OpenCV + TensorFlow maybe? – Arca Ege Cengiz Feb 23 '21 at 16:11
  • Is it for Windows 10, or some other operating system? – Bob Feb 23 '21 at 17:29
  • Yes @user12750353, it is for Windows 10. – Tony Stark Feb 23 '21 at 17:42
  • Could you attach a sample image? – Bob Feb 23 '21 at 17:45
  • Your image is blurry and has a low resolution, probably outside the range most OCR engines handle well. If it can be provided at a higher resolution, that would be better. Otherwise we can determine the font size without OCR by simply detecting bounding boxes. http://195.148.30.97/cgi-bin/ocr.py – Bob Feb 24 '21 at 10:24
  • Will the text be aligned to the image axes as in the example, or does it need to work with rotated text? – Bob Feb 24 '21 at 11:06

1 Answer


This seems like a valid answer to your question:

from PIL import ImageFont, ImageDraw, Image

def find_font_size(text, font, image, target_width_ratio):
    tested_font_size = 100
    tested_font = ImageFont.truetype(font, tested_font_size)
    observed_width, observed_height = get_text_size(text, image, tested_font)
    estimated_font_size = tested_font_size / (observed_width / image.width) * target_width_ratio
    return round(estimated_font_size)

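def get_text_size(text, image, font):
    # Measure on a scratch canvas so the source image is not modified.
    im = Image.new('RGB', (image.width, image.height))
    draw = ImageDraw.Draw(im)
    # textbbox replaces textsize, which was removed in Pillow 10.
    left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
    return right - left, bottom - top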

width_ratio = 0.5
font_family = "arial.ttf"
text = "Hello World"

image = Image.open('pp.png')
editable_image = ImageDraw.Draw(image)
font_size = find_font_size(text, font_family, image, width_ratio)
font = ImageFont.truetype(font_family, font_size)
print(f"Font size found = {font_size} - Target ratio = {width_ratio} - Measured ratio = {get_text_size(text, image, font)[0] / image.width}")

editable_image.text((10, 10), text, font=font)
image.save('output.png')

You do have to install Pillow (the maintained fork of PIL), for example with pip install Pillow. Comment here if you need help doing that.

You will also have to download the arial.ttf font file: https://github.com/JotJunior/PHP-Boleto-ZF2/blob/master/public/assets/fonts/arial.ttf?raw=true
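To measure text that is already in the image, one possible approach is to take the word bounding boxes that Tesseract reports and convert their pixel heights to points (1 pt = 1/72 inch). The following is only a rough sketch under some assumptions: the helper names `px_to_pt`, `undersized_words` and `scan_image` are made up for illustration, the scan resolution (`dpi`) must be known, and a word's box height only approximates the font size because it depends on which letters the word contains.

```python
def px_to_pt(height_px, dpi):
    """Convert a pixel height to points (1 pt = 1/72 inch)."""
    return height_px * 72.0 / dpi


def undersized_words(data, dpi, min_pt=7):
    """Return (word, estimated_pt) pairs whose estimated size is below min_pt.

    `data` is a dict with parallel lists under "text" and "height", as
    returned by pytesseract.image_to_data(..., output_type=Output.DICT).
    """
    flagged = []
    for word, height_px in zip(data["text"], data["height"]):
        if not word.strip():
            continue  # skip empty entries (block, paragraph and line rows)
        size_pt = px_to_pt(height_px, dpi)
        if size_pt < min_pt:
            flagged.append((word, round(size_pt, 1)))
    return flagged


def scan_image(path, dpi, min_pt=7):
    """Run Tesseract on the image at `path` and flag words below min_pt."""
    # Imported here so the helpers above work without Tesseract installed.
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(path), output_type=pytesseract.Output.DICT
    )
    return undersized_words(data, dpi, min_pt)
```

Calling `scan_image('pp.png', dpi=300)` would return the words whose estimated size is below 7 pt; an empty list would mean nothing was flagged. Words made only of short letters (such as "ace") have smaller boxes than the font's nominal size, so treat a flagged word as a candidate to check by hand rather than as proof.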

Mohit Reddy
  • Thanks for the code. I tried your code but this code just adds a new text "Hello World" to my existing image and shows its size and other information. I was expecting the code would rather show the text sizes already present in the image. Can you please clarify on how we can use this code to get the text size from an image? Thank you. – saurav.rox Jan 29 '22 at 16:38
  • I'll make an update for production; sorry for not informing you of this before. – Mohit Reddy Feb 04 '22 at 06:41