
I'm working on a digital image processing (DIP) project. I want to count the total number of words on each sheet of paper using image processing.

The original image is:

Original image

I did some pre-processing and produced the image below:

Pre-processed image

My idea for counting the words on each sheet is to detect the digits inside the blobs.

How can I count the words in this image? Any guidance or ideas would be welcome.

Thanks.

2 Answers


Using the digits inside the blobs/circles is a good way to define the problem. I would recommend running a circle Hough transform restricted to circles of a certain radius, then counting the circles it detects. You'll have to work out what that radius is in pixels for your images, but this might be a good starting point. Good luck.

andrew

If all pages are somewhat cleanly separated with one definition per line, you could take a very simple approach and count the filled lines. First, detect the region containing the list (green box) so that irrelevant markings are ignored. The detection does not have to find the edge exactly, as long as the bounds are no bigger than the list.

Then look for horizontal rows of pixels with no marking on them, that is, rows containing no pixel darker than some threshold X. These are illustrated below with the pink horizontal lines. Lastly, count the filled lines (each contiguous run of rows that is not empty), and you have your number of definitions.

Illustration: list region in a green box, empty rows marked with pink horizontal lines
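The row-scanning idea can be sketched with NumPy alone. This is a minimal version under stated assumptions: the page is an 8-bit grayscale array with dark ink on a light background, and the list bounds are already known. The function name `count_filled_lines`, the `roi` tuple layout, and the default threshold of 128 (standing in for X) are illustrative choices.

```python
import numpy as np


def count_filled_lines(gray, dark_threshold=128, roi=None):
    """Count contiguous runs of pixel rows that contain at least one dark pixel.

    gray           -- 2-D uint8 array, dark ink on a light background (assumed)
    dark_threshold -- pixels below this value count as ink (the "X" threshold)
    roi            -- optional (top, bottom, left, right) bounds of the list
    """
    if roi is not None:
        top, bottom, left, right = roi
        gray = gray[top:bottom, left:right]

    # True for every pixel row that has some ink on it.
    row_has_ink = (gray < dark_threshold).any(axis=1)

    # A filled line starts wherever an inked row follows an empty row
    # (or sits at the very top of the region).
    padded = np.concatenate(([False], row_has_ink))
    starts = ~padded[:-1] & padded[1:]
    return int(starts.sum())


# Synthetic page: white background with three dark bands of text.
page = np.full((30, 20), 255, dtype=np.uint8)
page[2:5, 3:15] = 0
page[10:13, 3:15] = 0
page[20:22, 3:15] = 0
print(count_filled_lines(page))  # prints 3
```

On real scans a single dark speck would make a row count as filled, so requiring a minimum number of dark pixels per row, or denoising first, may be needed.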

Zerp