The paper 'Metric-based No-reference Quality Assessment of Heterogeneous Document Images' discusses measuring the quality of characters in a document image. I'm having difficulty understanding the white speckle metric on page 7.
Small white speckle measures how much fattened character strokes have shrunken existing white connected components inside characters, or have created new ones by connecting strokes of characters. A histogram of white connected components in a document image is computed, and we have already found the most frequent font size. Then the white speckle is computed by summing up the histogram bins between 1 pixel and 1% of font size squared. The sum is then normalized by dividing by the area under the histogram between 1 and font size squared.
My questions are:
- How is a histogram of white connected components in a document image computed?
- How is the white speckle computed by summing up the histogram bins between 1 pixel and 1% of font size squared? Let's say, for example, that the most frequent font size is 32. Do I then sum the frequencies from histogram bin 1 up to 1% of 32^2 = 1024, i.e. up to about bin 10? Is that right?
- Honestly, I don't see how summing up the histogram bins between 1 pixel and 1% of font size squared relates to the small white speckle measure. Can you help me see the relation?
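To make my current reading concrete, here is a minimal sketch of how I interpret the passage. It is only my guess, not the paper's implementation. The assumptions are mine: the histogram is indexed by the area (in pixels) of each white connected component, components are 4-connected, the font size is given in pixels, and "area under the histogram" means the sum of the bin counts. The function names `white_component_areas` and `small_white_speckle` are hypothetical.

```python
from collections import Counter, deque


def white_component_areas(img):
    """Return the pixel area of every white connected component.

    img is a list of rows with 1 = white (background) and 0 = black (ink).
    Assumption: 4-connectivity; the paper may use a different rule.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    areas = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over one white component.
            seen[y][x] = True
            queue = deque([(y, x)])
            area = 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def small_white_speckle(img, font_size):
    """Fraction of white components whose area is at most 1% of font_size^2.

    Assumption: histogram bins are component areas in pixels, and the
    normalizer is the total count of components with area in [1, font_size^2].
    """
    hist = Counter(white_component_areas(img))  # area -> number of components
    upper = font_size ** 2
    small_limit = 0.01 * upper
    small = sum(count for area, count in hist.items() if 1 <= area <= small_limit)
    total = sum(count for area, count in hist.items() if 1 <= area <= upper)
    return small / total if total else 0.0
```

Under this reading, a font size of 32 would mean summing the bins for areas 1 through 10 (1% of 1024 is 10.24) and dividing by the sum of the bins for areas 1 through 1024. Is this the computation the paper intends?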
Thanks.