I'm developing a handwriting recognition project. One of its requirements is to take an image as input. The image contains only some characters at random locations, and I must first extract these characters so they can be processed in the next step.

I'm stuck on a hard problem: how do I extract a single character from a black-and-white (binary) image, or how do I draw a bounding rectangle around a character in such an image?

Thanks very much!

Roham Rafii
nguyen

3 Answers

2

If you are using MATLAB (which I hope you are, since it is awesome for tasks like these), I suggest you look into the built-in functions bwlabel() and regionprops(). These should be enough to segment out all the characters and get their bounding box information.

Some sample code is given below:

%Read image
Im = imread('im1.jpg');

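%Make binary: dark (text) pixels become 1, light background becomes 0.
%Assumes a grayscale image; convert an RGB image with rgb2gray first.
Im = Im < 128;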

%Segment out all connected regions
ImL = bwlabel(Im); 

%Get labels for all distinct regions
labels = unique(ImL);

%Remove label 0, corresponding to background
labels(labels==0) = [];

%Get bounding box for each segmentation
Character = struct('BoundingBox',zeros(1,4));
nrValidDetections = 0;
for i=1:length(labels)
    D = regionprops(ImL==labels(i));
    if D.Area > 10
        nrValidDetections = nrValidDetections + 1;
        Character(nrValidDetections).BoundingBox = D.BoundingBox;
    end
end


%Visualize results
figure(1);
imagesc(ImL);
xlim([0 200]);
for i=1:nrValidDetections
    rectangle('Position',[Character(i).BoundingBox(1) ...
                          Character(i).BoundingBox(2) ...
                          Character(i).BoundingBox(3) ...
                          Character(i).BoundingBox(4)]);

end

The image I read in here has pixel values from 0 to 255, so I have to threshold it to make it binary. Since the dots above i and j can be a problem, I also threshold on the number of pixels that make up each distinct region and keep only regions with an area greater than 10 pixels.

The result can be seen here: https://www.sugarsync.com/pf/D775999_6750989_128710

Vidar
1

In my case, the best way to extract the characters was histogram-based segmentation. I can only share some papers with you:

http://cut.by/j7LE8

http://cut.by/PWJf1

Maybe these can help you.
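The linked papers are not summarized here, so the following is only a guess at what histogram segmentation could look like: a minimal Python sketch that assumes it means a vertical projection profile (count the text pixels in each column, then cut wherever the count drops to zero). The function name `split_columns`, the `min_ink` parameter, and the 1-for-text convention are hypothetical choices for illustration, not taken from the papers.

```python
def split_columns(image, min_ink=1):
    """Split a binary image into column ranges separated by blank columns.

    `image` is a list of rows; 1 marks a text pixel, 0 marks background
    (an assumed convention). Returns the column histogram and a list of
    (start, end) column ranges, with `end` exclusive.
    """
    height, width = len(image), len(image[0])
    # Histogram: number of text pixels in each column.
    profile = [sum(image[r][c] for r in range(height)) for c in range(width)]

    segments, start = [], None
    for c, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = c                      # a character begins here
        elif ink < min_ink and start is not None:
            segments.append((start, c))    # a blank column ends it
            start = None
    if start is not None:                  # character touches the right edge
        segments.append((start, width))
    return profile, segments


image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1],
]
profile, segments = split_columns(image)
print(profile)   # [0, 3, 2, 0, 0, 3, 1]
print(segments)  # [(1, 3), (5, 7)]
```

Running the same idea over the rows of each segment would give the top and bottom of each character. Characters that touch or overlap horizontally would not be separated by this approach.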

0

One simple option is to use an exhaustive search, like (assuming text is black and background is white):

  1. Starting from the leftmost column, step through all the rows checking for a black pixel.
  2. When you encounter your first black pixel, save your current column index as left.
  3. Continue traversing the columns until you encounter a column with no black pixels in it, and save this column index as right.
  4. Now traverse the rows in a similar fashion, starting from the topmost row and stepping through each column in that row.
  5. When you encounter your first black pixel, save your current row index as top.
  6. Continue traversing the rows until you find one with no black pixels in it, and save this row as bottom.

Your character will be contained within the box defined by (left - 1, top - 1) as the top-left corner and (right, bottom) as the bottom-right corner.
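The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the image is a list of rows in which 1 marks a black (text) pixel and 0 marks white, and the names `bounding_box` and `first_run` are hypothetical helpers, not part of any library.

```python
def bounding_box(image):
    """Return (left, top, right, bottom) for the first run of ink, or None.

    `left`/`top` are the first column/row that contain a text pixel;
    `right`/`bottom` are the first column/row after them with no text pixel.
    """
    height, width = len(image), len(image[0])

    def first_run(has_ink, count):
        # Scan indices 0..count-1 and return (first inked index,
        # first blank index after it).
        start = None
        for i in range(count):
            if has_ink(i):
                if start is None:
                    start = i
            elif start is not None:
                return start, i
        return start, count  # ink reaches the edge, or no ink at all

    left, right = first_run(
        lambda c: any(image[r][c] for r in range(height)), width)
    top, bottom = first_run(lambda r: any(image[r]), height)
    if left is None:
        return None  # the image contains no text pixels
    return left, top, right, bottom


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(bounding_box(image))  # (2, 1, 4, 4)
```

With these values the box described above runs from (left - 1, top - 1) = (1, 0) to (right, bottom) = (4, 4), which leaves a one-pixel margin around the character. As written, the scan finds only the first run of inked columns and rows, so it suits an image that holds a single character.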

aroth
  • It's a great idea for an image that contains only one character, but I think it won't work correctly for an image that contains more than one character at different locations. – nguyen Aug 07 '11 at 13:57
  • Well yes, your post only mentioned one character. If you need to locate multiple characters, see [this post](http://codethink.no-ip.org/wordpress/archives/150) for one possible solution. – aroth Aug 07 '11 at 22:04
  • Sorry aroth, but your solution is very helpful for me in another task. Thanks aroth :) – nguyen Aug 08 '11 at 12:00