
I'm reading a paper and having difficulty understanding the algorithm it describes:

Given a black and white digital image of a handwriting sample, cut out a single character to analyze. Since this can be any size, the algorithm needs to take this into account (if it will be easier, we can assume the size is 2^n x 2^m).

The description states that, given this image, we convert it to a 512-bit feature (a 512-bit hash) as follows:

  1. (192 bits) Computes the gradient of the image by convolving it with a 3x3 Sobel operator. The direction of the gradient at every edge is quantized to 12 directions.

  2. (192 bits) The structural feature generator takes the gradient map and looks in a neighborhood for certain combinations of gradient values. These are used to compute 8 distinct features that represent lines and corners in the image.

  3. (128 bits) The concavity generator uses an 8-point star operator to find coarse concavities in 4 directions, holes, and large-scale strokes.

The image feature maps are normalized with a 4x4 grid.

For now, I'm struggling with how to take an arbitrary image, split it into 16 sections, and use a 3x3 Sobel operator to come up with 12 bits for each section. (But if you have some insight into the other parts, feel free to comment :)

Jon Seigel
pithyless
  • I think you have to specify your question more. Is the problem splitting the image, or doing the sobel kernel convolution? – Hannes Ovrén Aug 06 '09 at 06:43
  • Are you asking us to help you understand a paper on handwriting recognition or are you asking us to do your homework for you? You see, normally papers include the solution so there is no need to struggle searching for that. – jilles de wit Aug 10 '09 at 11:50
  • Srihari et al. (2002). As mentioned by others, the implementation details were sparse. It has been a long time since I posted this question. I may dig it up and have another stab at it. – pithyless Jun 29 '10 at 11:09

2 Answers


I'm struggling with the same paper by Srihari et al. (2002) for my Ph.D. thesis. The text is not very specific, but the authors refer to a technical report (CEDAR-TR-01-1) for more details. This report does not seem to be accessible on the internet, so my suggestion is to contact the authors by e-mail and ask for it. If something is still unclear, you could ask them for clarification as well.

user201626

I see the question is very old, but maybe this can help someone. You apply Sobel operators for horizontal and vertical edge detection to the image. From the two results you can calculate a gradient direction vector for every pixel of the image. In your case, you need to map these vectors to 12 directions. Then you divide the image into 4x4 subimages (16 sections) and calculate the intensity of each direction in each section. This gives you 12*16=192 features. I can give a more detailed explanation if needed.
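
The steps above can be sketched in plain Python. This is only a minimal illustration under my own assumptions, not the paper's implementation: the image is a list of rows with 1 for ink and 0 for background, pixels outside the image count as 0, "intensity" is taken to be a simple count of pixels per direction, and the names `sobel_gradient`, `gradient_features`, `binarize` and the `threshold` parameter are hypothetical. The thread does not say how the paper turns each value into one bit, so the thresholding step in particular is a guess.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def sobel_gradient(img, r, c):
    """Return (gx, gy) at pixel (r, c); pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    gx = gy = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                v = img[rr][cc]
                gx += SOBEL_X[dr + 1][dc + 1] * v
                gy += SOBEL_Y[dr + 1][dc + 1] * v
    return gx, gy

def gradient_features(img, grid=4, n_dirs=12):
    """Count, per grid cell, how many pixels fall into each gradient direction."""
    h, w = len(img), len(img[0])
    counts = [[0] * n_dirs for _ in range(grid * grid)]
    for r in range(h):
        for c in range(w):
            gx, gy = sobel_gradient(img, r, c)
            if gx == 0 and gy == 0:
                continue  # flat region: no edge, no direction
            # Quantize the gradient angle into n_dirs equal sectors of [0, 2*pi).
            angle = math.atan2(gy, gx) % (2 * math.pi)
            d = int(angle / (2 * math.pi / n_dirs)) % n_dirs
            # Proportional mapping, so the image size need not be a multiple of grid.
            cell = (r * grid // h) * grid + (c * grid // w)
            counts[cell][d] += 1
    return counts

def binarize(counts, threshold):
    """Assumed rule: a bit is 1 when a direction count reaches the threshold."""
    return [1 if n >= threshold else 0 for cell in counts for n in cell]
```

For an 8x8 image whose right half is ink, `gradient_features(img)` returns 16 lists of 12 counts, and `binarize(counts, threshold=1)` flattens them into 192 bits. A relative threshold (for example, a fraction of the pixels in each cell) may be closer to what the paper means by normalizing with the 4x4 grid, but that is speculation on my part.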

Mika