
I'm a bachelor's student currently working on a final project in Optical Braille Recognition using a real-time camera. I've successfully converted the image to HSV and extracted only the value channel to prevent ambient light from affecting the image, then performed binary thresholding, Canny edge detection, erosion, and dilation to isolate the Braille dots from the camera image.

What I would like to ask is how to perform segmentation when the distance between the dots changes as the camera moves nearer to or further from the Braille writing?

Any assistance would be appreciated. Thank you.

anarchy99

3 Answers


To do this, you would detect some relative pair of coordinates that lets you estimate the "scale" of the Braille writing in your image. This could be an identifying pair of points at either end of the writing, or even just some characteristic dots. With the scale known, you can transform the image to a uniform size, regardless of how far away the camera is.
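As a rough sketch of this idea (the dot list, the reference spacing, and the function names are my own assumptions, not from the answer): estimate the scale from the median nearest-neighbour distance between detected dot centres, then derive a resize factor that normalizes the image to a known dot spacing.

```python
import math

def nearest_neighbor_distances(dots):
    """Distance from each dot centre to its closest neighbour."""
    dists = []
    for i, (x1, y1) in enumerate(dots):
        best = min(
            math.hypot(x2 - x1, y2 - y1)
            for j, (x2, y2) in enumerate(dots) if j != i
        )
        dists.append(best)
    return dists

def scale_factor(dots, reference_spacing=20.0):
    """Factor by which to resize the image so the typical dot
    spacing matches reference_spacing pixels.  The median is used
    so a few spurious dots do not skew the estimate."""
    dists = sorted(nearest_neighbor_distances(dots))
    median = dists[len(dists) // 2]
    return reference_spacing / median
```

The resulting factor would then be fed to an image resize (e.g. OpenCV's resize) before segmentation. Note this only corrects uniform scale, not the perspective skew raised in the comment below.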

Christian Stewart
  • I think the scale factor of each relative pair of Braille dot coordinates would be different, because whenever you hold a camera there is a possibility it is not perfectly parallel. I mean there is a chance of skew in the x, y, and z axes when holding the camera. – anarchy99 Feb 28 '13 at 17:43

There is no simple, general solution to your problem. If I cannot immediately see how these Braille letters are spaced out just by looking at the image, the problem will not be easily solved by a simple algorithm either.

[image: photographed Braille text]

Your best bet is to read literature on Braille text, talk with your prof, and have a blind person explain to you how they read Braille.

Other than that, you would have to find the baselines of the Braille text lines, see how they differ, and then run cvPerspectiveTransform to straighten out the image, so you can segment the dots without worrying about perspective.
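For the transform step, OpenCV would normally compute the matrix for you (getPerspectiveTransform from four corresponding points); the mapping it applies to each dot coordinate is just a 3x3 homography, which can be sketched in plain Python (the matrix layout and function name here are my own, for illustration):

```python
def apply_homography(H, points):
    """Map (x, y) points through a 3x3 homography matrix H
    (row-major nested lists), the same projective mapping a
    perspective transform applies to coordinates."""
    out = []
    for x, y in points:
        # homogeneous coordinates: divide by the projective term w
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        out.append(((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
                    (H[1][0] * x + H[1][1] * y + H[1][2]) / w))
    return out
```

Straightening the dot table this way means the segmentation that follows can assume a roughly regular grid.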

Boyko Perfanov
  • I'd think the angle of the lines could be detected, no? You could probably detect vertically and horizontally aligned dots and then perspective-transform based on those – Christian Stewart Feb 28 '13 at 18:43
  • Yes, but it's not a trivial job, especially seeing that the program may miss the basis by exactly 45 degrees and construct the wrong grid – Boyko Perfanov Feb 28 '13 at 18:48
  • If you can see the grid, though, there are very clear lines in either direction. 45° wouldn't happen, as those don't seem to ever line up perfectly. – Christian Stewart Feb 28 '13 at 18:50

This challenge is very similar to the issues I've encountered in my barcode system. My answer is a generalized description of the method I use.

I'd start by dividing the image into a grid, sized so that a single character cell fits within a single grid cell. This guarantees that any character fits within a 2x2 group of grid cells, no matter how the grid overlays the image.

Convert the image into dots, identifying each dot locally from a small area of pixels.
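One way to do that local identification (a crude stand-in; the answer does not specify a method, and in practice a blob detector on the thresholded image would serve) is connected-component labelling on the binary mask, taking each component's centroid as a dot:

```python
from collections import deque

def find_dots(mask):
    """Extract dot centres from a binary mask (list of rows of 0/1)
    via 4-connected component labelling, returning one centroid
    per connected blob of foreground pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    dots = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # breadth-first flood fill to collect this blob
                q, pixels = deque([(x, y)]), []
                seen[y][x] = True
                while q:
                    cx, cy = q.popleft()
                    pixels.append((cx, cy))
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                                   (cx, cy + 1), (cx, cy - 1)):
                        if 0 <= nx < w and 0 <= ny < h \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((nx, ny))
                dots.append((sum(p[0] for p in pixels) / len(pixels),
                             sum(p[1] for p in pixels) / len(pixels)))
    return dots
```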

Assign each dot a grid cell number. This should be something easy, like the x/y location divided by a 32-pixel cell size: cell = ((y / 32) * (width / 32)) + (x / 32).

Keep a count of dots per grid cell, and when all the dots are identified, sort the dot table by grid number and build an index by displacement into the table and number of elements.
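The cell-numbering and bucketing steps above could be sketched like this (the dict-of-lists representation is my own assumption; the answer's sorted table with a displacement index is the equivalent, cache-friendlier layout):

```python
from collections import defaultdict

def grid_index(dots, width, cell=32):
    """Bucket dot centres into grid cells, numbered row-major
    as ((y / cell) * (width / cell)) + (x / cell), matching the
    formula in the answer."""
    cols = width // cell
    buckets = defaultdict(list)
    for x, y in dots:
        buckets[(y // cell) * cols + (x // cell)].append((x, y))
    return buckets
```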

If the resolution varies, sample some cells with lots of dots to determine distance between cell pairs.

Look through the cells row by row, but examine each cell using a 2x2 cell group. This way, any dot in the cell being tested is guaranteed to be matched to a paired dot (if one exists). Because of the grid, dots only need to be matched to dots local to them, so while the image may have thousands of dots, each individual dot only needs to be tried against 1-10 others.

Pairing dots will create duplicates, which can either be prevented while matching or purged later.
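A minimal sketch of this 2x2-group matching, with duplicates purged via a set of ordered pairs (the bucket layout, the max_dist threshold, and the edge handling are my assumptions; a right-edge cell here would wrap into the next row, which a real implementation should guard against):

```python
import math

def pair_dots(buckets, cols, max_dist):
    """Pair each dot only with dots in its 2x2 cell group
    (this cell, right, below, below-right).  Because groups
    overlap, the same pair is found more than once; storing
    sorted tuples in a set purges the duplicates."""
    pairs = set()
    for num in buckets:
        group = []
        for off in (0, 1, cols, cols + 1):
            group.extend(buckets.get(num + off, []))
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                if math.hypot(a[0] - b[0], a[1] - b[1]) <= max_dist:
                    pairs.add(tuple(sorted((a, b))))
    return pairs
```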

At this point you would need to match the dots to Braille. Horizontal pairs of pairs and vertical pairs of pairs should let you start lining up the Braille text.

Once the lines are aligned, the dot table would be rotated into the detected text alignment. The pairs would be put into alignment, and then, from the position of each pair, unmatched dots could be added by matching the grid location of the pair against unpaired dots in the dot table.
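Rotating the dot table into the detected alignment might look like this (the angle detection itself is assumed to happen elsewhere, e.g. from the paired-dot baselines; the function and its parameters are my own for illustration):

```python
import math

def rotate_dots(dots, angle_deg, origin=(0.0, 0.0)):
    """Rotate dot coordinates by -angle_deg about origin, so that
    text lines detected at angle_deg become horizontal."""
    t = math.radians(-angle_deg)
    ox, oy = origin
    c, s = math.cos(t), math.sin(t)
    return [((x - ox) * c - (y - oy) * s + ox,
             (x - ox) * s + (y - oy) * c + oy)
            for x, y in dots]
```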

Fred F
  • I got some of your points, but I'm still confused about why the location must be divided by 32 pixels. Is that a standard distance between the dots of Braille writing in image processing? In your past barcode project, could a fixed distance (e.g. 32 pixels) successfully read the barcode whether it was near or far from the camera? – anarchy99 Mar 01 '13 at 06:01
  • The 32 pixels is just an arbitrary grid size. For the image displayed, the grid size would be more like 64. As a general rule, one grid cell should be able to hold one character; this way you can be sure one character will always fit within a 2x2 group of cells, yet there are not so many dots that pairing them up produces too many permutations. At 64 pixels, up to 9 dots could fit in one grid cell, so a max of 36 dots would need to be permuted. – Fred F Mar 01 '13 at 08:05
  • For Braille, the closest pairing of dots would determine the resolution. For my needs, distortion tolerance was as important as resolution variability, which restricted variability to a range where each pair had to fall within certain bounds. So at a base resolution of 320 dpi, I read a range of 240-400 dpi within a single image. Processing gets confusing when the spacing between 2 dots at high resolution is greater than the spacing between 3 dots at low resolution. – Fred F Mar 01 '13 at 08:19