4

I'm trying to find effective binarization techniques for document images. I've currently implemented the Niblack and Sauvola thresholding algorithms, and I've tried binarization based on histogram evaluation as well. Could someone please suggest other binarization methods that have proved to be effective? Here's a sample degraded image I've been working with:

http://spie.org/Images/Graphics/Newsroom/Imported/0681/0681_fig1.jpg

Any suggestions will be much appreciated.

Maurits
NeedHelp
  • Welcome to Stack Overflow. While image binarization is an interesting topic, your question is not a good fit for SO. If you have a particular problem with binarization, you can ask a question on http://dsp.stackexchange.com/. If you have a problem with the implementation of binarization, feel free to ask another question on SO. – Simon Bergot Mar 30 '12 at 12:03
  • Again, Niblack would work (http://imgur.com/pR1iN). You do not need to implement hundreds of algorithms; just understand how they work together and how to adapt the parameters. In your case(s), you should look at local thresholding, and possibly do some preprocessing with respect to color and contrast. – Birgit P. Mar 30 '12 at 13:54
  • Thanks for all your help, @BirgitP. I'm trying to apply several algorithms to document images to which I artificially add noise, so that I can evaluate which method is best by comparing the result with the original image. That's why I'm asking which other methods suit the purpose. Could you please suggest some? – NeedHelp Mar 30 '12 at 21:35

1 Answer

10

How about starting with simply adapting the threshold based on the local neighborhood?

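% Convert to grayscale and scale intensities to [0, 1].
im = rgb2gray(im);
im = im2double(im);
% Threshold each block at its median divided by 1.45.
f_makebw = @(I) im2bw(I.data, double(median(I.data(:)))/1.45);
% Process 128x128 blocks; ~ inverts the result so dark text becomes 1.
bw = ~blockproc(im, [128 128], f_makebw);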

Result:

[image: binarized result]
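
In case `blockproc` is not available (it is part of the Image Processing Toolbox), the same idea can be written with plain loops. The following is only a minimal sketch, not the code that produced the result above. The function name `block_median_bw` and its parameters `blockSize` and `bias` are hypothetical names chosen for illustration, and the sketch assumes `im` is already a grayscale double image with values in [0, 1].

```matlab
function bw = block_median_bw(im, blockSize, bias)
    % Block-wise median thresholding using only core MATLAB functions.
    % im        : grayscale image, double, values in [0, 1] (assumed)
    % blockSize : side length of the square blocks, e.g. 128
    % bias      : divisor applied to the block median, e.g. 1.45
    [rows, cols] = size(im);
    bw = false(rows, cols);
    for r = 1:blockSize:rows
        for c = 1:blockSize:cols
            % Clip the block at the image border.
            rEnd = min(r + blockSize - 1, rows);
            cEnd = min(c + blockSize - 1, cols);
            block = im(r:rEnd, c:cEnd);
            % Local threshold: block median scaled down by the bias.
            t = median(block(:)) / bias;
            % Mark pixels at or below the threshold (dark text) as true,
            % which corresponds to ~im2bw(block, t).
            bw(r:rEnd, c:cEnd) = block <= t;
        end
    end
end
```

Saved as `block_median_bw.m`, it could be called as `bw = block_median_bw(im, 128, 1.45);`. Both the block size and the bias are worth tuning per image.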

Maurits
  • Could you please explain what this statement does? `f_makebw = @(I) im2bw(I.data, double(median(I.data(:)))/1.45);` – NeedHelp Apr 02 '12 at 17:57
  • @NeedHelp, it binarizes each region (here 128x128) based on the median grayscale value of that region. The division by 1.45 biases the threshold, placing it slightly below the median. – Maurits Apr 02 '12 at 18:41
  • I'm getting an error saying that `~blocproc` is an undefined function. Could you please tell me how to correct it? – NeedHelp Apr 03 '12 at 20:08
  • In my example, `blockproc` calls the anonymous function `f_makebw`, which is defined on the line above it. You most likely have a typo somewhere. – Maurits Apr 03 '12 at 20:28
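
To make the explanation of the threshold concrete, here is a small worked example on a single made-up block. The pixel values are invented for illustration, and the variable names `block`, `m`, `t` and `isText` are not part of the answer's code. It only uses core MATLAB, so it should run without `blockproc`.

```matlab
% A tiny 2x3 "block" of grayscale values in [0, 1] (made-up numbers):
% mostly bright paper with two dark text pixels.
block = [0.80 0.82 0.30;
         0.78 0.25 0.81];

% Median of the six values: (0.78 + 0.80) / 2 = 0.79.
m = median(block(:));

% Biased threshold: 0.79 / 1.45, roughly 0.545, i.e. below the median.
t = m / 1.45;

% Pixels at or below the threshold count as text; this corresponds
% to ~im2bw(block, t) in the answer's code.
isText = block <= t;
```

Here only the two dark pixels (0.30 and 0.25) end up marked as text. Without the division by 1.45, every pixel at or below the median would be marked, which is half of the block.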