1

I tried morphology in ImageMagick, but the operation also alters the text, which makes the output unsuitable for OCR. Is there a faster way to remove the lines from the image without affecting the text?

Input image:

ImageMagick code:

magick 1sa.jpg -morphology close:1 "1x4: 0,1,1,0" result.png            

Output image

Edit: Thanks to all those who replied. I finally got it working with the following command:

magick E:\1sa.jpg ( +clone -threshold 50% -negate -statistic median 219x1 ) -compose lighten -composite E:\z1.jpg
fmw42
  • 46,825
  • 10
  • 62
  • 80
  • Please read [ask]. Show images, code, actual results, expected results – Miki Oct 21 '17 at 07:52
  • _"So is there any faster way ..."_ I'd be more concerned with a _working_ way... Clearly morph operations are not suited for this case – Miki Oct 21 '17 at 08:27
  • Not near a computer, but does this help? https://stackoverflow.com/a/41633319/2836621 – Mark Setchell Oct 21 '17 at 08:39

2 Answers

2

Your ImageMagick command is erroneous and should not even work. The kernel needs to be a horizontal line, not a vertical one, and it needs to be longer. Try the following:

magick 1sa.jpg -morphology bottomhat "20x1:0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0" -negate result.png  


Adjust the kernel length as needed to optimize your result.
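
Since the kernel is just a row of 1s with a 0 at each end, typing it out by hand gets tedious when trying several lengths. A minimal sketch of a shell helper that builds the string follows; `make_line_kernel` is a hypothetical name, not an ImageMagick feature, and it assumes the same 0-padded layout as the kernel above.

```shell
# Hypothetical helper: print a WIDTHx1 kernel string, e.g. "6x1:0,1,1,1,1,0".
make_line_kernel() {
  width=$1
  # Need at least one 1 between the two 0 end caps.
  if [ "$width" -lt 3 ]; then
    echo "width must be at least 3" >&2
    return 1
  fi
  kernel="${width}x1:0"
  i=2
  # Append one "1" for every position between the two end caps.
  while [ "$i" -lt "$width" ]; do
    kernel="${kernel},1"
    i=$((i + 1))
  done
  printf '%s\n' "${kernel},0"
}

# Example use (same file names as above; 30 is an arbitrary length to try):
# magick 1sa.jpg -morphology bottomhat "$(make_line_kernel 30)" -negate result.png
```

A longer kernel should leave more of the short horizontal strokes in the characters alone, but it may also miss short line segments, so it is worth trying a few values.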

fmw42
  • Thanks, it works :) But is there a way to fill the white pixels near the text that are left behind by the line removal, so as to improve OCR accuracy? – thekingmaker Oct 21 '17 at 21:20
  • Not that I know of. How would ImageMagick know that the part that was removed was actually part of a character and not part of the line? Aside: If my previous answer was of help, please consider giving it an up-vote – fmw42 Oct 21 '17 at 21:50
  • Sometimes you need to upvote questions from new StackOverflow users so they have enough points to upvote your answers... ;-) – Mark Setchell Oct 21 '17 at 22:13
0

Does this ImageMagick command give a better result?

convert 1sa.jpg -morphology bottomhat "20x3:0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0" -negate result.png  


fmw42
  • Nope, it doesn't work. What I mean is that I want an output image in which the OCR doesn't pick up the white pixels near the text (they are visible to the naked eye). And thanks for your help :) – thekingmaker Oct 21 '17 at 22:28
  • The background is white. How do I know what white pixels you mean? – fmw42 Oct 21 '17 at 23:21
  • Sorry for not explaining properly. I recolored those white pixels red: https://i.stack.imgur.com/umqer.jpg – thekingmaker Oct 22 '17 at 00:02
  • You could try -morphology open diamond:1 or square:1. But I suspect that will fill in other places that you do not want. Or create your own kernel shaped the way you want to catch those white pixels. Unfortunately, removing long horizontal lines that intersect with your characters is going to remove parts of your characters since it does not know that the line went through your character. – fmw42 Oct 22 '17 at 00:24
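
One way to compare the two built-in kernels suggested in the last comment is to run `-morphology open` with each and inspect the outputs side by side. A rough sketch, assuming the line-removed image is `result.png`: the function name, the output file names and the `MAGICK` override (handy for a dry run with `MAGICK=echo`) are illustrative assumptions, not ImageMagick features.

```shell
# Hypothetical helper: apply "-morphology open" with each suggested kernel
# and write one output file per kernel for visual comparison.
try_open_kernels() {
  input=$1
  for kernel in diamond:1 square:1; do
    # Turn "diamond:1" into a file-name-safe suffix such as "diamond_1".
    suffix=$(printf '%s' "$kernel" | tr ':' '_')
    # MAGICK defaults to the real "magick" binary; set MAGICK=echo to only
    # print the arguments instead of running ImageMagick.
    "${MAGICK:-magick}" "$input" -morphology open "$kernel" "open_${suffix}.png"
  done
}

# Example use:
# try_open_kernels result.png
```

As the comment warns, either kernel may also fill in places that should stay white, so the results need to be checked by eye or against the OCR output.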