4

I have this set of images that I want to de-noise in order to run OCR on:

unfiltered image

I am trying to read the 1973 from the image.

I have tried

import cv2
import numpy as np

# Read as grayscale and upscale 2x so the glyphs are larger for thresholding/OCR.
img = cv2.imread('uxWbP.png', 0)
img = cv2.resize(img, (0, 0), fx=2, fy=2)
copy_img = np.copy(img)
# Adaptive threshold, as the image has different lighting conditions in different areas.
thresh = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, 2)

contours, _ = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Kill small contours by painting them over with the background.
for i_cnt, cnt in enumerate(sorted(contours, key=lambda x: cv2.boundingRect(x)[0])):
    _area = cv2.contourArea(cnt)
    x, y, w, h = cv2.boundingRect(cnt)
    x_y_area = w * h
    if 10000 < x_y_area < 400000:
        pass
        # cv2.rectangle(copy_img, (x, y), (x + w, y + h), (255, 0, 255), 2)
        # cv2.putText(copy_img, str(int(x_y_area)) + ' , ' + str(w) + ' , ' + str(h), (x, y + 10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
        # cv2.drawContours(copy_img, [cnt], 0, (0, 255, 0), 1)
    elif x_y_area < 10000:
        # Write over small contours with white so they disappear into the background.
        cv2.drawContours(thresh, [cnt], -1, 255, -1)

cv2.imshow('img', copy_img)
cv2.imshow('thresh', thresh)
cv2.waitKey(0)

Which significantly improves the image to:

filtered image

Any recommendations on how to filter this image well enough to run OCR or some ML detection scripts on it, either by improving the filtered image or by changing the approach completely from the start? I'd like to split out the numbers for detection, but I'm open to other methods as well.

ohyesyoucan
  • Ah, naughty, naughty. Trying to break them CAPTCHA codes, huh? They are difficult to break for a reason. The text segmentation, as you see, is non-trivial. In your particular image, there’s a lot of high-frequency noise, you could try some frequency filtering first and see what result you get. – stateMachine Feb 20 '20 at 02:59
  • @eldesgraciado Problems are made to be solved. Can you point me to some code/examples that explain what you mean by frequency filtering? And at what point do you recommend applying it - on the pre-filtered image, or the post-filtered (black & white) one? – ohyesyoucan Feb 20 '20 at 03:05
  • I suggest that this activity is unethical. Attempting to subvert the CAPTCHA protection shows a lack of respect for the owner of the server, since they are doing it to protect their bandwidth and/or their business. – fmw42 Feb 20 '20 at 03:19
  • Random noise cannot in general be removed by frequency domain filtering. Frequency filtering (notch filtering) is likely only useful if the noise is arranged in a repetitive pattern. – fmw42 Feb 20 '20 at 03:21
  • > since they are doing it to protect their bandwidth and/or their business. This captcha does not protect anything of the sort. You'll just have to take my word for it. Any good server uses something like Google's reCAPTCHA, which is machine-learning image recognition of things like sidewalks, which is not the problem here. – ohyesyoucan Feb 20 '20 at 03:25
  • I have applied high-order bandstop filtering (Butterworth) to attempt to remove random noise with different kinds of results. It's very difficult to recover the original signal. I honestly don't know if that approach could solve your problem. – stateMachine Feb 20 '20 at 03:40
  • Could the OP post an original version of the CAPTCHA? I want to try Gaussian blur and then do the adaptive-threshold-contour thing. However, even trying to reproduce your steps with the small, colored CAPTCHA, I get `cv2.error: OpenCV(4.2.0) C:\projects\opencv-python\opencv\modules\imgproc\src\thresh.cpp:1647: error: (-215:Assertion failed) src.type() == CV_8UC1 in function 'cv::adaptiveThreshold'` – bballdave025 Feb 20 '20 at 04:24
  • Got it. Now I should be able to reproduce your results and try some other things. This is a fun problem. – bballdave025 Feb 20 '20 at 04:32
  • Wait ... how did you read in the image? I tried `img = cv2.imread("uxWbp.png")`. However, when I ran through your steps after doing that, and after using your different values, I still got the same error. Sorry, I know more about audio processing, but I love messing around with OCR. All that to say that I could use your help reproducing your original results. Oh, you just put up the code. I'll probably look at this more tomorrow. Hopefully you'll have it solved by then, and I can learn from what you figured out. – bballdave025 Feb 20 '20 at 04:37
  • @bballdave025, see the code now. I've added reading and resizing, which should get you a result close to the black-and-white image. – ohyesyoucan Feb 20 '20 at 04:37
  • Does this answer your question? [Captcha preprocessing and solving with Opencv and pytesseract](https://stackoverflow.com/questions/45680624/captcha-preprocessing-and-solving-with-opencv-and-pytesseract) – T A Feb 20 '20 at 07:59
  • No, that image was somehow easier; similar techniques do not work on this image. The resulting image after doing those steps (which are fewer and less complicated than those tried here) has far more noise in it. The noise here is of equal brightness and color to the text, so you can't filter it out with thresholding as they have done. – ohyesyoucan Feb 20 '20 at 16:46

2 Answers

3

My first thought is to put on a Gaussian blur for a sort of "unsharp filter". (I think my second idea is better; it combines this blur-and-add with the erosion/dilation game. I posted it as a separate answer, because I think it is a different-enough strategy to merit that.) @eldesgraciado noted frequency stuff, which is basically what we're doing here. I'll put up some code and an explanation. (Here is one answer to an SO post that has a lot about sharpening - the linked answer is a more variable unsharp mask written in Python. Do take the time to look at the other answers - including this one, one of many simple implementations that look just like mine - though some are written in different programming languages.) You'll need to mess with the parameters. It's possible this won't work, but it's the first thing I thought of.

>>> import cv2
>>> im_0 = cv2.imread("FWM8b.png")
>>> cv2.imshow("FWM8b.png", im_0)
>>> cv2.waitKey(0)
## Press any key.
>>> ## Here's where we get to frequency. We'll use a Gaussian Blur.
    ## We want to take out the "frequency" of changes from white to black
    ## and back to white that are less than the thickness of the "1973"
>>> k_size = 0 ## This is the kernel size - the "width frequency",
               ## if you will. Using zero gives a width based on sigmas in
               ## the Gaussian function.
               ## You'll want to experiment with this and the other
               ## parameters, perhaps trying to run OCR over the image
               ## after each combination of parameters.
               ## Hint: avoid even numbers, and think of it as a radius.
>>> gs_border = 3
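>>> ## Note: the third positional argument to cv2.GaussianBlur below is
    ## sigmaX (the Gaussian standard deviation), so gs_border is really
    ## the sigma that sizes the kernel when k_size is 0.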
>>> im_blurred = cv2.GaussianBlur(im_0, (k_size, k_size), gs_border)
>>> cv2.imshow("gauss", im_blurred)
>>> cv2.waitKey(0)

Gaussian blur with kernel size determined by sigmas

Okay, my parameters probably didn't blur this enough. The parts of the words that you want to get rid of aren't really blurry. I doubt you'll even see much of a difference from the original, but hopefully you'll get the idea.

We're going to multiply the original image by a value, multiply the blurry image by a value, and subtract value*blurry from value*orig. Code will be clearer, I hope.

>>> orig_img_multiplier = 1.5
>>> blur_subtraction_factor = -0.5
>>> gamma = 0
>>> im_better = cv2.addWeighted(im_0, orig_img_multiplier, im_blurred, blur_subtraction_factor, gamma)
>>> cv2.imshow("First shot at fixing", im_better)

First attempted fix

Yeah, not too much different. Mess around with the parameters, try to do the blur before you do your adaptive threshold, and try some other methods. I can't guarantee it will work, but hopefully it will get you started going somewhere.
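If it helps, here is a minimal sketch of that "blur before the adaptive threshold" order, reusing the file name and threshold parameters from the question; the blur sigma is just a guess you'd need to tune:

import cv2

# Sketch only: blur first, then the same adaptive threshold as in the question.
img = cv2.imread('uxWbP.png', 0)            # grayscale
img = cv2.resize(img, (0, 0), fx=2, fy=2)
blurred = cv2.GaussianBlur(img, (0, 0), 3)  # sigma = 3 is a guess; tune it
thresh = cv2.adaptiveThreshold(blurred, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 21, 2)
cv2.imshow('blur-then-threshold', thresh)
cv2.waitKey(0)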

Edit: This is a great question. Responding to the tongue-in-cheek criticism from @eldesgraciado:

Ah, naughty, naughty. Trying to break them CAPTCHA codes, huh? They are difficult to break for a reason. The text segmentation, as you see, is non-trivial. In your particular image, there’s a lot of high-frequency noise, you could try some frequency filtering first and see what result you get.

I submit the following from the Wikipedia article on reCAPTCHA (archived).

reCAPTCHA has completely digitized the archives of The New York Times and books from Google Books, as of 2011. The archive can be searched from the New York Times Article Archive. Through mass collaboration, reCAPTCHA was helping to digitize books that are too illegible to be scanned by computers, as well as translate books to different languages, as of 2015.

Also look at this article (archived).

I don't think this CAPTCHA is part of Massive-scale Online Collaboration, though.

Edit: Some other type of sharpening will be needed. I just realized that I'm applying 1.5 and -0.5 multipliers to pixels which usually have values very close to 0 or 255, meaning I'm probably just recovering the original image after the sharpening (where both the original and the blur are 255, 1.5*255 - 0.5*255 = 255, and where both are 0 the result stays 0). I welcome any feedback on this.

Also, from comments with @eldesgraciado:

Someone probably knows a better sharpening algorithm than the one I used. Blur it enough, and maybe threshold on average values over an n-by-n grid (pixel density). I don't know too much about the whole adaptive-thresholding-then-contours thing. Maybe that could be re-done after the blurring...
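If it helps, here is one rough way to read that "threshold on average values over an n-by-n grid" idea - purely a sketch with invented parameters: block-average the image (cv2.resize with INTER_AREA averages each block) and threshold the block means as a pixel-density map.

import cv2

img = cv2.imread("FWM8b.png", 0)         # grayscale this time
n = 8                                    # block size; pure guess
h, w = img.shape
# INTER_AREA over a (w // n, h // n) grid gives the mean of each n-by-n block.
density = cv2.resize(img, (w // n, h // n), interpolation=cv2.INTER_AREA)
# Blocks whose average is at or below the (arbitrary) cutoff turn white in the mask.
_, block_mask = cv2.threshold(density, 128, 255, cv2.THRESH_BINARY_INV)
block_mask = cv2.resize(block_mask, (w, h), interpolation=cv2.INTER_NEAREST)
cv2.imshow("pixel-density threshold", block_mask)
cv2.waitKey(0)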


Just to give you some ideas ...

Here's a blur with k_size = 5

The Gaussian-blurred image

Here's a blur with k_size = 25

The Gaussian-blurred image with k_size = 25

Note those are the BLURS, not the fixes. You'll likely need to mess with the orig_img_multiplier and blur_subtraction_factor based on the frequency (I can't remember exactly how, so I can't really tell you how it's done.) Don't hesitate to fiddle with gs_border, gamma, and anything else you might find in the documentation for the methods I've shown.

Good luck with it.

By the way, the frequency is more something based on the 2-D Fast Fourier Transform, and possibly based on kernel details. I've just messed around with this stuff myself - definitely not an expert, and definitely happy if someone wants to give more details - but I hope I've given a basic idea. Adding some jitter noise (up-and-down or side-to-side blurring, rather than radius-based) might be helpful as well.
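If you want to poke at the frequency idea directly, this is a small sketch (nothing more) of looking at the 2-D FFT log-magnitude spectrum with NumPy; repetitive noise would show up as isolated peaks you could notch out, while random noise just looks like a diffuse floor:

import cv2
import numpy as np

img = cv2.imread("FWM8b.png", 0).astype(np.float32)
f = np.fft.fftshift(np.fft.fft2(img))      # 2-D FFT with the zero frequency centered
magnitude = 20 * np.log(np.abs(f) + 1)     # log scale so the spectrum is visible
magnitude = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imshow("log magnitude spectrum", magnitude)
cv2.waitKey(0)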

bballdave025
  • Yeah, that's a possible solution. You could also try a little bit of erosion on your last image to try and break the noisy blobs from the real ones; however, the number 7 could be easily degraded. – stateMachine Feb 20 '20 at 03:59
  • Good point, @eldesgraciado, about the seven. I think it would be better to try the Gaussian blur on the original image, but I don't have that at the correct size. Even so, I think the 7 might still be a problem. – bballdave025 Feb 20 '20 at 04:02
  • Someone probably knows a better sharpening algorithm than the one I used. Blur it enough, and maybe threshold on average values over an n-by-n grid. I don't know too much about the whole adaptive-thresholding-then-contours thing. Maybe that could be re-done after the blurring... – bballdave025 Feb 20 '20 at 04:04
3

Another thing to try - either separately from the blurring or in combination with it - is the erosion/dilation game, as hinted at in the comment by @eldesgraciado, to whom I think a good part of the credit for these answers should go.

These two (erosion and dilation) can be applied one after the other, repeatedly. I think the trick is to change the kernel size. Anyway, I know I've used that to reduce noise in the past. Here's one example of dilation:

>>> import cv2
>>> import numpy as np
>>> im_0 = cv2.imread("FWM8b.png")
>>> k_size = 3
>>> kernel = np.ones((k_size, k_size), np.uint8)
>>> im_dilated = cv2.dilate(im_0, kernel, iterations=1)
>>> cv2.imshow("d", im_dilated)
>>> cv2.waitKey(0)

Quick Dilation

Make whatever kernel you want for erosion, and check out the effects.

>>> im_eroded = cv2.erode(im_0, kernel, iterations=1)
>>> cv2.imshow("erosion", im_eroded)
>>> cv2.waitKey(0)
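As a sketch of the "one after the other" idea (reusing `im_0` and the same `kernel` from above), erosion-then-dilation (opening) and dilation-then-erosion (closing) are available directly; which one helps, and with what kernel size, is something you'd have to experiment with:

>>> im_opened = cv2.morphologyEx(im_0, cv2.MORPH_OPEN, kernel)   # erosion, then dilation
>>> im_closed = cv2.morphologyEx(im_0, cv2.MORPH_CLOSE, kernel)  # dilation, then erosion
>>> cv2.imshow("opening", im_opened)
>>> cv2.imshow("closing", im_closed)
>>> cv2.waitKey(0)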

Edit with possible improvements:

>>> im_blurred = cv2.GaussianBlur(im_dilated, (0, 0), 3)
>>> im_better = cv2.addWeighted(im_0, 0.5, im_blurred, 1.2, 0) 
# Getting closer.

Dilated, blurred, and combined with original, 1st

^ dilated, blurred, and combined (added) with original, 1st way


# Even better, I think.
>>> im_better2 = cv2.addWeighted(im_0, 0.9, im_blurred, 1.7, 0)

Dilated, blurred, and combined with original, 2

^ dilated, blurred, and combined (added) with original, 2nd way

You could do artifact removal, but be careful not to get rid of the stalk of the 7. If you can keep the 7 together, you can do connected-component analysis and keep the biggest connected components.
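If you try the connected-component route, here's a rough sketch, assuming you've saved a binarized version of the image (the file name and the area cutoff are made up) with dark digits on a white background:

import cv2
import numpy as np

binary = cv2.imread("thresholded.png", 0)               # hypothetical binarized image
inverted = cv2.bitwise_not(binary)                      # components need to be white
n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(inverted, connectivity=8)
keep = np.zeros_like(inverted)
for label in range(1, n_labels):                        # label 0 is the background
    if stats[label, cv2.CC_STAT_AREA] > 500:            # area cutoff is a guess
        keep[labels == label] = 255
cv2.imshow("big components only", keep)
cv2.waitKey(0)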

You could sum the values of pixels on each column and each row, which would probably lead to something like this (very approximated - almost time for work). Note that I was much more careful with the green curve - sums of columns - but the consistency of scaling is probably off.

Sums of rows and columns

Note that this is more a sum of (255 - pixel_value). That could find you rectangles where your to-be-found glyphs (digits) should be. You could do a 2-d map of column_pixel_sum + row_pixel_sum, or just do some approximation, as I have done below.

Sums and rectangles
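Those row/column sums are basically projection profiles; a rough sketch (again assuming a saved, binarized image with dark digits, and with a made-up cutoff) might look like this:

import cv2
import numpy as np

binary = cv2.imread("thresholded.png", 0)       # hypothetical binarized input
ink = 255 - binary.astype(np.int32)             # sum of (255 - pixel_value), as above
col_sums = ink.sum(axis=0)                      # one value per column (the green curve)
row_sums = ink.sum(axis=1)                      # one value per row
# Columns/rows whose ink exceeds a guessed fraction of the peak likely contain glyphs.
print(np.flatnonzero(col_sums > 0.2 * col_sums.max()))   # candidate x positions
print(np.flatnonzero(row_sums > 0.2 * row_sums.max()))   # candidate y positions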

Also, feel free to rotate the image (or take pixel sums at different angles) and combine your info for each rotation.
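For the rotation part, a minimal warp (the angle is arbitrary and the file name hypothetical) could be:

import cv2

img = cv2.imread("thresholded.png", 0)                     # hypothetical input
h, w = img.shape
M = cv2.getRotationMatrix2D((w / 2, h / 2), 15, 1.0)       # 15 degrees; arbitrary
rotated = cv2.warpAffine(img, M, (w, h), borderValue=255)  # pad with white
# Take the same row/column sums on `rotated` and combine across angles.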

Lots of other things to try ... the suggestion by @eldesgraciado of a noise model is especially intriguing.

Another thing you could try out is to create a "noise model" and subtract it from the original image. First, take the image and apply Gaussian Blur with very low parameters, just barely blurring it, next subtract this mask from the image. From here, the steps are experimental: The difference should be again blurred and thresholded. Save this image. You run this pre-processing with various parameters and saving each time the final binary image, then, average the masks obtained so far. The persistent blobs should be the ones you are looking for... like some sort of spatial bandstop, I guess...
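My own rough reading of that noise-model suggestion, as a sketch only (the sigmas and the threshold are invented):

import cv2
import numpy as np

img = cv2.imread("FWM8b.png", 0).astype(np.float32)
masks = []
for sigma in (1, 2, 3):                               # the "various parameters"
    barely_blurred = cv2.GaussianBlur(img, (0, 0), sigma)
    diff = cv2.absdiff(img, barely_blurred)           # image minus its barely-blurred self
    diff = cv2.GaussianBlur(diff, (0, 0), 2)          # blur the difference again
    _, mask = cv2.threshold(diff.astype(np.uint8), 10, 255, cv2.THRESH_BINARY)
    masks.append(mask.astype(np.float32))
average = (sum(masks) / len(masks)).astype(np.uint8)  # persistent blobs survive the average
cv2.imshow("averaged masks", average)
cv2.waitKey(0)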

Keep experimenting.

Unsharp mask (my other answer) on this result image. More noise gone, but hurts the 7.

bballdave025
  • `>>> im_blurred = cv2.GaussianBlur(im_dilated, (0, 0), 3)` ; `>>> im_better = cv2.addWeighted(im_0, 0.5, im_blurred, 1.2, 0)` Getting closer. – bballdave025 Feb 20 '20 at 05:40
  • `im_better = cv2.addWeighted(im_0, 0.9, im_blurred, 1.7, 0)` – bballdave025 Feb 20 '20 at 05:42
  • 1
    can you edit your post and post your images, not following with the blurred stuff – ohyesyoucan Feb 20 '20 at 05:50
  • Another thing you could try out is to create a "noise model" and subtract it from the original image. First, take the image and apply Gaussian Blur with very low parameters, just barely blurring it, next subtract this mask from the image. From here, the steps are experimental: The difference should be again blurred and thresholded. Save this image. You run this pre-processing with various parameters and saving each time the final binary image, then, average the masks obtained so far. The persistent blobs should be the ones you are looking for... like some sort of spatial bandstop, I guess... – stateMachine Feb 20 '20 at 05:55
  • Try running an unsharp mask (the idea from my other answer) on the result. I did this with naive parameters and got stuff even cleaner, but it wiped out the stalk of the 7. That's why I turned to the column-sum and row-sum stuff. `>>> im_even_better = cv2.GaussianBlur(im_better, (0,0), 3)` ; `>>> im_even_betterer = cv2.addWeighted(im_even_better, 1.5, im_better, -0.5, 0)`. [image](https://i.stack.imgur.com/pIned.png) – bballdave025 Feb 20 '20 at 14:43
  • For @ohyesyoucan: Did I get the images you wanted? I'm not sure which ones were being requested. – bballdave025 Feb 20 '20 at 15:03