
I have been working on OCR'ing images with Python for a while, but there is still room for improvement, so your input and thoughts would be helpful.

This is what I am currently doing, and the rate of successfully getting a valid ocrText output is around 15%.

import cv2
import numpy as np
import pyocr
import pyocr.builders
from PIL import Image

ocrImage = cv2.imread(imgName)
ocrImage = cv2.resize(ocrImage, None, fx=3, fy=3, interpolation=cv2.INTER_LINEAR)  # enlarge 3x
ocrImage = cv2.cvtColor(ocrImage, cv2.COLOR_BGR2GRAY)  # convert to grayscale
ret, ocrImage = cv2.threshold(ocrImage, 127, 255, cv2.THRESH_BINARY)  # convert to black and white
ocrImage = cv2.morphologyEx(ocrImage, cv2.MORPH_OPEN, np.ones((4, 4), np.uint8))  # remove small noise specks
ocrImage = cv2.morphologyEx(ocrImage, cv2.MORPH_CLOSE, np.ones((4, 4), np.uint8))  # fill small white gaps
cv2.imwrite(ImageName, ocrImage)
ocrText = ocrTool.image_to_string(Image.open(ImageName), builder=pyocr.builders.TextBuilder())

While trying to improve this, I found an 'opencv-color-spaces' blog post that uses the code below to plot the image's pixels in a 3D model. I can see that all the background noise pixels are different shades of gray and fall within a fairly well-defined region. I feel this could help me filter them out before running my code above, but I have no idea how to do it.

import cv2
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors
from mpl_toolkits.mplot3d import Axes3D  # enables the "3d" projection

nemo0 = cv2.imread(ImageName1, 1)
nemo1 = cv2.cvtColor(nemo0, cv2.COLOR_BGR2RGB)
r, g, b = cv2.split(nemo1)
fig = plt.figure()
axis = fig.add_subplot(1, 1, 1, projection="3d")
pixel_colors = nemo1.reshape((np.shape(nemo1)[0] * np.shape(nemo1)[1], 3))
norm = colors.Normalize(vmin=-1.0, vmax=1.0)
norm.autoscale(pixel_colors)
pixel_colors = norm(pixel_colors).tolist()
axis.scatter(r.flatten(), g.flatten(), b.flatten(), facecolors=pixel_colors, marker=".")
axis.set_xlabel("Red")
axis.set_ylabel("Green")
axis.set_zlabel("Blue")
currentFig1 = plt.gcf()
currentFig1.savefig(ImageName1.replace(Path, pltPath))

I'd like to ask for help: is there a function, or some code, that can quickly remove the gray lines before I proceed to process the image?

The example image is in the link here.

Alvin Lin
  • Adding the original input image would help! An approach is to preprocess the image before putting into OCR such as tesseract – nathancy Aug 08 '19 at 20:08
  • Breaking captchas may be unethical. – fmw42 Aug 08 '19 at 23:37
  • Thanks for all your comments. The original input image was in the left-upper corner of the 3D image in the very last line as a link of my question. – Alvin Lin Aug 09 '19 at 02:34
