
I am a Brazilian student and couldn't find anything about this on pt-stackoverflow. I'm new to Python and OpenCV, and it has been hard to study them.
I'm trying to write an OCR program in Python that can identify multiple lines and words in video provided by a webcam.

I'm testing with static images first. I've already tried the code from the OpenCV tutorials, like this, but it either returns only one line or draws the lines over the words.

# single line
    if lines is not None:
        for rho, theta in lines[0]:
            a = np.cos(theta)
            b = np.sin(theta)
            x0 = a * rho
            y0 = b * rho
            x1 = int(x0 + 800 * (-b))
            y1 = int(y0 + 800 * (a))
            x2 = int(x0 - 800 * (-b))
            y2 = int(y0 - 800 * (a))
            cv2.line(cap, (x1, y1), (x2, y2), (0, 0, 255), 2)
        cv2.imshow("windowName", cap)

# -
# demark text with multiple lines
# -

if True:  # HoughLinesP
    lines = cv.HoughLinesP(dst, 1, math.pi/180.0, 40, np.array([]), 30, 10)
    a, b, c = lines.shape
    for i in range(a):
        cv.line(cdst, (lines[i][0][0], lines[i][0][1]), (lines[i][0][2], lines[i][0][3]), (0, 0, 255), 3, cv.LINE_AA)

else:  # HoughLines
    lines = cv.HoughLines(dst, 1, math.pi/180.0, 50, np.array([]), 0, 0)
    if lines is not None:
        a, b, c = lines.shape
        for i in range(a):
            rho = lines[i][0][0]
            theta = lines[i][0][1]
            a = math.cos(theta)
            b = math.sin(theta)
            x0, y0 = a*rho, b*rho
            pt1 = (int(x0+1000*(-b)), int(y0+1000*(a)))
            pt2 = (int(x0-1000*(-b)), int(y0-1000*(a)))
            cv.line(cdst, pt1, pt2, (0, 0, 255), 3, cv.LINE_AA)

cv.imshow("detected lines", cdst)

With the first snippet only one line is marked. With the second I get multiple lines, but they are drawn over the words.

![1]: https://i.stack.imgur.com/Sm5FP.png "single line" ![2]: https://i.stack.imgur.com/IASaE.png "multiple lines"
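A possible explanation for the single line, assuming OpenCV 3 or later: there `cv2.HoughLines` returns an array of shape `(N, 1, 2)`, so `lines[0]` holds only the first `(rho, theta)` pair. The sketch below converts every detected line into a pair of endpoints. `hough_to_segments` and its `length` parameter are hypothetical names used for illustration, and the function relies only on the standard library, so it also accepts a plain nested list with the same layout.

```python
import math


def hough_to_segments(lines, length=1000):
    """Turn every (rho, theta) pair into two endpoints for drawing.

    `lines` is assumed to have the (N, 1, 2) layout, i.e. one
    [[rho, theta]] entry per detected line.
    """
    segments = []
    if lines is None:  # HoughLines returns None when nothing is found
        return segments
    for entry in lines:  # visit every detected line, not just lines[0]
        rho, theta = entry[0]
        a, b = math.cos(theta), math.sin(theta)
        x0, y0 = a * rho, b * rho  # point on the line closest to the origin
        pt1 = (int(x0 - length * b), int(y0 + length * a))
        pt2 = (int(x0 + length * b), int(y0 - length * a))
        segments.append((pt1, pt2))
    return segments
```

Each pair could then be drawn with `cv2.line(image, pt1, pt2, (0, 0, 255), 2)`. Note that a standard Hough line runs across the whole image along the direction of the text, so it will probably still pass through the words rather than frame them.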

I would like to get multiple lines and a way to recognize the words in each line, as in the example images below.

![3]: https://i.stack.imgur.com/RVafY.png "multiple lines" ![4]: https://i.stack.imgur.com/w0DG3.png "my objective"
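One way to get a band per text line, rather than lines that cut across the words, might be a horizontal projection profile. This is a different technique from the Hough transform used in the code above and is shown only as a sketch. It assumes the image is already thresholded so that text pixels are non-zero and the background is 0, and that the text is roughly horizontal. `find_text_rows` and `min_ink` are hypothetical names.

```python
def find_text_rows(binary, min_ink=1):
    """Return (top, bottom) row ranges that contain text.

    `binary` is assumed to be a 2-D grid (a list of rows or a NumPy
    array) where text pixels are non-zero and the background is 0.
    """
    bands = []
    start = None
    for y, row in enumerate(binary):
        ink = sum(1 for pixel in row if pixel)  # text pixels in this row
        if ink >= min_ink and start is None:
            start = y                           # a text line begins here
        elif ink < min_ink and start is not None:
            bands.append((start, y - 1))        # it ended on the previous row
            start = None
    if start is not None:                       # text touches the bottom edge
        bands.append((start, len(binary) - 1))
    return bands
```

Each `(top, bottom)` pair could be cropped as `image[top:bottom + 1, :]` and passed to the OCR step one line at a time. Raising `min_ink` may help ignore rows that contain only a few noise pixels.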

Sorry for the long text, but I have no one to help me here and I'm two steps from giving up.

Extra info: contours code

sorted_ctrs = sorted(ctrs, key=lambda ctr: cv2.boundingRect(ctr)[0])

for i, ctr in enumerate(sorted_ctrs):
    x, y, w, h = cv2.boundingRect(ctr)
    roi = image[y:y + h, x:x + w]
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

    if w > 15 and h > 15:
        im = Image.fromarray(roi)
        text = pytesseract.image_to_string(im)
        print(text)
        voiceEngine.say(text)
        voiceEngine.runAndWait()
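
The sort key above orders the contours by `x` only, so boxes from different text lines end up interleaved. Below is a minimal sketch of sorting bounding boxes into reading order, assuming each box is an `(x, y, w, h)` tuple such as the one returned by `cv2.boundingRect` and that the text is roughly horizontal. `reading_order` is a hypothetical helper name.

```python
def reading_order(boxes):
    """Sort (x, y, w, h) boxes top to bottom, then left to right.

    A box joins the current text row when its vertical centre falls
    inside the vertical extent of that row.
    """
    rows = []  # each row is [top, bottom, list_of_boxes]
    for box in sorted(boxes, key=lambda b: b[1]):  # top to bottom
        x, y, w, h = box
        centre = y + h / 2.0
        if rows and rows[-1][0] <= centre <= rows[-1][1]:
            rows[-1][1] = max(rows[-1][1], y + h)  # grow the row if needed
            rows[-1][2].append(box)
        else:
            rows.append([y, y + h, [box]])         # start a new text row
    ordered = []
    for _top, _bottom, row_boxes in rows:
        ordered.extend(sorted(row_boxes, key=lambda b: b[0]))  # left to right
    return ordered
```

It could be called as `reading_order([cv2.boundingRect(c) for c in ctrs])` before cropping each ROI, so that every word is visited once, line by line.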
  • So do you have a problem detecting the lines and words, or a problem drawing lines/boxes around them? – ZdaR Oct 17 '19 at 07:29
  • Welcome to SO! The question is a bit unclear. Do you want to recognize words in the line or all words or all lines? PS. Don't give up, you are almost there ;) – Rick M. Oct 17 '19 at 08:58
  • Thanks for the replies, the community is amazing! Mr. @RickM. my real objective is an OCR with a webcam that can transcribe word by word, line by line. I guess it is necessary to identify the lines and the words to avoid reading the same word/line twice, right? I have no one to guide me, so maybe I'm thinking in the wrong direction. Mr. ZdaR, I can't draw lines along each text segment (like figure 3) with my HoughLines code; it only returns one single line (fig. 1). I edited the question with the ROI code that I'm using to draw boxes around the words (with some issues), passing the text inside them to pytesseract. – Rafael Gilberto Oct 17 '19 at 13:53
