
I'm trying to capture the position of a license plate in a webcam feed with YOLOv4-tiny and then pass the result to EasyOCR to extract the characters. The detection works well in real time, but when I apply the OCR the webcam stream becomes really laggy. Is there any way I can improve this code to make it less laggy?

My YOLOv4 detection loop:

# detection loop
while True:
    _, img = cap.read()
    height, width, _ = img.shape
    blob = cv2.dnn.blobFromImage(img, 1 / 255, (416, 416), (0, 0, 0), swapRB=True, crop=False)

    net.setInput(blob)
    output_layers_names = net.getUnconnectedOutLayersNames()
    layerOutputs = net.forward(output_layers_names)

    boxes = []
    confidences = []
    class_ids = []

    for output in layerOutputs:
        for detection in output:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.5:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, .8, .4)
    font = cv2.FONT_HERSHEY_PLAIN
    colors = np.random.uniform(0, 255, size=(len(boxes), 3))
    if len(indexes) > 0:
        for i in indexes.flatten():
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = str(round(confidences[i], 2))
            color = colors[i]
            cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
            detected_image = img[y:y + h, x:x + w]
            cv2.putText(img, label + " " + confidence, (x, y + 400), font, 2, color, 2)
            cv2.imshow('detection', detected_image)

            cv2.imwrite('lp5.jpg', detected_image)
            cropped_image = cv2.imread('lp5.jpg')
            cv2.waitKey(5000)
            print("system is waiting")
            result = OCR(cropped_image)
            print(result)

My EasyOCR function:

def OCR(cropped_image):
    reader = easyocr.Reader(['en'], gpu=False)  # a new Reader is built on every call
    result = reader.readtext(cropped_image)
    text = ''
    for detection in result:
        text += detection[1] + ' '

    spliced = remove(text)  # remove() is a custom text-cleanup helper
    return spliced
mandebo
  • When you say "the detection works well in real time", what do you mean? The cam is going to deliver 30 frames a second. Do you finish your processing in 33 ms? – Tim Roberts Aug 18 '22 at 03:12
  • @TimRoberts Hi, what I mean is that the detection achieves good FPS when I run it on my webcam feed. – mandebo Aug 18 '22 at 04:20

2 Answers


There are several points.

  1. cv2.waitKey(5000) in your loop adds a delay even if you keep pressing a key, so remove it unless you are debugging.

  2. You are saving each detected region to a JPEG file and loading it back every frame. Pass the region (a NumPy array) to the OCR module directly.

  3. EasyOCR is a DNN model based on ResNet, but you are not using a GPU (gpu=False). Use a GPU if one is available. (See this benchmark by Liao.)

  4. You are creating an easyocr.Reader instance on every iteration of the loop. Creating one requires loading and initializing a DNN model, which is a huge workload and the major bottleneck here. Create a single instance before the loop and reuse it inside the loop.

relent95

You are essentially saying "the while loop must be fast." And of course the OCR() call is a bit slow. Ok, good.

Don't call OCR() from within the loop.

Rather, enqueue a request, and let another thread / process / host worry about the OCR computation, while the loop quickly continues upon its merry way.

You could use a threaded Queue, or a subprocess, or blast it over to RabbitMQ or Kafka. The simplest approach would be to overwrite /tmp/cropped_image.png within the loop, and have another process notice such updates and (slowly) call OCR(), appending the results to a log file.

There might be a couple of updates to the image file while a single OCR call is in progress, and that's fine. The two are decoupled from one another, each progressing at their own pace. Downside of a queue would be OCR sometimes falling behind -- you actually want to shed load by skipping some (redundant) cropped images.
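The load-shedding queue variant can be sketched with the standard library alone. The `OCR` stub and all names here are placeholders, not the asker's real function; the point is that a depth-1 queue plus `put_nowait` gives exactly the frame-skipping behavior described above:

```python
import queue
import threading

def OCR(crop):                  # stand-in for the slow EasyOCR call
    return crop.upper()

jobs = queue.Queue(maxsize=1)   # depth 1: stale crops are shed, not queued up
results = []

def ocr_worker():
    while True:
        crop = jobs.get()       # blocks until the video loop hands over a crop
        if crop is None:        # sentinel: shut the worker down
            break
        results.append(OCR(crop))  # slow call runs here, off the capture thread

worker = threading.Thread(target=ocr_worker)
worker.start()

def submit(crop):
    """Called from the video loop; never blocks it."""
    try:
        jobs.put_nowait(crop)
    except queue.Full:
        pass                    # worker still busy: skip this (redundant) crop

submit('abc123')
jobs.put(None)                  # tell the worker to finish
worker.join()
```

In the real program, `submit(detected_image)` replaces the direct `OCR()` call inside the detection loop.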


The two are racing, and that's fine. But take care to do things in atomic fashion -- you wouldn't want to OCR an image that starts with one frame and ends with part of a subsequent frame. Write to a temp file and, after close(), use os.rename() to atomically make those pixels available under the name that the OCR daemon will read from. Once it has a file descriptor open for read, it will have no problem reading to EOF without interference, the kernel takes care of that for us.
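The atomic handoff itself is only a few lines. A sketch of the writer side (bytes stand in for the encoded image here; with OpenCV you would call cv2.imwrite on the temp path instead of open/write):

```python
import os

def publish(data, final_path='/tmp/cropped_image.png'):
    """Write to a temp file, then atomically rename it into place, so the
    OCR daemon sees either the old complete image or the new complete one,
    never a half-written file."""
    tmp_path = final_path + '.tmp'
    with open(tmp_path, 'wb') as f:      # the temp file may be incomplete mid-write
        f.write(data)
    os.rename(tmp_path, final_path)      # atomic on POSIX (same filesystem)
```

Note that the rename is only atomic when the temp file and the final path live on the same filesystem.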

J_H
  • Hi, thank you so much, I will try to apply this. – mandebo Aug 18 '22 at 04:27
  • Hi, can you give me an example of how to apply the queue or subprocess in my code? I have a hard time understanding the concept. – mandebo Aug 18 '22 at 04:41
  • Let `program1.py` be the OP code you offered above, but using a fast "write to /tmp/cropped_image.png" rather than "slow OCR call". Let `program2.py` be a `while` loop that consults `p.stat().st_mtime`, and calls OCR() whenever that image timestamp changes. Run both programs simultaneously in separate Terminal windows. https://docs.python.org/3/library/pathlib.html#pathlib.Path.stat (Or read and understand the `subprocess` docs. As my post suggested, that would be more work, likely more work than you'd care to do.) – J_H Aug 19 '22 at 16:37
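The `program2.py` loop from that comment could be sketched like this (the function name, `max_polls` hook, and poll interval are mine; in the real daemon, `handle` would be something like `lambda p: print(OCR(cv2.imread(str(p))))`):

```python
import time
from pathlib import Path

def watch(path, handle, poll=0.1, max_polls=None):
    """Poll path's st_mtime and call handle(path) whenever it changes,
    i.e. whenever program1.py has renamed a fresh crop into place."""
    p = Path(path)
    last_mtime = 0.0
    polls = 0
    while max_polls is None or polls < max_polls:
        if p.exists():
            mtime = p.stat().st_mtime
            if mtime != last_mtime:      # new image since the last poll
                last_mtime = mtime
                handle(p)                # the slow OCR call runs here
        time.sleep(poll)
        polls += 1
```

`max_polls` exists only so the loop can be exercised in a test; the real daemon would run with the default `None` forever.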