Handle large number of rtsp cameras without real-time constraint

Question

Problem definition

Say that I have a large number of rtsp cameras (> 100) and I want to perform some operation on it, like image features extraction.

Important: I am not interested in real-time performance, I can do the features extraction even 4 times in a minute. Obviously, the more, the better!

As now, the bottleneck is image acquisition. The frames are acquired with cv2 Read the section below for what I tried.

Pseudocode (current solution)

while True:
    for every rstp_cameras:
        open_connection
        read_current_frame(no batch - the real time frame)
        process_frame
        close

What I have tried

Here on stackoverflow you can find a lot of answers about reading rtsp cameras in real-time, but all are limited on the number of cameras or have some drawbacks. I tried (with python):

A thread for each camera [cv2 with ffmpeg]
- Open a connection for each camera in a thread, then get the last frame available for each camera.
- This solution works, but only with a small number of cameras. If we increase the number, a high-end cpu will be 100% on usage (because the thread, in the background, are always reading the last frame and discard it if I am not asking the last)
[Current solution, no thread, ffmpeg with cv2] Open a connection at every iteration, read the frame and close the connection. This solution allows me to have the last frame available, but the major drawback is the time lost during the opening (~70s lost to open all the frames)
Cv2 with gstreamer, no thread
- Based on this answer. Is the best solution that I found if you have a small number of cameras. With 20 or plus cameras I have the same problem with the threading solution.

Question & recap

Now, It's clear to me that processing all those cameras in one workstation is hard, because all the solutions that I found, in order to return the last frame available (the one in real-time) continuously reading the frame in the background.

For now, I have not found a solution that allows me to open a connection once, read the real time frame with low-cpu usage, so I can use it with high number of cameras.

Is the parallelization of the reading the only way to solve the problem? I mean, split the cameras into batches, assign batches at a different workstation and then combine the images in some ways?

Thank you.

time to dig into ffmpeg's API and use it directly. OpenCV has _convenience_ interfaces for video I/O. they aren't made to bear loads. — Christoph Rackwitz, Dec 03 '21 at 01:14

Rotem · Answer 1 · 2021-12-03T00:10:10.893

You may try using FFmpeg with -discard nokey arguments as described here.
I am not sure if it's going to work...

As far as I understand -discard nokey decodes only the key frames, and skips all the other video frames.
Decoding only key frames suppose to reduce the CPU usage by a lot (depending the frequency of the key frames).

Here is a code sample (for one RTSP stream):

import numpy as np
import subprocess as sp


# Use public RTSP Streaming for testing:
in_stream = "rtsp://wowzaec2demo.streamlock.net/vod/mp4:BigBuckBunny_115k.mov"

# Use OpenCV for getting the video resolution.
cap = cv2.VideoCapture(in_stream)

# Get resolution of input video
width  = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Release VideoCapture - it was used just for getting video resolution
cap.release()


# http://zulko.github.io/blog/2013/09/27/read-and-write-video-frames-in-python-using-ffmpeg/
FFMPEG_BIN = "ffmpeg" # on Linux ans Mac OS (also works on Windows when ffmpeg.exe is in the path)

ffmpeg_cmd = [FFMPEG_BIN,
              '-discard', 'nokey',
              '-rtsp_transport', 'tcp',
              '-max_delay', '30000000',  # 30 seconds
              '-i', in_stream,
              '-f', 'rawvideo',
              '-pix_fmt', 'bgr24',
              '-vcodec', 'rawvideo', '-an', 'pipe:']

# Open sub-process that gets in_stream as input and uses stdout as an output PIPE.
process = sp.Popen(ffmpeg_cmd, stdout=sp.PIPE)


while True:
    raw_frame = process.stdout.read(width*height*3)

    if len(raw_frame) != (width*height*3):
        print('Error reading frame!')  # Break the loop in case of an error (too few bytes were read).
        break

    # Transform the byte read into a numpy array, and reshape it to video frame dimensions
    frame = np.frombuffer(raw_frame, np.uint8).reshape((height, width, 3))

    # Show frame for testing
    cv2.imshow('frame', frame)
    cv2.waitKey(1)
  
process.stdout.close()
process.wait()
cv2.destroyAllWindows()

Try to open 100 FFmpeg sub-processes as above...

Thanks for the response! Yeah, it seems to have a low impact on the cpu. However, I did this experiment and it doesn't work. I created 100 processes, iterate over them (inside the ```while True```) and read the frame. At the last iteration (i.e when I reached the 100-th cameras, I inserted a ```time.sleep(8)```. Then, if I watch the images that I saved, I can see that they doesn't reflect the reality, they accumulate high delay, certainly greater than 8 seconds, as if they were cached. This is a problem, I want that when I read the frame this reflects the reality. — Francesco Taioli, Dec 03 '21 at 09:16
For avoiding accumulation, create a reading thread for every camera (drain the pipe). Simply iterating over the cameras is going to work only in a very specific case, when the GOP size is the same for all video streams, and when you can read the frames fast enough to avoid accumulation. — Rotem, Dec 03 '21 at 09:31
I think that inserting ```thread``` for draining the pipe could saturate the cpu, as the other solution involving thread did. — Francesco Taioli, Dec 03 '21 at 09:45
I don't think the threads are going to saturate the CPU, because executing `raw_frame = process.stdout.read(width*height*3)` is not CPU intensive - the thread is supposed to be in pending state until the data is ready. Even if the waiting (internal Python implementation) is not the most efficient (and there are context switching overhead), it's still going to be relatively light CPU task. — Rotem, Dec 03 '21 at 11:11

Handle large number of rtsp cameras without real-time constraint

Problem definition

Pseudocode (current solution)

What I have tried

Question & recap

1 Answers1

Linked