
I'm trying to get a NumPy array out of a GStreamer appsink buffer, but the buffer is too small for NumPy to fit it in an array. I picked up a little bit of code from here: Receive Numpy Array Realtime from GStreamer

I use videotestsrc, which should return something of size 240x320 with 3 channels. But the buffer size is 115200, which is exactly half of 240x320x3 (230400). I tried other video sources at different resolutions, and the buffer is always half the expected size. My guess is that the appsink signals the on_new_sample function too early.

import time
import gi
import numpy as np

gi.require_version("Gst", "1.0")
gi.require_version("GstApp", "1.0")

from gi.repository import Gst, GstApp, GLib

_ = GstApp

def on_new_sample(app_sink):

    sample = app_sink.pull_sample()
    caps = sample.get_caps()

    # Extract the width and height info from the sample's caps
    height = caps.get_structure(0).get_value("height")
    width = caps.get_structure(0).get_value("width")

    # Get the actual data
    buffer = sample.get_buffer()
    print(caps,"buffer size ",buffer.get_size())
    # Get read access to the buffer data
    success, map_info = buffer.map(Gst.MapFlags.READ)

    if not success:
        raise RuntimeError("Could not map buffer data!")

    numpy_frame = np.ndarray(
        shape=(height, width, 3),
        dtype=np.uint8,
        buffer=map_info.data)

    buffer.unmap(map_info)

Gst.init(None)

main_loop = GLib.MainLoop()
pipeline = Gst.parse_launch("""videotestsrc ! video/x-raw, width=320, height=240 ! queue ! appsink sync=true max-buffers=1 drop=true name=sink emit-signals=true""")
appsink = pipeline.get_by_name("sink")
pipeline.set_state(Gst.State.PLAYING)
handler_id = appsink.connect("new-sample", on_new_sample)

time.sleep(30)

pipeline.set_state(Gst.State.NULL)
main_loop.quit()

The error I get is as follows:

Traceback (most recent call last):
  File "/home/rolf/scripts/tryAlgos/gst_record2.py", line 31, in on_new_sample
    numpy_frame = np.ndarray(
TypeError: buffer is too small for requested array

What is the proper way to get from an appsink buffer to a NumPy array? I tried some other examples, but they always return a buffer that is too small.


3 Answers


The unexpected halving of the buffer size is because the buffer contains YUV (I420) data, not RGB data. That is why adding videoconvert ! video/x-raw,format=RGB to the pipeline solves the problem: it converts the YUV data into RGB data before feeding it to the appsink.

How can we tell that this is the problem? Try:

gst-launch-1.0 -v videotestsrc num-buffers=10 ! fakesink silent=false

This runs the videotestsrc and just outputs to the console what is received by the fakesink (as it is not silent). We get the following line as part of the output:

/GstPipeline:pipeline0/GstVideoTestSrc:videotestsrc0.GstPad:src: caps = video/x-raw, format=(string)I420, width=(int)320, height=(int)240, framerate=(fraction)30/1, multiview-mode=(string)mono, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive

As we can see, format=(string)I420, which means we are receiving data in the I420 YUV format. This format is defined here: https://www.fourcc.org/pixel-format/yuv-i420/

As you can see from that link, the Y channel is one byte per pixel, but the U and V channels are each one byte per group of 4 pixels. Thus, the total buffer size is expected to be 320*240 (Y) + 320*240/4 (U) + 320*240/4 (V) = 76800 + 19200 + 19200 = 115200.
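This arithmetic, and the original TypeError, can be reproduced without GStreamer (a minimal sketch, using the 320x240 size from the question):

```python
import numpy as np

W, H = 320, 240
i420_size = W * H * 3 // 2  # Y plane + quarter-size U and V planes
rgb_size = W * H * 3        # one byte per channel per pixel
print(i420_size, rgb_size)  # 115200 230400

# An I420-sized buffer is too small for an RGB-shaped array,
# which is exactly the error from the question:
try:
    np.ndarray(shape=(H, W, 3), dtype=np.uint8, buffer=bytes(i420_size))
except TypeError as e:
    print(e)  # buffer is too small for requested array
```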


Something like this should do it.

buf = sample.get_buffer()
caps = sample.get_caps()
H = caps.get_structure(0).get_value('height')
W = caps.get_structure(0).get_value('width')
C = 3
arr = np.ndarray(
    (H, W, C),
    buffer=buf.extract_dup(0, buf.get_size()),  # extract_dup copies the buffer bytes
    dtype=np.uint8
)
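Since extract_dup returns an independent copy of the buffer bytes, the reshape step can be sketched with stand-in data (RGB caps assumed, so the byte count matches H * W * 3):

```python
import numpy as np

H, W, C = 240, 320, 3
data = bytes(H * W * C)  # stand-in for buf.extract_dup(0, buf.get_size())

# np.ndarray accepts any object exposing the buffer protocol
arr = np.ndarray((H, W, C), buffer=data, dtype=np.uint8)
print(arr.shape)  # (240, 320, 3)
```

Note that an array built on an immutable bytes object is read-only; call arr.copy() if you need to modify the pixels.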

Your implementation is correct.

However, you need to set the correct caps to overcome the size-mismatch issue.

For example, the following pipeline ends up producing a buffer half as large as you would expect from height * width * n_channels (in my case 480 * 640 * 3/2 bytes instead of 480 * 640 * 3):

uridecodebin uri=rtsp://localhost:8000/test ! decodebin ! videoconvert ! appsink emit-signals=True

However, with the correct caps, you will get the right size (in my case 480 * 640 * 3):

uridecodebin uri=rtsp://localhost:8000/test ! decodebin ! videoconvert ! video/x-raw, format=RGB ! appsink emit-signals=True
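The same size arithmetic applies here (assuming the 640x480 frame size from this answer):

```python
W, H = 640, 480
print(W * H * 3 // 2)  # 460800 bytes per I420 frame from the first pipeline
print(W * H * 3)       # 921600 bytes per RGB frame once format=RGB caps are set
```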