
I am trying to stream a user's webcam over the network to a C-based server. I am using the Janus gateway.

I created a small plugin which is heavily based on the echotest demo example: my browser connects to my Janus server via WebRTC and streams the user's webcam.

On the server side, I have the janus_incoming_rtp function, which gives me a char * buffer and an int length. Upon inspection, the incoming data buffers are about the length of the MTU: each frame of my video is sent over several RTP packets.

I have inspected the header a bit by following the Wikipedia page on RTP, but I don't know how to reconstruct the image from that stream of UDP RTP packets. Ideally, I'd like to pass the stream to OpenCV to do real-time image processing.

I have heard of GStreamer, but I don't understand what it is or how it could help me. Besides, I don't know whether OpenCV has any built-in functions to "reconstruct" the images. Finally, I don't know in which format the video frames are encoded: the PT (payload type) is 116, which is defined as "dynamic", but I have no idea what that means.
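
For reference, this is my understanding of the fixed header I have been inspecting (a sketch after RFC 3550; the bit-field layout is illustrative and compiler-dependent, so real code should parse the bytes explicitly). From what I read, payload types 96-127 are the "dynamic" range, so the value 116 means nothing by itself and is mapped to an actual codec by an a=rtpmap line in the SDP:

#include <stdint.h>

/* Fixed 12-byte RTP header (RFC 3550) */
typedef struct {
    uint8_t  csrc_count:4;    /* CC: number of contributing sources */
    uint8_t  extension:1;     /* X: a header extension follows */
    uint8_t  padding:1;       /* P: payload is padded */
    uint8_t  version:2;       /* V: always 2 */
    uint8_t  payload_type:7;  /* PT: 116 in my case, in the dynamic range 96-127 */
    uint8_t  marker:1;        /* M: for video, typically marks the last packet of a frame */
    uint16_t seq_number;      /* increments by one per packet (network byte order) */
    uint32_t timestamp;       /* identical for all packets of the same video frame */
    uint32_t ssrc;            /* identifies the media stream */
} rtp_header;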

Any help?

nschoe
  • From what I have read so far, it seems that gstreamer is the key: can anyone point me toward docs/examples of how to configure gstreamer as a WebRTC client? – nschoe May 26 '14 at 10:02
  • You should be able to grab the media format type from what was mentioned in the SDP exchange. The SDP exchange matches payload with format type. Only thing is, you have to make sure that your RTP streams are not multiplexed (audio and video together); Chrome has a habit of doing that and it can cause issues with other tech. – Benjamin Trent May 26 '14 at 14:23
  • (Thanks for answering!) Okay, from what I have understood, the webcam stream is encoded with VP8. I have tried forwarding the incoming RTP packets to a UDP socket on which I had OpenCV open a capture stream. But apparently OpenCV cannot decode VP8, so any idea how I could convert the stream in real time and pass it to OpenCV? – nschoe May 26 '14 at 14:25

3 Answers


Here are some guiding steps for handling the SRTP packets and decoding them.

  1. Make sure that RTP and RTCP are not multiplexed; you can remove that option from the SDP.
  2. Decrypt the SRTP packet to raw RTP; you will need access to the key exchange (not sure if you are already doing this, but all media is encrypted, with keys exchanged using DTLS, and must be decrypted before handling).
  3. Grab your media payload type and match it against the media in the SDP (the rtpmap lines in the SDP tell you which payload is which media).
  4. Remove the RTP payload from the packet (GStreamer has RtpDepay plugins for most common payloads, including VP8) and decode the stream; a quick example of a command-line pipeline using VP8 follows this list.
  5. Now you have raw video/audio frames that can be displayed.
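
As a rough illustration of step 4, a gst-launch-1.0 pipeline along these lines depayloads and decodes a VP8 RTP stream arriving on a UDP port (the port, payload number and caps here are placeholders; match them to your SDP):

gst-launch-1.0 udpsrc port=5000 caps="application/x-rtp,media=video,encoding-name=VP8,payload=96,clock-rate=90000" ! rtpvp8depay ! vp8dec ! videoconvert ! autovideosink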

SDP:

  • If RTCP and RTP are being multiplexed, you will see the line a=rtcp-mux, and the port in a=rtcp:50111 IN IP4 <address> will be the same as the candidate media ports.
  • If the media itself is being multiplexed, you will see a=group:BUNDLE audio video. A sample fragment follows.
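
For example, an SDP answer that multiplexes both might contain lines like these (addresses, ports and payload numbers are placeholders):

a=group:BUNDLE audio video
m=video 50111 RTP/SAVPF 116
c=IN IP4 192.0.2.1
a=rtcp:50111 IN IP4 192.0.2.1
a=rtcp-mux
a=rtpmap:116 VP8/90000

Note that the a=rtcp port matches the m=video port, which is exactly what a=rtcp-mux implies.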

SRTP:

  • Janus already handles the DTLS exchange, and it seems that it may already decrypt the RTP before sending it on, but it does not look like it accounts for multiplexed RTP/RTCP or multiplexed media.
  • Here is a quick and dirty SRTP decrypter that works when you pass it the master key that is exchanged in DTLS; the sketch below shows the general shape.
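
For what it's worth, the general shape of that decryption with libsrtp (1.x) is roughly the following. This is a sketch, not Janus's code: decrypt_srtp is a hypothetical helper, it assumes srtp_init() was called once at startup and that master_key is the 30-byte master key + salt already derived from the DTLS-SRTP handshake, and in a real implementation you would create the srtp_t once per peer rather than once per packet:

#include <srtp/srtp.h>
#include <string.h>

/* Hypothetical helper: decrypt one SRTP packet in place. */
static int decrypt_srtp(unsigned char *master_key, void *packet, int *len) {
    srtp_policy_t policy;
    srtp_t session;
    int ok;

    memset(&policy, 0, sizeof(policy));
    crypto_policy_set_aes_cm_128_hmac_sha1_80(&policy.rtp);
    crypto_policy_set_aes_cm_128_hmac_sha1_80(&policy.rtcp);
    policy.ssrc.type = ssrc_any_inbound;   /* accept any incoming SSRC */
    policy.key = master_key;               /* 30-byte key + salt from DTLS-SRTP */
    policy.next = NULL;

    if (srtp_create(&session, &policy) != err_status_ok)
        return -1;
    /* srtp_unprotect authenticates and decrypts in place; on success,
     * *len shrinks to the length of the plain RTP packet. */
    ok = (srtp_unprotect(session, packet, len) == err_status_ok);
    srtp_dealloc(session);
    return ok ? 0 : -1;
}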

GStreamer:

  • You may want to look into GstAppSrc, which allows you to push char arrays into a GStreamer pipeline for decoding; you can then push the result to another UDP port to grab it with OpenCV.
  • Here is some example code from a websocket server I wrote that will grab raw media and push it to a pipeline. This example is not exactly what you want to do (it does not grab the RTP but instead raw media frames from the webpage), but it will show you how to use AppSrc; a condensed sketch follows.
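
Condensed to its core, the AppSrc idea looks something like this (a sketch: the caps values are assumptions that must match your SDP, and error handling is omitted):

#include <gst/gst.h>
#include <gst/app/gstappsrc.h>

/* Build a decode pipeline fed by an appsrc; "src" is just a name chosen here. */
GError *error = NULL;
GstElement *pipeline = gst_parse_launch(
    "appsrc name=src is-live=true format=time "
    "caps=application/x-rtp,media=video,encoding-name=VP8,payload=96,clock-rate=90000 "
    "! rtpvp8depay ! vp8dec ! videoconvert ! autovideosink", &error);
GstElement *appsrc = gst_bin_get_by_name(GST_BIN(pipeline), "src");
gst_element_set_state(pipeline, GST_STATE_PLAYING);

/* Then, for every RTP packet you receive (buf, len): */
GstBuffer *b = gst_buffer_new_allocate(NULL, len, NULL);
gst_buffer_fill(b, 0, buf, len);
gst_app_src_push_buffer(GST_APP_SRC(appsrc), b);  /* pipeline takes ownership of b */
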
Benjamin Trent
  • (Thank you for your explanations.) I have been lurking around Janus's code to try doing what you suggested, but I can't find much. 1. How do I know if RTP and RTCP are multiplexed? I tried analysing SDP messages, but they are not easy to understand. 2. How do I *decrypt* the SRTP packets to raw RTP, and how do I get access to the key exchange? In Janus's code I could find .key and .pem certificate files -> is the private key what you're talking about? 3. From the SDP messages I got **a=rtpmap:120 VP8/90000**; what do I do from here? Thanks! – nschoe Jun 03 '14 at 13:42
  • I have been reading the code example you provided on GitHub; I saw that it works with WebSockets (so TCP). What kind of performance can I expect in terms of latency/real time? – nschoe Jun 03 '14 at 14:15
  • Thanks for the edits. I spent some time reading docs (and trying alternative solutions). There are some things I still can't figure out: do I need an SRTP decrypter (since, apparently, Janus handles DTLS decryption)? I can't find any GStreamer RTP depayloader for VP8; can you point me toward one? I'll keep reading your source code to get some insights, but I don't know (yet) how I can use the concept. One last question: what's the relation between GstAppSrc and GStreamer? Again, many many thanks for the help! – nschoe Jun 12 '14 at 10:51
  • After investigation, I saw the line a=rtcp-mux (but not the a=rtcp:PORT IN ... line), so I guess I do multiplex RTP and RTCP. I followed your link, but I can't find any explanation of how to remove it, only the sentence: "If your system does not support rtcp-mux, just don't include it in the INVITE or 200 OK." But how do we do that? Thanks in advance! – nschoe Jun 12 '14 at 12:47
  • You can modify your SDP line by line when it is created, before setting it as your local description and sending it as your invite. So you just need to delete that line in the SDP. – Benjamin Trent Jun 12 '14 at 14:08
  • Thanks for all your advice! I got something working. As information for future users: Janus handles RTP/RTCP demuxing and the DTLS handshake. The packets we get in incoming_rtp_packet contain the RTP headers and the VP8 payload (which itself contains a VP8 header). I used GStreamer's RTP depacketization and VP8 decoder to get hold of the packets, and autovideosink displays the stream. Great! (Thanks to Janus's creator, Lorenzo, for his precious help too.) Now the only thing left is to get a cv::Mat from that stream (maybe appsink?). Anyway, thank you! – nschoe Jun 18 '14 at 08:03
  • @nschoe I'm working on a similar problem and happy to hear that you managed to get it working. Any chance for more info on how you made use of GStreamer to depacketize and decode the RTP packets? Still trying to figure that part out. Thanks! – logidelic Jul 20 '16 at 14:34
  • Hi @logidelic, thanks :-) Well, it starts to date back; I would have to go back to the code to find out, and I'm pretty under fire right now. Is there a specific step you don't understand? – nschoe Jul 20 '16 at 16:04
  • @nschoe Finally I was able to get this working (thanks again to all here). I've posted my solution as a separate answer in case the code is useful to anyone. – logidelic Jul 29 '16 at 14:25

I ended up getting this working using Janus and GStreamer (1.9) by following the suggestions of others in this thread including @nschoe (the OP) and @Benjamin Trent. I figured that I would include my code to make life easier for the next person who comes along since so much trial-and-error was involved for me:

First build/install GStreamer with all its needed plugins (for my setup I needed to ensure that two plugin directories were in the GST_PLUGIN_SYSTEM_PATH environment variable). Now initialize GStreamer when your Janus plugin initializes (init() callback):

gst_init(NULL, NULL);

For each WebRTC session, you'll need to keep some GStreamer handles, so add the following to your Janus plugin session struct:

GstElement *pipeline, *appsrc, *multifilesink;

When a Janus plugin session is created (create_session() callback), setup the GStreamer pipeline for that session (in my case I needed to lower the frame rate, hence the videorate/capsrate; you may not need these):

GstElement *conv, *vp8depay, *vp8dec, *videorate, *capsrate, *pngenc;

session->pipeline = gst_pipeline_new("pipeline");

session->appsrc         = gst_element_factory_make("appsrc", "source");
vp8depay                = gst_element_factory_make("rtpvp8depay", NULL);
vp8dec                  = gst_element_factory_make("vp8dec", NULL);
videorate               = gst_element_factory_make("videorate", NULL);
capsrate                = gst_element_factory_make("capsfilter", NULL);
conv                    = gst_element_factory_make("videoconvert", "conv");
pngenc                  = gst_element_factory_make("pngenc", NULL);
session->multifilesink  = gst_element_factory_make("multifilesink", NULL);

GstCaps* capsRate = gst_caps_new_simple("video/x-raw", "framerate", GST_TYPE_FRACTION, 15, 1, NULL);
g_object_set(capsrate, "caps", capsRate, NULL);
gst_caps_unref(capsRate);

GstCaps* caps = gst_caps_new_simple ("application/x-rtp",
                 "media", G_TYPE_STRING, "video",
                 "encoding-name", G_TYPE_STRING, "VP8-DRAFT-IETF-01",
                 "payload", G_TYPE_INT, 96,
                 "clock-rate", G_TYPE_INT, 90000,
                 NULL);
g_object_set(G_OBJECT (session->appsrc), "caps", caps, NULL);
gst_caps_unref(caps);

gst_bin_add_many(GST_BIN(session->pipeline), session->appsrc, vp8depay, vp8dec, conv, videorate, capsrate, pngenc, session->multifilesink, NULL);
gst_element_link_many(session->appsrc, vp8depay, vp8dec, conv, videorate, capsrate, pngenc, session->multifilesink, NULL);

// Setup appsrc
g_object_set(G_OBJECT (session->appsrc), "stream-type", 0, NULL);
g_object_set(G_OBJECT (session->appsrc), "format", GST_FORMAT_TIME, NULL);
g_object_set(G_OBJECT (session->appsrc), "is-live", TRUE, NULL);
g_object_set(G_OBJECT (session->appsrc), "do-timestamp", TRUE, NULL);

g_object_set(session->multifilesink, "location", "/blah/some/dir/output-%d.png", NULL);
gst_element_set_state(session->pipeline, GST_STATE_PLAYING);

When an incoming RTP packet gets demultiplexed by Janus and is ready to read (incoming_rtp() callback), feed it into the GStreamer pipeline:

if(video && session->video_active) {
    // Send to GStreamer
    // Copy the packet: the GstBuffer below takes ownership and frees it with
    // g_free(), so allocate with g_malloc() rather than malloc()
    guchar* temp = (guchar*)g_malloc(len);
    memcpy(temp, buf, len);

    GstBuffer*  buffer = gst_buffer_new_wrapped_full(0, temp, len, 0, len, temp, g_free);
    gst_app_src_push_buffer(GST_APP_SRC(session->appsrc), buffer);
}

Finally, when the Janus plugin session is over (destroy_session() callback), be sure to free up the GStreamer resources:

if(session->pipeline) {
    gst_element_set_state(session->pipeline, GST_STATE_NULL);
    gst_object_unref(session->pipeline);
    session->pipeline = NULL;
}
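
If you would rather have the decoded frames in memory (for example, to wrap them in a cv::Mat) than as PNG files on disk, one option is to end the pipeline in an appsink instead of pngenc/multifilesink. A rough sketch, where session->appsink is a hypothetical field replacing multifilesink, the BGR caps are an assumption, and gst/app/gstappsink.h is needed:

// Pipeline tail becomes: ... ! videoconvert ! video/x-raw,format=BGR ! appsink
GstSample *sample = gst_app_sink_pull_sample(GST_APP_SINK(session->appsink)); // blocks until a frame arrives
if (sample) {
    GstBuffer *frame = gst_sample_get_buffer(sample);
    GstMapInfo map;
    if (gst_buffer_map(frame, &map, GST_MAP_READ)) {
        // map.data / map.size now hold one raw BGR frame; on the C++ side this is
        // exactly what cv::Mat(height, width, CV_8UC3, map.data) expects.
        gst_buffer_unmap(frame, &map);
    }
    gst_sample_unref(sample);
}
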
logidelic
  • Can we see your finished project somewhere? I have exactly the same use case and would like to take a look at your code. – Fuzzyma Jun 08 '17 at 14:22

We have the same concern regarding streaming with WebRTC. What I did was send the video frames to a WebSocket server, and from there I decode the image buffer using imdecode().
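
(For the decoding step: imdecode() is the C++ API; in the C API you would use in a Janus-style plugin, the counterpart is cvDecodeImage(). A sketch, assuming buf/len hold one JPEG-encoded frame received over the websocket:)

#include <opencv2/highgui/highgui_c.h>

/* Wrap the received bytes in a 1xN matrix and decode them into an image. */
CvMat encoded = cvMat(1, len, CV_8UC1, (void*)buf);
IplImage *img = cvDecodeImage(&encoded, CV_LOAD_IMAGE_COLOR);
/* ... process img, then cvReleaseImage(&img); */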

I have a live demo here (twistedcv) and also host the source on GitHub (twistedcv), but the streaming is not in real time.

  • The link seems dead. By the way, sending frames over WebSocket means a TCP connection, right? Is it possible to do 30 fps with TCP? – nschoe Jun 03 '14 at 13:46
  • I checked on the link and it seems the websocket server was down. I restarted the server; it should be working now. Yes, it's TCP. And no, it doesn't manage that; that's why it's not streaming in real time. – Greatxam Darthart Jun 06 '14 at 02:27
  • Well, I absolutely need real time, so I can't use this solution. Thanks anyway! – nschoe Jun 12 '14 at 10:52