How to set pts and dts of AVPacket from RTP timestamps while muxing VP8 RTP stream to webm using ffmpeg libavformat?

Question

I am using the ffmpeg libavformat library to write a video-only webm file. I receive a VP8-encoded RTP stream on my server. I have successfully grouped the RTP byte stream (from the RTP payload) into individual frames and constructed an AVPacket for each. I am NOT re-encoding the payload to VP8 here, as it is already VP8-encoded.

I am writing the AVPacket to the file using av_interleaved_write_frame(). Although I get a webm file as output, it does not play at all. When I inspected the file with the mkvinfo command from MKVToolNix, I found the following:

+ EBML head
|+ EBML version: 1
|+ EBML read version: 1
|+ EBML maximum ID length: 4
|+ EBML maximum size length: 8
|+ Doc type: webm
|+ Doc type version: 2
|+ Doc type read version: 2
+ Segment, size 2142500
|+ Seek head (subentries will be skipped)
|+ EbmlVoid (size: 170)
|+ Segment information
| + Timestamp scale: 1000000
| + Multiplexing application: Lavf58.0.100
| + Writing application: Lavf58.0.100
| + Duration: 78918744.480s (21921:52:24.480)
|+ Segment tracks
| + A track
|  + Track number: 1 (track ID for mkvmerge & mkvextract: 0)
|  + Track UID: 1
|  + Lacing flag: 0
|  + Name: Video Track
|  + Language: eng
|  + Codec ID: V_VP8
|  + Track type: video
|  + Default duration: 1.000ms (1000.000 frames/fields per second for a video track)
|  + Video track
|   + Pixel width: 640
|   + Pixel height: 480
|+ Tags
| + Tag
|  + Targets
|  + Simple
|   + Name: ENCODER
|   + String: Lavf58.0.100
| + Tag
|  + Targets
|   + TrackUID: 1
|  + Simple
|   + Name: DURATION
|   + String: 21921:52:24.4800000
|+ Cluster

As we can see, the reported duration is far too high (my actual stream should last around 8-10 seconds). The frame rate in the track info is also not what I set: I am setting 25 fps, but the file reports 1000.

I am applying av_rescale_q(rtpTimeStamp, codec_timebase, stream_timebase) and setting the rescaled rtpTimeStamp as both pts and dts. My guess is that my way of setting pts and dts is wrong. How should I set pts and dts on the AVPacket to get a working webm file with correct metadata?
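
One way to sanity-check the numbers above: the file reports times in milliseconds (Timestamp scale 1000000 ns), so rescaling from a 1/25 time base to 1/1000 multiplies every input tick by 40. If the raw 32-bit RTP timestamp is fed in as if it counted 1/25 s frames, a value near 1,972,968,612 would yield exactly the 78918744.480 s shown by mkvinfo. The sketch below reproduces that arithmetic in plain C. `rescale_ticks` is a hypothetical, simplified stand-in for `av_rescale_q` (round to nearest, no overflow protection), and the RTP value is inferred from the reported duration, not taken from a capture.

```c
#include <assert.h>
#include <stdint.h>

/* Rescale a non-negative tick count from time base src_num/src_den to
 * dst_num/dst_den, rounding to nearest. Simplified stand-in for
 * av_rescale_q(); it does not guard against 64-bit overflow. */
static int64_t rescale_ticks(int64_t ticks,
                             int64_t src_num, int64_t src_den,
                             int64_t dst_num, int64_t dst_den)
{
    assert(ticks >= 0 && src_num > 0 && src_den > 0);
    assert(dst_num > 0 && dst_den > 0);

    int64_t num = ticks * src_num * dst_den; /* ticks * src / dst, numerator   */
    int64_t den = src_den * dst_num;         /* ticks * src / dst, denominator */
    return (num + den / 2) / den;
}
```

Calling `rescale_ticks(1972968612, 1, 25, 1, 1000)` gives 78918744480 ms, matching the reported duration. Treating the same tick count as 90 kHz ticks (the clock rate RFC 7741 specifies for VP8 over RTP) gives about 21,921,873 ms, which is still hours long because RTP timestamps start at an arbitrary offset; only the differences between timestamps are meaningful.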

EDIT :

The following is the code I call to init the library :

 #define STREAM_FRAME_RATE 25
 #define STREAM_PIX_FMT AV_PIX_FMT_YUV420P 

 typedef struct OutputStream {
   AVStream *st;
   AVCodecContext *enc;
   AVFrame *frame;
 } OutputStream;


 typedef struct WebMWriter {
      OutputStream *audioStream, *videoStream;
      AVFormatContext *ctx;
      AVOutputFormat *outfmt;
      AVCodec *audioCodec, *videoCodec;
 } WebMWriter;

 static OutputStream audioStream = { 0 }, videoStream = { 0 };

 WebMWriter *init(char *filename)
 {
    av_register_all();

    AVFormatContext *ctx = NULL;
    AVCodec *audioCodec = NULL, *videoCodec = NULL;
    const char *fmt_name = NULL;
    const char *file_name = filename;

    int alloc_status = avformat_alloc_output_context2(&ctx, NULL, fmt_name, file_name);

    if(!ctx)
            return NULL;

    AVOutputFormat *fmt = (*ctx).oformat;

    AVDictionary *video_opt = NULL;
    av_dict_set(&video_opt, "language", "eng", 0);
    av_dict_set(&video_opt, "title", "Video Track", 0);

    if(fmt->video_codec != AV_CODEC_ID_NONE)
    {
            addStream(&videoStream, ctx, &videoCodec, AV_CODEC_ID_VP8, video_opt);
    }

    if(videoStream.st)
            openVideo1(&videoStream, videoCodec, NULL);

    av_dump_format(ctx, 0, file_name, 1);

    int ret = -1;
    /* open the output file, if needed */
    if (!(fmt->flags & AVFMT_NOFILE)) {
            ret = avio_open(&ctx->pb, file_name, AVIO_FLAG_WRITE);
            if (ret < 0) {
                    printf("Could not open '%s': %s\n", file_name, av_err2str(ret));
                    return NULL;
            }
    }

    /* Write the stream header, if any. */
    AVDictionary *format_opt = NULL;
    ret = avformat_write_header(ctx, &format_opt);
    if (ret < 0) {
            fprintf(stderr, "Error occurred when opening output file: %s\n",
                            av_err2str(ret));
            return NULL;
    }


    WebMWriter *webmWriter = malloc(sizeof(struct WebMWriter));
    webmWriter->ctx = ctx;
    webmWriter->outfmt = fmt;
    webmWriter->audioStream = &audioStream;
    webmWriter->videoStream = &videoStream;
    webmWriter->videoCodec = videoCodec;

    return webmWriter;
 }

The following is the openVideo1() method :

 void openVideo1(OutputStream *out_st, AVCodec *codec, AVDictionary *opt_arg)
 {       
    AVCodecContext *codec_ctx = out_st->enc;
    int ret = -1;
    AVDictionary *opt = NULL;
    if(opt_arg != NULL)
    {       
            av_dict_copy(&opt, opt_arg, 0);
            ret = avcodec_open2(codec_ctx, codec, &opt);
    }
    else
    {       
            ret = avcodec_open2(codec_ctx, codec, NULL);
    }

    /* copy the stream parameters to the muxer */
    ret = avcodec_parameters_from_context(out_st->st->codecpar, codec_ctx);
    if (ret < 0) {
            printf("Could not copy the stream parameters\n");
            exit(1);
    }

 }

The following is the addStream() method :

 void addStream(OutputStream *out_st, AVFormatContext *ctx, AVCodec **cdc, enum AVCodecID codecId, AVDictionary *opt_arg)
 {

    (*cdc) = avcodec_find_encoder(codecId);
    if(!(*cdc)) {
            exit(1);
    }

    /*as we are passing a NULL AVCodec cdc, So AVCodecContext codec_ctx will not be allocated, we have to do it explicitly */
    AVStream *st = avformat_new_stream(ctx, *cdc);
    if(!st) {
            exit(1);
    }

    out_st->st = st;
    st->id = ctx->nb_streams-1;

    AVDictionary *opt = NULL;
    av_dict_copy(&opt, opt_arg, 0);
    st->metadata = opt;

    AVCodecContext *codec_ctx = st->codec;
    if (!codec_ctx) {
            fprintf(stderr, "Could not alloc an encoding context\n");
            exit(1);
    }
    out_st->enc = codec_ctx;

    codec_ctx->strict_std_compliance = FF_COMPLIANCE_EXPERIMENTAL;

    switch ((*cdc)->type) {
            case AVMEDIA_TYPE_AUDIO:
                    codec_ctx->codec_id = codecId;
                    codec_ctx->sample_fmt  = AV_SAMPLE_FMT_FLTP;
                    codec_ctx->bit_rate    = 64000;
                    codec_ctx->sample_rate = 48000;
                    codec_ctx->channels    = 2;//1;
                    codec_ctx->channel_layout = AV_CH_LAYOUT_STEREO; 
                    codec_ctx->codec_type = AVMEDIA_TYPE_AUDIO;
                    codec_ctx->time_base = (AVRational){1,STREAM_FRAME_RATE};


                    break;

            case AVMEDIA_TYPE_VIDEO:
                    codec_ctx->codec_id = codecId;
                    codec_ctx->bit_rate = 90000;
                    codec_ctx->width    = 640;
                    codec_ctx->height   = 480;


                    codec_ctx->time_base = (AVRational){1,STREAM_FRAME_RATE};
                    codec_ctx->gop_size = 12;
                    codec_ctx->pix_fmt = STREAM_PIX_FMT;
                    codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;

                    break;

            default:
                    break;
    }

    /* Some formats want stream headers to be separate. */
    if (ctx->oformat->flags & AVFMT_GLOBALHEADER)
            codec_ctx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
 }

The following is the code I call to write a frame of data to the file :

 int writeVideoStream(AVFormatContext *ctx, AVStream *st, uint8_t *data, int size, long frameTimeStamp, int isKeyFrame, AVCodecContext *codec_ctx)
 {       
    AVRational rat = st->time_base;
    AVPacket pkt = {0};
    av_init_packet(&pkt);

    void *opaque = NULL;
    int flags = AV_BUFFER_FLAG_READONLY;
    AVBufferRef *bufferRef = av_buffer_create(data, size, NULL, opaque, flags);

    pkt.buf = bufferRef;
    pkt.data = data;
    pkt.size = size;  
    pkt.stream_index  = st->index;

    pkt.pts = pkt.dts = frameTimeStamp;
    pkt.pts = av_rescale_q(pkt.pts, codec_ctx->time_base, st->time_base);
    pkt.dts = av_rescale_q(pkt.dts, codec_ctx->time_base, st->time_base);


    if(isKeyFrame == 1)
            pkt.flags |= AV_PKT_FLAG_KEY;

    int ret = av_interleaved_write_frame(ctx, &pkt);
    return ret;
 }

NOTE : Here 'frameTimeStamp' is the RTP timestamp carried on the RTP packets of that frame.
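
A minimal sketch of one way to turn those RTP timestamps into millisecond pts values, assuming the VP8 payload uses the 90 kHz clock from RFC 7741 and that time should be counted from the first frame received. `RtpClock` and `rtp_to_pts_ms` are hypothetical names, not part of libavformat. The signed 32-bit delta is there so that wraparound of the RTP counter does not produce a jump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper state: tracks elapsed RTP ticks since the first frame. */
typedef struct RtpClock {
    uint32_t clock_rate; /* RTP clock rate in Hz; assumed 90000 for VP8 */
    uint32_t last_ts;    /* RTP timestamp of the previous frame         */
    int64_t  elapsed;    /* ticks since the first frame, 64-bit         */
    int      started;    /* 0 until the first frame has been seen       */
} RtpClock;

/* Returns the frame's presentation time in milliseconds, relative to the
 * first frame passed in. */
static int64_t rtp_to_pts_ms(RtpClock *c, uint32_t rtp_ts)
{
    assert(c != NULL && c->clock_rate > 0);

    if (!c->started) {
        c->started = 1;
        c->elapsed = 0;
    } else {
        /* Unsigned subtraction wraps modulo 2^32; reading the result as a
         * signed 32-bit value (two's complement assumed) yields the true
         * delta across a counter wraparound. */
        c->elapsed += (int32_t)(uint32_t)(rtp_ts - c->last_ts);
    }
    c->last_ts = rtp_ts;

    /* ticks -> milliseconds, rounded to nearest */
    return (c->elapsed * 1000 + c->clock_rate / 2) / c->clock_rate;
}
```

With something like this, the value handed to the muxer could be `av_rescale_q(ms, (AVRational){1, 1000}, st->time_base)` for both pts and dts (VP8 has no B-frames, so the two can be equal), instead of rescaling the raw RTP value from `codec_ctx->time_base`.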

EDIT 2.0 :

My updated addStream() method with codecpars changes :

 void addStream(OutputStream *out_st, AVFormatContext *ctx, AVCodec **cdc, enum AVCodecID codecId, AVDictionary *opt_arg)
 {

    (*cdc) = avcodec_find_encoder(codecId);
    if(!(*cdc)) {
            printf("@@@@@ couldnt find codec \n");
            exit(1);
    }

    AVStream *st = avformat_new_stream(ctx, *cdc);
    if(!st) {
            printf("@@@@@ couldnt init stream\n");
            exit(1);
    }

    out_st->st = st;
    st->id = ctx->nb_streams-1;
    AVCodecParameters *codecpars = st->codecpar;
    codecpars->codec_id = codecId;
    codecpars->codec_type = (*cdc)->type;

    AVDictionary *opt = NULL;
    av_dict_copy(&opt, opt_arg, 0);
    st->metadata = opt;
    //av_dict_free(&opt);

    AVCodecContext *codec_ctx = st->codec;
    if (!codec_ctx) {
            fprintf(stderr, "Could not alloc an encoding context\n");
            exit(1);
    }
    out_st->enc = codec_ctx;

    //since opus is experimental codec
    //codec_ctx->strict_std_compliance = FF_COMPLIANCE_EXPERIMENTAL;

    switch ((*cdc)->type) {
            case AVMEDIA_TYPE_AUDIO:
                    codec_ctx->codec_id = codecId;
                    codec_ctx->sample_fmt  = AV_SAMPLE_FMT_FLTP;//AV_SAMPLE_FMT_U8 or AV_SAMPLE_FMT_S16;
                    codec_ctx->bit_rate    = 64000;
                    codec_ctx->sample_rate = 48000;
                    codec_ctx->channels    = 2;//1;
                    codec_ctx->channel_layout = AV_CH_LAYOUT_STEREO; //AV_CH_LAYOUT_MONO;
                    codec_ctx->codec_type = AVMEDIA_TYPE_AUDIO;
                    codec_ctx->time_base = (AVRational){1,STREAM_FRAME_RATE};

                    codecpars->format = codec_ctx->sample_fmt;
                    codecpars->channels = codec_ctx->channels;
                    codecpars->sample_rate = codec_ctx->sample_rate;

                    break;

            case AVMEDIA_TYPE_VIDEO:
                    codec_ctx->codec_id = codecId;
                    codec_ctx->bit_rate = 90000;
                    codec_ctx->width    = 640;
                    codec_ctx->height   = 480;

                    codec_ctx->time_base = (AVRational){1,STREAM_FRAME_RATE};
                    codec_ctx->gop_size = 12;
                    codec_ctx->pix_fmt = STREAM_PIX_FMT;
                    //codec_ctx->max_b_frames = 1;
                    codec_ctx->codec_type = AVMEDIA_TYPE_VIDEO;
                    codec_ctx->framerate = av_inv_q(codec_ctx->time_base);
                    st->avg_frame_rate = codec_ctx->framerate;//(AVRational){25000, 1000};

                    codecpars->format = codec_ctx->pix_fmt;
                    codecpars->width = codec_ctx->width;
                    codecpars->height = codec_ctx->height;
                    codecpars->sample_aspect_ratio = (AVRational){codec_ctx->width, codec_ctx->height};

                    break;

            default:
                    break;
    }      
    codecpars->bit_rate = codec_ctx->bit_rate;

    int ret = avcodec_parameters_to_context(codec_ctx, codecpars);
    if (ret < 0) {
            printf("Could not copy the stream parameters\n");
            exit(1);
    }

    /* Some formats want stream headers to be separate. */
    if (ctx->oformat->flags & AVFMT_GLOBALHEADER)
            codec_ctx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
 }

Please add the code where you calculate and set the pts values before calling `av_interleaved_write_frame`, so we can see better what's wrong. – the kamilz, Jan 25 '18 at 15:11

score 1 · Answer 1 · answered Jan 29 '18 at 07:49

I think you are right that calculating pts/dts is the problem. Use this formula to calculate the timestamps manually and see if it works; then you can do it with av_rescale_q.

Here is my tested formula (for raw (yuv) output):

int64_t frameTime;
int64_t frameDuration;

frameDuration = video_st->time_base.den / video_fps; // video_fps is e.g. 25
frameTime     = frame_count * frameDuration;
pkt->pts      = frameTime / video_st->time_base.num;
pkt->duration = frameDuration;

pkt->dts          = pkt->pts;
pkt->stream_index = video_st->index;

Use this before av_interleaved_write_frame.
Note: frame_count here is a counter that increases after every video frame output (with av_interleaved_write_frame).
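
The formula above can be wrapped in a small function to check the numbers it produces. This is only a sketch: `FrameTiming` and `frame_timing` are hypothetical names, and the stream time base is passed in as plain integers (in this thread it turns out to be 1/1000) rather than read from `video_st`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the values the formula assigns to the packet. */
typedef struct FrameTiming {
    int64_t pts;
    int64_t dts;
    int64_t duration;
} FrameTiming;

/* frame_count: number of frames already written (0 for the first frame).
 * tb_num/tb_den: stream time base, e.g. 1/1000.
 * fps: constant frame rate, e.g. 25. */
static FrameTiming frame_timing(int64_t frame_count,
                                int tb_num, int tb_den, int fps)
{
    assert(frame_count >= 0 && tb_num > 0 && tb_den > 0 && fps > 0);

    FrameTiming t;
    t.duration = tb_den / fps;                       /* integer division  */
    t.pts      = frame_count * t.duration / tb_num;  /* as in the formula */
    t.dts      = t.pts;                              /* no frame reordering */
    return t;
}
```

For a 1/1000 time base at 25 fps this gives a 40-tick duration, so frame 250 lands at pts 10000 (10 s). Because the division is integer, a rate such as 30 fps truncates to 33 ticks per frame and drifts by about 10 ms per second; a frame counter also assumes no frames are lost, which may not hold for an RTP source.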

the kamilz

Thanks for the quick reply. I have used your formula. It did change the duration in the segment info, and the duration now comes out correctly. But I cannot see any video frames when I play the webm file in VLC. I mean the video is progressing, but no frame is visible. Also, the default duration in the 'Track Info' is still wrong. It is still 1000 frames per second, which I have never set. – user2595786 Jan 29 '18 at 09:57
And when I used av_rescale_q(), the resulting duration is longer than it should be. For example, if the actual duration is 10 secs, the duration obtained by setting pts with av_rescale_q() is around 6 mins 30 secs. – user2595786 Jan 29 '18 at 09:58
Set the frame rate of the stream manually with `video_st->avg_frame_rate.num = 25000;` and `video_st->avg_frame_rate.den = 1000`. Put these in the `AVMEDIA_TYPE_VIDEO` case of your `addStream()`. Note this is for 25 fps video; change accordingly if yours is different. And it's better to use `AVCodecParameters` to set codec parameters like frame width and height. – the kamilz Jan 29 '18 at 10:06
Oh and print out the parameters of `video_st->time_base`. `Den` and `num`, to see if these are correct. – the kamilz Jan 29 '18 at 10:12
I added avg_frame_rate to video_st as you mentioned. After that, the 'Default duration' in 'Track Info' did change to the proper value for 25 fps. But when I printed video_st->time_base's den and num, the values are num = 1, den = 1000 (suggesting it is still using 1000 fps for the calculations). I don't know why this happens. – user2595786 Jan 29 '18 at 10:23
And, still I can't see any video frame in the vlc when I play it. – user2595786 Jan 29 '18 at 10:23
try to change these lines `codec_ctx->time_base = (AVRational){1,STREAM_FRAME_RATE};` to this `codec_ctx->time_base = (AVRational){STREAM_FRAME_RATE, 1};` – the kamilz Jan 29 '18 at 10:30
I made the codec_ctx->time_base change. video_st->time_base is still num = 1, den = 1000. But why is this change needed? My way of setting the codec time_base seems to be correct. And why am I not able to see any video when I play it? By the way, mine is not raw video; it is an already VP8-encoded RTP stream. – user2595786 Jan 29 '18 at 10:46
perhaps you should use `AVCodecParameters` to set codec parameters. – the kamilz Jan 29 '18 at 11:38
And did you check this: https://stackoverflow.com/questions/46571544/incorrect-fps-when-muxing-vp9-encoded-data-into-webm – the kamilz Jan 29 '18 at 11:49
I just checked it. I am not able to see the video when I play it in the player, though the fps and duration values are correct. – user2595786 Jan 29 '18 at 12:51
"perhaps you should use AVCodecParameters to set codec parameters." By this do you mean setting the codec context params using the avcodec_parameters_to_context() method from libavcodec's utils.c? – user2595786 Jan 29 '18 at 12:53
kind of, change `avcodec_parameters_from_context` with `AVCodecParameters *par = out_st->st->codecpar;` and `avcodec_parameters_to_context(codec_ctx, out_st->st->codecpar)` – the kamilz Jan 29 '18 at 14:06
I made the change you suggested, setting codecpars using AVCodecParameters and calling avcodec_parameters_to_context(). I have added my new addStream() method to the post. But I am still not able to see the video in the player. All the meta info on the webm file is correct, but the video is not visible. – user2595786 Jan 30 '18 at 07:19
Can you upload a sample video (a few minutes may be enough) somewhere, so I can download and analyze it? – the kamilz Jan 30 '18 at 09:28
I have uploaded the sample video in my google drive. Here is the link : https://drive.google.com/open?id=17XTgpb6PM27q8oGKa8iHtXbjYcRC8jZ5 – user2595786 Jan 30 '18 at 09:35
Yes, the metadata seems correct, but something is not right, especially in the hexdump. Is your project compilable with gcc + makefile? If so, and if you like, you can put the project somewhere and let me test it on my Ubuntu server. This may speed up the fixing process. – the kamilz Jan 30 '18 at 10:52
Sure, but it will take a little time to make some changes. I will update you once I am done with them. Shall we take the discussion private? Can I have your mail id or something so that we can continue there? – user2595786 Jan 30 '18 at 11:07
Sure, skype: sbelet@mynet.com. I am at work so I may not answer quickly. – the kamilz Jan 30 '18 at 11:12
Bro, I am not able to send you a mail on the above id. Plz give me your mail id, so that i can send the mail. – user2595786 Jan 30 '18 at 14:49
I was expecting to communicate via Skype, anyway my e-mail is: sayitbelet@gmail.com – the kamilz Jan 31 '18 at 07:03
