
I am using ffmpeg as a library (libavcodec, libavformat) in my C++ application to put a raw h264 stream (from a camera) into an mp4 container. Essentially I am trying to mimic the behavior of the following call to the ffmpeg executable: `ffmpeg -framerate 30 -i camera.h264 -c copy -f mp4 -movflags frag_keyframe+empty_moov -`

The code below more or less works, and libavformat/libavcodec print no warnings or errors to stderr. It appears to create an acceptable mp4 file, yet all players fail to show it: VLC displays the first frame or so, slightly distorted, and then stops; other players complain that the file has an invalid format. The file sizes produced by ffmpeg and by my code differ slightly.

I create the input stream like this (every call is checked in the real code but for simplicity I removed the error handling):

        _io_buffer_size = getpagesize();

        _io_buffer = std::shared_ptr<unsigned char>((unsigned char *) av_malloc(_io_buffer_size), [](unsigned char *ptr) { av_free(ptr); });

        _io_context = std::shared_ptr<AVIOContext>(
                avio_alloc_context(_io_buffer.get(), _io_buffer_size, 0, this, H264Stream::on_read_buffer, nullptr, nullptr),
                [](AVIOContext *ctx) { av_free(ctx); }
        );

        _format_context = std::shared_ptr<AVFormatContext>(
                avformat_alloc_context(),
                [](AVFormatContext *ctx) { avformat_free_context(ctx); }
        );

        const auto h264_input_format = av_find_input_format("h264");
        _format_context->pb = _io_context.get();

        std::stringstream fps_stream;
        std::stringstream size_stream;
        fps_stream << fps;
        size_stream << width << "x" << height;

        AVDictionary *input_options = nullptr;
        av_dict_set(&input_options, "framerate", fps_stream.str().c_str(), 0);
        av_dict_set(&input_options, "r", fps_stream.str().c_str(), 0);
        av_dict_set(&input_options, "s", size_stream.str().c_str(), 0);

        auto formatPtr = _format_context.get();
        auto res = avformat_open_input(&formatPtr, "(memory file)", h264_input_format, &input_options);

        AVCodec* decoder = nullptr;
        res = av_find_best_stream(formatPtr, AVMEDIA_TYPE_VIDEO, 0, -1, &decoder, 0);

And the output format context is created like this:

        _io_buffer_size = getpagesize();

        _io_buffer = std::shared_ptr<unsigned char>((unsigned char *) av_malloc(_io_buffer_size), [](unsigned char *ptr) { av_free(ptr); });

        _io_context = std::shared_ptr<AVIOContext>(
                avio_alloc_context(_io_buffer.get(), _io_buffer_size, 1, this, nullptr, H264Conversion::on_write_data, nullptr),
                [](AVIOContext *ctx) { av_free(ctx); }
        );

        auto output_format = av_guess_format(output_extension.c_str(), nullptr, nullptr);

        AVFormatContext* format_context = nullptr;
        const auto rc = avformat_alloc_output_context2(&format_context, output_format, nullptr, nullptr);

        const auto codec = avcodec_find_encoder(AV_CODEC_ID_H264);
        log->info("Output format: {}", codec->name);
        auto video_stream = avformat_new_stream(format_context, codec);
        video_stream->codecpar->codec_type = AVMEDIA_TYPE_VIDEO;
        video_stream->codecpar->width = _stream->width();
        video_stream->codecpar->height = _stream->height();
        video_stream->codecpar->codec_id = AV_CODEC_ID_H264;

        AVStream** streams = new AVStream*[1];
        streams[0] = video_stream;
        format_context->nb_streams = 1;
        format_context->streams = streams;

        _format_context = std::shared_ptr<AVFormatContext>(format_context, [](AVFormatContext* ctx) { avformat_free_context(ctx); });

        _format_context->pb = _io_context.get();

        _converter_thread = std::thread{[=]() { process_conversion(); }};

The method process_conversion essentially performs a read packet -> write packet loop that looks like this:

        AVDictionary* dict = nullptr;
        av_dict_set(&dict, "movflags", "frag_keyframe+empty_moov", 0);
        av_dict_set(&dict, "r", "30", 0);
        av_dict_set(&dict, "framerate", "30", 0);

        auto rc = avformat_write_header(_format_context.get(), &dict);

        av_dict_free(&dict);

        auto did_complete_regularly = false;

        std::shared_ptr<AVPacket> packet = std::shared_ptr<AVPacket>(av_packet_alloc(), [](AVPacket *packet) { av_packet_free(&packet); });
        while(_is_running) {
            if (!_stream->read_next_packet(*packet.get())) {
                did_complete_regularly = true;
                break;
            }

            av_write_frame(_format_context.get(), packet.get());
            av_packet_unref(packet.get());
        }

        if(did_complete_regularly) {
            av_write_trailer(_format_context.get());
        }
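One thing this loop never sets is timing: a raw `.h264` elementary stream carries no timestamps, so unless the demuxer synthesized usable `pts`/`dts` from the `framerate` option, each packet would need to be stamped and rescaled into the output stream's time base before writing (libavformat provides `av_packet_rescale_ts()` for the conversion). A sketch of the underlying arithmetic, with hypothetical stand-in types rather than the libavutil ones:

```cpp
#include <cstdint>

// Minimal stand-in for AVRational / av_rescale_q: convert a tick count
// from one time base to another. Unlike libavutil's av_rescale_q() this
// does no overflow protection or rounding control.
struct Rational { int64_t num, den; };

static int64_t rescale_ts(int64_t ts, Rational from, Rational to) {
    return ts * from.num * to.den / (from.den * to.num);
}
// At 30 fps, frame n stamped in a 1/30 time base becomes n * 3000 ticks
// in the 1/90000 time base commonly used for MP4 video tracks.
```

With timestamps in place, `av_interleaved_write_frame()` is also the more usual choice than `av_write_frame()` when muxing.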
Yanick Salzmann
    Does the output have a correct avcc box in the header i.e. `moov->trak->mdia->minf->stbl->stsd`? – Gyan Oct 27 '19 at 09:44
  • I am not sure what would be correct, but it says: `[stsd] size=12+98 entry-count = 1 [avc1] size=8+86 data_reference_index = 1 width = 1920 height = 1080 compressor = [avcC] size=8+0` – Yanick Salzmann Oct 27 '19 at 09:53
  • 1
    So, avcc box is empty. Many players will expect to read codec configuration in an MP4 from moov and not the bitstream. So, it should not be empty. The H264 SPS+PPS should be the contents of `avcc` – Gyan Oct 27 '19 at 12:09
  • Can you upload the mp4 somewhere? Then it would be easier to inspect it with `ffplay` and verbose logging, MP4Box, or other tools. Also, the raw `h264` file most likely has no timestamps, and since you are passing the packets through without doing anything with the `pts` values, they could be wrong. Can you run `ffprobe` on the mp4 with `-show_frames` to see the PTS values of the frames? Maybe all the frames are played at once and then the file ends. – Rudolfs Bundulis Oct 30 '19 at 19:47

0 Answers