ffmpeg convert AV_SAMPLE_FMT_S16 to AV_SAMPLE_FMT_FLTP in c++

Question

I want to encode and decode the sound from my Android app to the Opus format using FFmpeg 4.2.2.

The problem is that my Android app provides raw PCM sound in AV_SAMPLE_FMT_S16 format, but the FFmpeg Opus encoder accepts only AV_SAMPLE_FMT_FLTP. So I decided to convert the sound with FFmpeg's swr_convert() function, but it crashes with a SIGSEGV error and I can't understand why.

My code looks like this:

swrContext = swr_alloc();

av_opt_set_int(swrContext, "in_channel_layout", (int64_t) codecContext->channel_layouts, 0);
av_opt_set_int(swrContext, "out_channel_layout", (int64_t) codecContext->channel_layouts,  0);
av_opt_set_int(swrContext, "in_sample_rate", 8000, 0);
av_opt_set_int(swrContext, "out_sample_rate", 48000, 0);

av_opt_set_sample_fmt(swrContext, "in_sample_fmt", AV_SAMPLE_FMT_S16, 0);
av_opt_set_sample_fmt(swrContext, "out_sample_fmt", AV_SAMPLE_FMT_FLTP,  0);

swr_init(swrContext);

memcpy(frame->data[0], data, dataSize); 

uint8_t *outBuffer = (uint8_t *) malloc(sizeof(uint8_t) * frame->nb_samples);

swr_convert(swrContext, &outBuffer, frame->nb_samples, (const uint8_t **)frame->data, frame->nb_samples);

I am new to C++ so sorry for some mistakes if I made them.

You are only allocating 1 byte to outBuffer. And swr_convert is writing much more than that. — szatmary, Apr 07 '20 at 13:35
thank you, but how I can allocate enough bytes? I don't have much knowledge in c++, but I assume I need to allocate count of bytes equal to frame->nb_samples, can you tell me how to do that? — easy_breezy, Apr 07 '20 at 13:50
@szatmary I updated my answer, but still get SIGSEGV (address access protected) error, I also tried to allocate much more bytes for outBuffer e.g. 1000 and 100000, but result the same, what I do wrong? — easy_breezy, Apr 08 '20 at 03:27
@easy_breezy: Is `frameEncode->data` an array of pointers? You're filling the first index in this call: `memcpy(frameEncode->data[0], data, dataSize);` but in `swr_convert` you're passing as `(const uint8_t **)frame->data`. I believe it should be `&frameEncode->data[0]`. Also, there are two identifiers `frameEncode` and `frame` that looks like multiple typos. Can you verify and update those? Take a look at this [thread](https://stackoverflow.com/questions/39587839/libswresample-swr-convert-not-producing-enough-samples). — Azeem, Apr 10 '20 at 14:30
@Azeem, thanks for your advice, I'll try it tomorrow, also I corrected and updated my answer — easy_breezy, Apr 10 '20 at 16:23
@easy_breezy: No problem! Check this [code snippet](https://godbolt.org/z/c4sKXG). You need to fill in the channels info and test it. One thing which is not clear here is the memory of `frame->data` in call `memcpy(frame->data[0], data, dataSize);`. Its memory should be enough for the contents to be copied from `data`. You need to make sure of that. — Azeem, Apr 11 '20 at 06:54
@Azeem I tried the code from the snippet and it works! Now I succeeded in resampling audio, but the sound after decoding has a very poor quality it's some kind of very illegible robo voice and I don't understand why — easy_breezy, Apr 11 '20 at 21:09
@easy_breezy: Awesome! That's progress! You can tinker with `out_sample_rate` to test it with different rates. I'd suggest converting from the command line first using `ffmpeg` command, evaluate the quality of the output file and then fill in the parameters that you think would be good. See this [thread](https://askubuntu.com/a/239356/784747) and [this](https://stackoverflow.com/questions/4854513/can-ffmpeg-convert-audio-to-raw-pcm-if-so-how) one too. Good luck! — Azeem, Apr 12 '20 at 04:33

Azeem · Accepted Answer · 2020-04-12T13:49:17.493

Here are a few things that you need to take care of:

Make sure that frame->data[0] points to enough memory (at least dataSize bytes) to hold the data copied in this call:

memcpy( frame->data[0], data, dataSize );

Also, you need to set frame->nb_samples accordingly. You may already have done so, but nothing in the code you posted shows it.
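For packed S16 input, the value for frame->nb_samples follows from the byte count. The sketch below shows the arithmetic, assuming dataSize is a size in bytes and the channel count is known; samples_per_channel_s16 is a hypothetical helper written for illustration, not an FFmpeg function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of FFmpeg): number of samples per channel
// held in a packed AV_SAMPLE_FMT_S16 buffer of `data_size` bytes.
// Each S16 sample occupies sizeof(int16_t) == 2 bytes, and packed audio
// stores one sample for every channel at each time step.
inline int samples_per_channel_s16(std::size_t data_size, int channels) {
    assert(channels > 0);
    const std::size_t bytes_per_step =
        sizeof(std::int16_t) * static_cast<std::size_t>(channels);
    // Integer division: a trailing partial time step is ignored.
    return static_cast<int>(data_size / bytes_per_step);
}
```

For example, a 320-byte mono buffer holds 160 samples, which is 20 ms of audio at 8000 Hz. In FFmpeg code, av_get_bytes_per_sample(AV_SAMPLE_FMT_S16) gives the same 2 bytes per sample.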

You also need to allocate the output sample buffer with av_samples_alloc, and free it, along with all other allocated memory, after use so that nothing leaks.
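To see why a malloc of frame->nb_samples bytes is too small, it helps to work out how much room float planar output needs. This is a rough sketch with hypothetical helper names; it ignores any alignment padding that av_samples_alloc may add and only shows the minimum size.

```cpp
#include <cassert>
#include <cstddef>

static_assert(sizeof(float) == 4, "AV_SAMPLE_FMT_FLTP uses 32-bit floats");

// Hypothetical helper: minimum bytes for ONE channel plane of planar float audio.
inline std::size_t fltp_plane_bytes(int nb_samples) {
    assert(nb_samples >= 0);
    return static_cast<std::size_t>(nb_samples) * sizeof(float);
}

// Hypothetical helper: minimum bytes for all planes together.
// Planar formats keep a separate plane (and a separate pointer) per channel.
inline std::size_t fltp_total_bytes(int nb_samples, int channels) {
    assert(channels > 0);
    return fltp_plane_bytes(nb_samples) * static_cast<std::size_t>(channels);
}
```

With 160 input samples resampled from 8000 Hz to 48000 Hz, the output is about 960 samples per channel, so a single plane needs 3840 bytes, far more than the 160 bytes that malloc(sizeof(uint8_t) * frame->nb_samples) provides.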

Here's an example (add the value for out_num_channels):

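const int in_sample_rate = 8000;
const int out_sample_rate = 48000;
const int out_num_channels = 1; // placeholder: replace with your actual channel count

swrContext = swr_alloc();
// Assumes codecContext is an AVCodecContext*: use its channel_layout field.
// channel_layouts (plural) is the AVCodec list of supported layouts, i.e. a pointer,
// and casting a pointer to int64_t does not give a valid layout.
av_opt_set_int(swrContext, "in_channel_layout", (int64_t) codecContext->channel_layout, 0);
av_opt_set_int(swrContext, "out_channel_layout", (int64_t) codecContext->channel_layout, 0);
av_opt_set_int(swrContext, "in_sample_rate", in_sample_rate, 0);
av_opt_set_int(swrContext, "out_sample_rate", out_sample_rate, 0);
av_opt_set_sample_fmt(swrContext, "in_sample_fmt", AV_SAMPLE_FMT_S16, 0);
av_opt_set_sample_fmt(swrContext, "out_sample_fmt", AV_SAMPLE_FMT_FLTP, 0);
if (swr_init(swrContext) < 0) {
    // handle the error: the context cannot be used for conversion
}

// frame->data[0] must already point to at least dataSize bytes, and
// frame->nb_samples must hold the number of samples per channel in data
memcpy(frame->data[0], data, dataSize);

// Upper bound on the number of output samples per channel (not const: reassigned below)
int out_num_samples = (int) av_rescale_rnd(swr_get_delay(swrContext, in_sample_rate) + frame->nb_samples, out_sample_rate, in_sample_rate, AV_ROUND_UP);

// FLTP is planar, so there is one pointer per channel (up to AV_NUM_DATA_POINTERS here)
uint8_t* out_samples[AV_NUM_DATA_POINTERS] = { NULL };
if (av_samples_alloc(out_samples, NULL, out_num_channels, out_num_samples, AV_SAMPLE_FMT_FLTP, 0) < 0) {
    // handle the allocation failure
}

// Returns the number of samples per channel actually written, or a negative error code
out_num_samples = swr_convert(swrContext, out_samples, out_num_samples, (const uint8_t **) frame->data, frame->nb_samples);

av_freep(&out_samples[0]); // free after use: one block holds all planes
swr_free(&swrContext);     // free after use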
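The av_rescale_rnd(..., AV_ROUND_UP) call above computes an upper bound on the output sample count: the buffered delay plus the new input samples, scaled by the rate ratio and rounded up. The arithmetic can be sketched on its own as follows. Both function names are hypothetical stand-ins, and unlike the FFmpeg function this version does not guard against overflow of the multiplication.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for av_rescale_rnd(a, b, c, AV_ROUND_UP):
// ceil(a * b / c) for non-negative a, b and positive c.
// No overflow protection, so keep the inputs small.
inline std::int64_t rescale_round_up(std::int64_t a, std::int64_t b, std::int64_t c) {
    assert(a >= 0 && b >= 0 && c > 0);
    return (a * b + c - 1) / c;
}

// Upper bound on output samples per channel for one conversion call.
// `delay` is the number of input-rate samples still buffered in the resampler
// (what swr_get_delay(ctx, in_rate) reports in the code above).
inline std::int64_t max_output_samples(std::int64_t delay, std::int64_t in_samples,
                                       int in_rate, int out_rate) {
    return rescale_round_up(delay + in_samples, out_rate, in_rate);
}
```

With no buffered delay, 160 samples at 8000 Hz give at most 960 samples at 48000 Hz. The value returned by swr_convert can be smaller than this bound, which is why it is written back into out_num_samples.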

You might want to tinker with out_sample_rate according to your requirements. I'd suggest first converting your file on the command line with the ffmpeg command, then using the parameters that worked there in your code. You'd need fewer code iterations, and the command line gives you more flexibility to experiment. See [this](https://askubuntu.com/a/239356/784747) and [this](https://stackoverflow.com/questions/4854513/can-ffmpeg-convert-audio-to-raw-pcm-if-so-how) thread on using the ffmpeg command-line utility.
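When checking the output, it can also help to compare against a hand-written format conversion that involves no rate change. The sketch below converts packed S16 to planar float in plain C++. It is a hypothetical reference function, not an FFmpeg API, and the 1/32768 scale factor is the conventional mapping, assumed here rather than taken from the libswresample source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference conversion (not an FFmpeg API):
// packed signed 16-bit samples -> planar 32-bit float, same sample rate.
// Packed layout:  L0 R0 L1 R1 ...
// Planar layout:  [L0 L1 ...] [R0 R1 ...]
inline std::vector<std::vector<float>> s16_packed_to_float_planar(
        const std::int16_t* in, std::size_t samples_per_channel, int channels) {
    assert(in != nullptr && channels > 0);
    const std::size_t num_planes = static_cast<std::size_t>(channels);
    std::vector<std::vector<float>> planes(
        num_planes, std::vector<float>(samples_per_channel));
    for (std::size_t i = 0; i < samples_per_channel; ++i) {
        for (std::size_t ch = 0; ch < num_planes; ++ch) {
            // Scale by 1/32768 so the int16 range maps onto [-1.0, 1.0).
            planes[ch][i] = in[i * num_planes + ch] / 32768.0f;
        }
    }
    return planes;
}
```

Each inner vector corresponds to one plane, in the same way that each pointer passed to swr_convert addresses one channel of FLTP output.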

Hope this helps!
