
I have 2 or more audio frames structured as:

    struct AudioFrame {
        int sample_rate;    // The sample rate of this buffer (normally 48000 or 44100)
        int no_channels;    // The number of audio channels
        int no_samples;     // The number of audio samples per channel (so p_data holds no_samples * no_channels elements)
        float* p_data;      // The audio data
    };

Adding two audio buffers together (frameInput1 and frameInput2) is quite simple:

    frameOutput.sample_rate = 48000;
    frameOutput.no_channels = 1;
    frameOutput.no_samples = 1000;
    frameOutput.p_data = (float*)malloc(frameOutput.no_samples * frameOutput.no_channels * sizeof(float));

    for (int i = 0; i < frameOutput.no_samples; i++) {
        frameOutput.p_data[i] = frameInput1.p_data[i] + frameInput2.p_data[i];
    }

I create an output buffer with the same number of samples, and I add the two input frames sample by sample into the data array.

But what if I have audio buffers with a different no_samples or a different sample_rate?

for example:

    input1.sample_rate = 48000; input1.no_samples = 1000;
    input2.sample_rate = 44100; input2.no_samples = 600;

How do I add these two inputs?

2 Answers


Just scale the index into each buffer with respect to sample_rate:

float in1_rate_scale = float(frameInput1.sample_rate) / frameOutput.sample_rate;
float in2_rate_scale = float(frameInput2.sample_rate) / frameOutput.sample_rate;

// Truncated (nearest-sample) lookup; assumes each input buffer is long enough
// to cover the scaled indices -- see the comments below about mismatched sizes
for (int i = 0; i < frameOutput.no_samples; i++) {
    frameOutput.p_data[i] = frameInput1.p_data[int(i * in1_rate_scale)]
                          + frameInput2.p_data[int(i * in2_rate_scale)];
}

Anyway, remember that just adding "volume" values is wrong; you will easily get an overflow when the loudness in both buffers is at maximum. But this is another question and another problem you have ahead.
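For illustration, a minimal sketch of one common quick fix: hard-clipping the mixed sum to a nominal [-1, 1] float range (the 1.0 limit is an assumption about your samples' nominal range; proper gain staging is a larger topic):

    for (int i = 0; i < frameOutput.no_samples; i++) {
        float mixed = frameInput1.p_data[i] + frameInput2.p_data[i];
        // Hard-clip to [-1, 1]; scaling each input by 0.5f beforehand is another option
        if (mixed >  1.0f) mixed =  1.0f;
        if (mixed < -1.0f) mixed = -1.0f;
        frameOutput.p_data[i] = mixed;
    }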

Damir Tenishev
  • Overflow is less of a problem if the values are stored in floating-point format, as floating-point values don't overflow unless/until they get extremely/unreasonably large. – Jeremy Friesner Sep 04 '21 at 23:47
  • That's particularly true, thanks. Anyway, pure addition of audio signals is not correct in most cases. Again, this is outside the scope of this question. – Damir Tenishev Sep 05 '21 at 00:13
  • that's correct, but the problem is that not only the sample rate is different, but also the number of samples, so the memory allocated for the different frame buffers has a different size – cerutti davide Sep 06 '21 at 09:48
  • @ceruttidavide, well, my answer is based on the input available in your question and it should be enough to solve the particular question you asked. If you are looking for a more detailed answer, it would help if you provide a [Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) so that people can help. – Damir Tenishev Sep 06 '21 at 19:13

I'm assuming your program is processing two audio streams, and each stream is providing you with a series of audio buffers.

If so, then the number of frames of audio in each buffer isn't a fundamental characteristic of the audio, rather it is just a side effect of how the audio samples were packaged together (e.g. the producer of stream A decided to put 1000 samples together into a single buffer, while the producer of stream B decided to put just 600 samples together).

Ideally you could tell both of your stream-producers to give you audio buffers with a fixed (and equal) number of frames in them, so that you could just add the samples together verbatim. If you can't get them to do that, then you'll need to implement some kind of buffering mechanism, where you hold the "extra" frames from the larger of the two buffers in some kind of FIFO queue and then use them as the first samples in your next mixing operation.

That can get a little bit complicated, so unless performance is your primary concern, I suggest just keeping a FIFO queue of audio frames (e.g. a std::deque<float> or similar) for each input-stream, always pushing all of the newly-received audio frames from that input-stream to the tail of its FIFO queue, and then popping frames from the head of each FIFO queue as necessary when you need to mix audio together (see the sketch below). That way you decouple the mixing of the audio from the size of the input audio buffers, so your mixing code will work regardless of what the input streams produce for you. (Note that the maximum size of the output/mixed audio buffer that you can produce will be equal to the number of audio frames in your shortest FIFO queue at that time.)
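For example, here's a minimal sketch of that approach, assuming the question's AudioFrame struct and single-channel audio (the function names are mine, and this glosses over multi-channel interleaving details):

    #include <algorithm>
    #include <deque>
    #include <vector>

    std::deque<float> fifo1, fifo2;   // one FIFO queue per input stream

    // Append every sample of a newly received frame to that stream's queue
    void push_frame(std::deque<float>& fifo, const AudioFrame& f) {
        fifo.insert(fifo.end(), f.p_data, f.p_data + f.no_samples * f.no_channels);
    }

    // Mix as many samples as both queues can currently supply
    std::vector<float> mix_available() {
        const size_t n = std::min(fifo1.size(), fifo2.size());
        std::vector<float> out(n);
        for (size_t i = 0; i < n; i++) {
            out[i] = fifo1.front() + fifo2.front();
            fifo1.pop_front();
            fifo2.pop_front();
        }
        return out;
    }

Whatever samples one stream delivers ahead of the other simply stay in its queue until the next call, which is exactly the "extra frames" bookkeeping described above.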

Handling different sample rates is a more difficult problem to solve, especially if you want your output audio to have decent sound quality. To handle it properly, you'll need to use a sample-rate-conversion library (such as libsamplerate) to convert one stream's sample rate to be equal to the sample rate of the other one (or, if you prefer, to convert both streams' sample rates to be equal to the sample rate of your output stream). Once you've done that, you can add the two matched-rate streams together sample-by-sample, as before.
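For example, a sketch using libsamplerate's one-shot src_simple() API to convert a frame to the output rate before mixing (resample_to is a name I made up here, and error handling is omitted):

    #include <samplerate.h>   // libsamplerate
    #include <vector>

    std::vector<float> resample_to(const AudioFrame& in, int out_rate) {
        const double ratio = double(out_rate) / in.sample_rate;
        // Worst-case output size: frames scaled by the ratio, times channel count
        std::vector<float> out((size_t(in.no_samples * ratio) + 1) * in.no_channels);

        SRC_DATA data = {};
        data.data_in       = in.p_data;
        data.input_frames  = in.no_samples;                   // frames per channel
        data.data_out      = out.data();
        data.output_frames = long(out.size() / in.no_channels);
        data.src_ratio     = ratio;

        src_simple(&data, SRC_SINC_FASTEST, in.no_channels);  // one-shot conversion
        out.resize(size_t(data.output_frames_gen) * in.no_channels);
        return out;
    }

After resampling input2 from 44100 Hz to 48000 Hz this way, you'd mix it with input1 sample-by-sample as in the question's first code block.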

Jeremy Friesner
  • so the problem is that I'm mixing audio in real time, and performance is absolutely the key of the project; also the audio streams that I will receive are not under my control, they are generated by other programs and I'm collecting them via ASIO networking – cerutti davide Sep 06 '21 at 09:45