I'm using the PulseAudio API to get the current microphone input in "realtime". The buffer data is delivered as a byte array of 16-bit little-endian samples. What I'd like to do is find the maximum peak level in the buffer and convert it into a decibel value. To do that I have to combine each pair of bytes into one integer value. In the same loop I also look for the maximum value. After that I convert the maximum value into a decibel value. Here is the C code:

static ssize_t loop_write(int fd, const uint8_t *data, size_t size) 
{
int newsize = size / 2;
uint16_t max_value = 0;
int i = 0;

for (i = 0; i < size; i += 2)
{
    // put two bytes into one integer
    uint16_t val = data[i] + ((uint32_t)data[i+1] << 8);

    // find max value
    if(val > max_value)
       max_value = val;
}

// convert to decibel
float decibel = max_value / pow(2, 15);

if(decibel != 0)
    decibel = 20 * log(decibel);

// print result
printf("%f, ", decibel);

return size;
}

To my knowledge the amplitude value should be between 0 and 32768 for PA_SAMPLE_S16LE. But I am getting values between 0 and 65536 before the decibel conversion. Is there anything wrong with my conversion?

For the sake of completeness I am also posting my pulseaudio setup:

int main(int argc, char*argv[]) 
{
char *device = "alsa_input.usb-041e_30d3_121023000184-00-U0x41e0x30d3.analog-mono";

// The sample type to use
static const pa_sample_spec ss = {
    .format = PA_SAMPLE_S16LE,
    .rate = 44100,
    .channels = 1
};
pa_simple *s = NULL;
int ret = 1;
int error;

// Create the recording stream 
if (!(s = pa_simple_new(NULL, argv[0], PA_STREAM_RECORD, device, "record", &ss, NULL, NULL, &error))) {
    fprintf(stderr, __FILE__": pa_simple_new() failed: %s\n", pa_strerror(error));
    goto finish;
}

for (;;) {
    uint8_t buf[BUFSIZE];

    // Record some data ...
    if (pa_simple_read(s, buf, sizeof(buf), &error) < 0) {
        fprintf(stderr, __FILE__": pa_simple_read() failed: %s\n", pa_strerror(error));
        goto finish;
    }

    // And write it to STDOUT
    if (loop_write(STDOUT_FILENO, buf, sizeof(buf)) != sizeof(buf)) {
        fprintf(stderr, __FILE__": write() failed: %s\n", strerror(errno));
        goto finish;
    }
}

ret = 0;

finish:

if (s)
    pa_simple_free(s);

return 0;
}
BenMorel
Dominik Schreiber

2 Answers

7

What I'd like to do is to find out the maximum peak level in the buffer and transform it into a decibel value.

From a physical point of view this approach doesn't make sense. While it is possible to specify single sample values in relation to the full dynamic range, you're probably more interested in the sound level, i.e. the power of the signal. A single peak, even at full scale, carries very little energy; it may cause a very loud popping noise due to harmonic distortion and limited bandwidth, but technically its power density is spread out over the whole band-limited spectrum.

What you really should do is determine the RMS (root mean square) value, i.e.

RMS = sqrt( sum( square(samples) )/n_samples )

EDIT: Note that the above is only correct for signals without a DC part. Most analog sound interfaces are AC coupled, so this is not a problem. But if there's a DC part as well, you must first subtract the mean value from the samples, i.e.

RMS_DC_reject = sqrt( sum( square(samples - mean_sample) )/n_samples )

I'll leave it as an exercise for the reader to add this to the code below.

This gives you the power of the samples processed, which is what you actually want. You asked about decibels. Now I have to ask you: dB(what)? You need a reference value, since bels (or decibels) are a relative (i.e. comparative) measure. For a digital signal, full scale would be 0 dB(FS) and the zero line would be -20 log10( 2^B ), where B is the sampling bit depth. For a 16-bit signal that is about -96 dB(FS).
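As a quick check of the 16-bit figure quoted above (a worked calculation, taking B = 16):

$$20\log_{10}\left(2^{16}\right) = 16 \cdot 20\log_{10} 2 \approx 16 \cdot 6.0206 \approx 96.33\ \text{dB}$$

so the zero line sits at roughly -96.33 dB(FS) under this convention.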

If we're talking about a signal on the line, a common reference is a power of 1 mW; in that case the scale is dB(m). For audio line level it has been defined that full scale equals 1 mW of signal power, which is what 1 V RMS dissipates over a 1 kOhm resistor (there you have the RMS again).
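The 1 mW figure for 1 V RMS across 1 kOhm follows directly from the power relation for a resistive load:

$$P = \frac{V_{\mathrm{RMS}}^2}{R} = \frac{(1\,\mathrm{V})^2}{1000\,\Omega} = 1\,\mathrm{mW}$$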

Now since our full scale is immediately determined by the input circuitry, which is defined in terms of dB(m), you can later display dB(FS) as dB(m) (or dBm) just fine.

When it comes to the actual sound level, well, this depends on your input amplifier gain, and the conversion efficiency of the microphone used.


To my knowledge the amplitude value should be between 0 and 32768 for PA_SAMPLE_S16LE. But I am getting values between 0 and 65536 before the decibel conversion. Is there anything wrong with my conversion?

You asked about a signed integer format, but you're casting the values into an unsigned int. And since dB_FS is relative to full scale, don't divide by the number of bits. For a zero signal at 16 bits the outcome should be about -96 dB. The division makes no sense anyway, as it merely scales your RMS into the range [0; 1], but log(0) diverges to -infinity; hence your if statement. But remember, this is physics, and physics is continuous: there should be no if statement here.
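To illustrate the signed-versus-unsigned point, here is a small sketch of decoding one PA_SAMPLE_S16LE sample from two bytes. The helper name `decode_s16le` is hypothetical; the explicit range adjustment avoids relying on implementation-defined narrowing conversions.

```c
#include <stdint.h>

/* Decode one signed 16-bit little-endian sample (low byte first).
 * Hypothetical helper name, not a PulseAudio function. */
static int16_t decode_s16le(const uint8_t *p)
{
    /* Assemble the raw 16-bit pattern as a non-negative value 0..65535. */
    int32_t v = p[0] | ((int32_t)p[1] << 8);

    /* Patterns with the top bit set represent negative numbers
     * in two's complement, so shift them down by 2^16. */
    if (v >= 0x8000)
        v -= 0x10000;

    return (int16_t)v;   /* now guaranteed to lie in -32768..32767 */
}
```

With this decoding the byte pair `0xFF 0xFF` reads as -1 rather than 65535, which is why the unsigned version in the question reports values up to 65535.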

You should write it like this

// even for signed values this should be 2^N
// we're going to deal with signed later
double const MAX_SIGNAL = 1 << SAMPLE_BITS;

// using double here, because float offers only 25 bits of
// distortion free dynamic range.
double accum = 0;
int const n_samples = size/2;
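for (size_t i = 0; i + 1 < size; i += 2)
{
    // put two bytes into one __signed__ integer:
    // assemble the unsigned bit pattern first, then reinterpret it
    int16_t val = (int16_t)(uint16_t)(data[i] | (data[i+1] << 8));

    // square in double to keep the accumulation out of integer range
    accum += (double)val * val;
}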
accum /= n_samples;

// Since we're using signed values we need to
// double the accumulation; of course this could be
// contracted into the statement above
accum *= 2.;

float const dB_FS = -20 * log10( MAX_SIGNAL - sqrt(accum) );
BenMorel
datenwolf
  • Thanks for your comprehensive answer/explanation datenwolf! I think dB FS is what I'm looking for. RMS measures the perceived volume for humans, right? But I'm explicitly looking for clipping because down the line I'd like to recognize when the signal coming into the preamp is too "hot" and, if necessary, turn down the amplification of the preamp using alsamixer to prevent further clipping. – Dominik Schreiber Feb 28 '13 at 12:55
  • If you want to look for clipping, then don't do a dB conversion, just check if samples hit the limits of the value range. However since your ultimate goal is automatic volume adjustment, take a page out of experienced audio engineers' book (by which I mean everything written before 1980): You should adjust your gain in a way, that the RMS of the signal is about -20dB(FS) to -25dB(FS), this gives you ample headroom. The problem with mere clip detection is, that you simply don't know, how much power the signal actually carries, so you'd have to scale down the gain, "clip by clip"; not good. – datenwolf Feb 28 '13 at 13:43
  • @DominikSchreiber: Why everything before 1980? Because since the 1980ies, things became digital and the Loudness Wars began, when audio engineers started to use peak normalization, i.e. scale their digital signal so that the strongest peak would be full scale. The very first CD-DA mastering station by Sony still featured a fine grained level meter around -20 dB(FS) to which the RMS of the signal was to be adjusted. Later versions didn't. – datenwolf Feb 28 '13 at 13:45
  • OK, going to take that advice. I think I am running into another problem. While recording, every dB_FS value is -96.329597. According to your description this equals no signal. But this is probably too PulseAudio-specific. Going to open another thread for that. Thanks again for your kind help! – Dominik Schreiber Feb 28 '13 at 13:49
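The clip check suggested in the comments above (testing whether samples hit the limits of the value range instead of converting to dB) could be sketched like this. The function name `count_clipped` is hypothetical, and treating both `INT16_MAX` and `INT16_MIN` as clipped is an assumption about how the input stage saturates.

```c
#include <stddef.h>
#include <stdint.h>

/* Count samples sitting exactly on either end of the 16-bit range.
 * Hypothetical helper; assumes the samples are already decoded. */
static size_t count_clipped(const int16_t *samples, size_t n)
{
    size_t clipped = 0;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] == INT16_MAX || samples[i] == INT16_MIN)
            clipped++;
    }
    return clipped;
}
```

A non-zero count only says that clipping occurred, not by how much the gain should drop, which is the limitation of pure clip detection pointed out in the comments.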
0

According to the PulseAudio Simple API:

Using the connection is very similar to the normal read() and write() system calls. The main difference is that they're called pa_simple_read() and pa_simple_write(). Note that these operations always block.

This seems to imply that the return values are very similar, as there seems to be no other mention of pa_simple_read's return value in any sensible places. Here's what opengroup's read() manual says:

Upon successful completion, read() ... shall return a non-negative integer indicating the number of bytes actually read.

Supposing pa_simple_read returns a value less than sizeof buffer, your loop_write function would be using uninitialised values. That's undefined behaviour. I suggest storing the return value of pa_simple_read, and passing it to loop_write in place of sizeof(buf) after you've checked for errors.

Supposing the value passed to pa_simple_read is an odd number, your loop_write would be using an uninitialised value in the last iteration. Perhaps, to counter this, you could change your loop to: for (i = 1; i < size; i += 2) and your val declaration/initialisation to: uint16_t val = data[i-1] + ((uint32_t)data[i] << 8);
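An alternative way to guard against an odd byte count, sketched here under the assumption that a trailing unpaired byte should simply be ignored, is to require that both bytes of a sample lie inside the buffer. The function name `peak_abs_s16le` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest absolute sample value in a PA_SAMPLE_S16LE byte buffer.
 * Only complete byte pairs are read; a trailing odd byte is skipped.
 * Hypothetical helper name. */
static int peak_abs_s16le(const uint8_t *data, size_t size)
{
    int peak = 0;
    for (size_t i = 0; i + 1 < size; i += 2) {
        /* Decode little-endian, then map 0x8000..0xFFFF to negatives. */
        int v = data[i] | (data[i + 1] << 8);
        if (v >= 0x8000)
            v -= 0x10000;

        int mag = v < 0 ? -v : v;   /* at most 32768, fits in int */
        if (mag > peak)
            peak = mag;
    }
    return peak;
}
```

The loop condition `i + 1 < size` never touches `data[size]`, so the function is safe for any `size`, including 0 and 1.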

I'd like to express thanks to mtrw for helping me come to this conclusion.

autistic
  • -1 - OP says the samples are PA_SAMPLE_S16LE - signed 16-bit little-endian integers. Seems well documented, in my opinion. – mtrw Feb 28 '13 at 12:18
  • What can I do to improve this answer? – autistic Feb 28 '13 at 12:18
  • @mtrw Which part of this answer are you referring to? Please show me the part of the documentation that specifies exactly how much pa_simple_read will assign into buffer. – autistic Feb 28 '13 at 12:20
  • The initialization code says to use PA_SAMPLE_S16LE, so I don't see how pa_simple_read can return an odd number of bytes. – mtrw Feb 28 '13 at 12:25
  • @mtrw: Suppose you pass in an array of SIZE_MAX bytes, will pa_simple_read assign SIZE_MAX bytes to that array? If not, then how do you determine how many bytes it actually assigned? – autistic Feb 28 '13 at 12:26
  • @mtrw: Suppose `sizeof buffer` is 1. Will pa_simple_read assign 0 bytes in order to keep the number even? Suppose `sizeof buffer` is some other odd number. Will pa_simple_read only assign an even number of bytes to the array? Where is this documented? – autistic Feb 28 '13 at 12:28
  • It's a user error to specify an odd-sized buffer if you've asked for 16-bit samples, isn't it? Granted, this may not be documented (although I'm a pretty casual user of PA so maybe I just haven't come across it), but it seems reasonably clear from context. – mtrw Feb 28 '13 at 12:35
  • @mtrw That's an assumption. Programmers who make assumptions rather than relying upon facts as documented are more likely to write buggy code. Please prove to me that pa_simple_read is indeed assigning `sizeof buffer` bytes into buffer. If you can't definitively state that I'm not helping, then how can you justify a down vote? – autistic Feb 28 '13 at 12:39