
I need to detect silence in a PCM audio stream delivered through IMediaSample. The signal comes from a TV, which is connected to the PC by optical cable through a Prodigy 7.1 HiFi sound card. So far I have this:

bool detectSound(IMediaSample *pSamples)
{
    BYTE *pData;
    pSamples->GetPointer(&pData);
    long size = pSamples->GetActualDataLength();

    long nulls = 0;
    for(long i = 0; i < size; ++i) {
        if(pData[i] == 0)
            ++nulls;
    }

    /* 0.9 to eliminate interference */
    long max_nulls = (long) (0.9 * size);
    if(nulls > max_nulls) {                 /* STOP */
        /* no audio */
        return false;
    }
    else {
        /* audio available */
        return true;
    }
}

The problem is that if I put a breakpoint at the line marked "STOP", nulls nearly always has the same value and is smaller than max_nulls, whether I mute the TV or not. I also noticed that the pData[i] values are always 0 or 255 (strange, or not?).

I probably don't understand what exactly this "data" is and how to interpret it. All I'm sure of is that if there is no audio, then all sampled values from the waveform should be close to 0.

Could you verify my way of thinking? Thanks in advance.

eclipse

edit:

The problem is somewhere in the drivers and AC3 Filter settings: the "SPDIF Test" reports that 44.1 kHz, 48 kHz and 32 kHz are not supported by DirectSound. Roman's idea is right and will work once I fix this.

eclipse

1 Answer


The better way is to find out what PCM data actually is; once you know that, the answer to the posted question becomes trivial.

The quicker way is:

  • treat those audio data bytes as SHORT values (you did not mention it, but I suppose your audio is 16-bit)
  • it would be better to split the data into channels and process each one separately
  • calculate the standard deviation
  • silence is when the calculated value is under a certain small threshold
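The steps above can be sketched roughly as follows. This is only an illustration under assumptions: it takes the buffer to be interleaved 16-bit little-endian PCM, the names `channelStdDev` and `isSilent` are made up for this example (they are not DirectShow APIs), and the threshold has to be chosen experimentally. Note that a sample of -1 is stored as the bytes 255, 255, so a quiet signal can legitimately contain many 255 bytes, which is why counting zero bytes is unreliable.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: per-channel standard deviation of a buffer that is
// ASSUMED to hold interleaved 16-bit little-endian signed PCM samples.
std::vector<double> channelStdDev(const std::uint8_t *data, std::size_t size,
                                  unsigned channels)
{
    std::vector<double> sum(channels, 0.0), sumSq(channels, 0.0);
    const std::size_t frameBytes = 2u * channels;
    const std::size_t frames = (channels == 0) ? 0 : size / frameBytes;

    for (std::size_t f = 0; f < frames; ++f) {
        for (unsigned c = 0; c < channels; ++c) {
            const std::uint8_t *p = data + f * frameBytes + 2u * c;
            // Low byte first, then high byte; fold values >= 32768 to negative.
            int v = p[0] | (p[1] << 8);
            if (v >= 32768)
                v -= 65536;
            sum[c] += v;
            sumSq[c] += static_cast<double>(v) * v;
        }
    }

    std::vector<double> dev(channels, 0.0);
    if (frames == 0)
        return dev;
    for (unsigned c = 0; c < channels; ++c) {
        const double mean = sum[c] / frames;
        const double var = sumSq[c] / frames - mean * mean;
        dev[c] = std::sqrt(var > 0.0 ? var : 0.0); // clamp rounding error
    }
    return dev;
}

// Hypothetical helper: the buffer counts as silent only if every channel's
// standard deviation is at or below the caller-supplied threshold.
bool isSilent(const std::uint8_t *data, std::size_t size, unsigned channels,
              double threshold)
{
    for (double d : channelStdDev(data, size, channels)) {
        if (d > threshold)
            return false;
    }
    return true;
}
```

With an IMediaSample you would pass the pointer from GetPointer and the length from GetActualDataLength, and take the channel count from the `WAVEFORMATEX` of the connected media type. Using the standard deviation rather than the raw amplitude means a constant DC offset in the signal does not count as sound.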
Roman R.
  • I agree with you, but the calculated value is again constant over time, muted or not (I treat every two bytes as one 16-bit sample). First of all, what does the data buffer returned by IMediaSample contain? Are these 16-bit sample values taken from the waveform? This is the basic question for me right now. – eclipse Jul 31 '12 at 10:56
  • The buffer typically contains this data: [PCM Waveform-Audio Data Format](http://msdn.microsoft.com/en-us/library/windows/desktop/dd797880%28v=vs.85%29.aspx#PCM_Waveform-Audio_Data_Format) – Roman R. Jul 31 '12 at 11:04
  • I did as you said: I split the samples into channels and used a signed short to store the 2 bytes of each channel sample, but the values for real silence are not much different from those with sound (and the sound is pretty loud, about half of the TV's scale). Values range from -80 to +80. And I used the info from the site you gave me. – eclipse Aug 01 '12 at 10:25
  • So what is the audio media type (`WAVEFORMATEX` structure fields)? I assumed 16-bit, but you should rather specify. – Roman R. Aug 01 '12 at 10:27
  • It is 16. Maybe the problem is related to the sound card drivers, because the card doesn't recognize AC-3 sound (I had to switch the TV to PCM). – eclipse Aug 01 '12 at 10:49
  • With 16 bit PCM, -80..80 range is barely audible silence/noise. – Roman R. Aug 01 '12 at 10:50
  • The problem is that I get this range (usually 0) no matter what the real sound is, so the problem must be somewhere else. Thanks anyway for your time and help. – eclipse Aug 01 '12 at 11:22
  • @RomanR. Hi, I have a similar problem. Would you please help me detect and remove silence? These are my questions: [link-1](https://stackoverflow.com/questions/55566814/how-to-convert-audio-byte-to-samples) [link-2](https://stackoverflow.com/questions/55429213/how-to-improve-the-code-to-remove-silent-from-a-recorded-wave-file), and this is the function in which I want to detect silence: [ProcessSamples](https://textuploader.com/15i1e). Please help me, I'm so confused. – j.doe Apr 13 '19 at 04:25