
I have a very strange problem. I am working on a voice command classification application in Python, and I am using the sounddevice library to record the audio signal. If I record 10 seconds of signal, it looks something like this:

Waveform graph of silence and one word in the middle (don't mind the red and yellow lines).

However, if the program also imports tensorflow.keras and I record the same signal, a sudden peak appears at the beginning of the recording and the graph changes to this. I can also hear a short 'beep' at the start when I play the signal back.

Waveform graph after including the 'import tensorflow.keras' statement.

I have replicated this many times, and I am sure that no other library is causing this. Do you have any idea why this could be happening? It's not a big problem in the application and I can work around it, but I'm curious and confused, because I don't see how an import could affect the recording.

The code:

import sounddevice as sd
from digit_spotting_service import *
# tensorflow.keras is imported at the top of this module
from sif_splitting_service import *

SAMPLE_RATE = 22050

if __name__ == "__main__":

    sd.default.channels = 1
    sd.default.samplerate = SAMPLE_RATE

    print("Recording...")
    signal = sd.rec(SAMPLE_RATE * 10)
    sd.wait()
    print("Stopped")

    signal = signal[:, 0]

    sss = SifSplittingService() 
    dss = DigitSpottingService()

    # "sif" stands for silence-isolated-frame

    sifs = sss.split(signal, SAMPLE_RATE)
    sss.visualize()

    for sif in sifs:
        digit = dss.predict(sif, SAMPLE_RATE)
        print("I think that the digit is {}.".format(digit))
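As a rough sketch of the workaround mentioned above, the start of the recording could be measured and then discarded before splitting. The helper names `leading_peak_ratio` and `trim_start` and the 0.25-second window are assumptions chosen for illustration; they are not part of sounddevice or of the services used here, and this only hides the peak without explaining its cause.

```python
def leading_peak_ratio(signal, sample_rate, window_seconds=0.25):
    """Return peak amplitude in the first window divided by the peak afterwards.

    `signal` is any 1-D sequence of samples (a list or a 1-D NumPy array).
    A ratio well above 1 suggests a transient at the start of the recording.
    """
    n = int(round(window_seconds * sample_rate))
    head = max((abs(x) for x in signal[:n]), default=0.0)
    rest = max((abs(x) for x in signal[n:]), default=0.0)
    if rest == 0:
        # No signal after the window: avoid dividing by zero.
        return float("inf") if head > 0 else 0.0
    return head / rest


def trim_start(signal, sample_rate, skip_seconds=0.25):
    """Drop the first `skip_seconds` of a 1-D signal (hypothetical workaround)."""
    skip = int(round(skip_seconds * sample_rate))
    return signal[skip:]
```

In the script above this would be applied right after `signal = signal[:, 0]`, for example `signal = trim_start(signal, SAMPLE_RATE)`. Comparing `leading_peak_ratio` for recordings made with and without the tensorflow.keras import would put a number on the difference seen in the two graphs.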


  • What does the code for recording look like? – Jon Nordby Feb 18 '21 at 09:14
  • Are you importing tensorflow.keras at the top of the module, that is _before_ you have started recording? – Jon Nordby Feb 18 '21 at 09:15
  • I have updated the question to include the relevant code. I'm importing keras indirectly, from another file, but before I start the recording. Even if I import it in the same file in which I make the recording, it behaves the same. – martinkarlik Feb 18 '21 at 18:29
  • Does the problem also occur if dss.predict() inside the loop is not called? – Jon Nordby Feb 19 '21 at 13:46
  • Yes, even if I leave only the import statements and the code for recording and playback (I realized that I copied the code here a bit wrong, I forgot a line where I initialize 'dss', but it doesn't change anything). Eventually I want to integrate this into a real-time audio processing application, so the beginning of the recording doesn't matter much, but it's still strange. – martinkarlik Feb 20 '21 at 09:47
