I tested DeepSpeech with WAV files and it works fine. My problem comes when I try to use an audio stream: it doesn't recognize a single word. The stream is PCM, 48 kHz, stereo, signed 16-bit little-endian. I've tried converting the stream to other formats, sample rates and channel counts, with no success at all. I'm using DeepSpeech with Node.js.
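For context, the model and the helpers used in the snippet below are set up roughly like in the standard DeepSpeech Node.js examples (I'm on the 0.9.x bindings; the model and scorer paths are placeholders, and stream is the incoming 48 kHz stereo PCM stream, which I haven't shown here):

const DeepSpeech = require('deepspeech');
const Sox = require('sox-stream');
const MemoryStream = require('memory-stream');
const { Duplex } = require('stream');

// Placeholder paths to the pre-trained English model and scorer
const englishModel = new DeepSpeech.Model('./deepspeech-0.9.3-models.pbmm');
englishModel.enableExternalScorer('./deepspeech-0.9.3-models.scorer');

// The model expects mono, 16-bit PCM at this rate (16 kHz for the pre-trained model)
const desiredSampleRate = englishModel.sampleRate();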
const modelStream = englishModel.createStream(); // streaming context; not actually used below, stt() is called on the full buffer instead

let chunks = [];
stream.on('data', chunk => {
  chunks.push(chunk);
}).on('close', () => {
  // Wrap the collected raw PCM chunks in a readable stream so they can be piped into sox
  const buffer = Buffer.concat(chunks);
  const bufferStream = new Duplex();
  bufferStream.push(buffer);
  bufferStream.push(null);

  // Convert to 16-bit, mono, signed little-endian raw PCM at the model's sample rate
  const audioStream = new MemoryStream();
  bufferStream.pipe(Sox({
    global: {
      'no-dither': true,
    },
    output: {
      bits: 16,
      rate: desiredSampleRate,
      channels: 1,
      encoding: 'signed-integer',
      endian: 'little',
      compression: 0.0,
      type: 'raw'
    }
  })).pipe(audioStream);

  audioStream.on('finish', () => {
    const audioBuffer = audioStream.toBuffer();

    // Duration in seconds: 2 bytes per 16-bit sample
    const audioLength = (audioBuffer.length / 2) * (1 / desiredSampleRate);
    console.log('audio length', audioLength);

    const result = englishModel.stt(audioBuffer);
    console.log('result:', result);
  });
});
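For reference, this is how I understand the modelStream created at the top is meant to be used with the 0.9.x streaming API (it would replace the englishModel.stt(audioBuffer) call inside the finish handler above):

// Streaming API sketch (DeepSpeech 0.9.x): feed converted 16 kHz, mono, 16-bit PCM,
// then finish the stream to get the transcript
modelStream.feedAudioContent(audioBuffer);
const streamingResult = modelStream.finishStream();
console.log('streaming result:', streamingResult);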