Mozilla DeepSpeech is a TensorFlow implementation of Baidu's DeepSpeech architecture.
Questions tagged [mozilla-deepspeech]
103 questions
2
votes
1 answer
Bazel build not respecting my git submodules
I'm running a Bazel build in a project that is composed of git submodules,
with the following structure:
/work/
├── tensorflow/ [git submodule]
└── train/
    └── DeepSpeech/ [git submodule]
        └── native_client/
The build command looks like…

mathematiguy
- 31
- 5
2
votes
0 answers
DeepSpeech: Distinguish speakers
I am trying to use DeepSpeech in a call center to dictate and summarize conversations.
In this scenario there is always more than one speaker, usually two. Is there any way to distinguish the speakers while DeepSpeech is recognizing the…

jin chong
- 103
- 7
1
vote
0 answers
Install DeepSpeech on Mac M1
At the moment I need to install and run DeepSpeech on my local machine.
Can someone help me install DeepSpeech on my Mac? I have already installed Python 3.10.0, but it seems this version does not work with DeepSpeech.

Hue Minh Nguyen
- 11
- 1
1
vote
1 answer
How can I use trained DeepSpeech models that are provided on Google Drive?
I want to use the DeepSpeech model provided here https://github.com/AASHISHAG/deepspeech-german#results. They provide trained models for download, and I want to use one of them ( https://drive.google.com/drive/folders/1L7ILB-TMmzL8IDYi_GW8YixAoYWjDMn1 )…

Dennis
- 11
- 1
1
vote
0 answers
DeepSpeech real-time speech to text
How can I do real-time speech to text using DeepSpeech and a microphone?
I tried running this script I found on GitHub, but when I run it and I do not say anything for a while, it starts printing random text.
import pyaudio
import deepspeech
import…

Kevin 29890
- 23
- 3
1
vote
0 answers
DeepSpeech does not recognize input audio file recorded from PC microphone
I am trying to convert audio to text using DeepSpeech. It works fine with the default audio files from Mozilla/DeepSpeech, but when I record audio from my PC's microphone and feed it to the model, it raises an error ('wave.Error: unknown format:…
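A possible way to diagnose this kind of error: Python's `wave` module generally raises `unknown format` when the file's format tag is not plain PCM, and the released DeepSpeech English models expect 16 kHz, mono, 16-bit PCM audio. The sketch below reads the `fmt ` chunk directly so the file can be inspected even when `wave.open` refuses it. The helper names `describe_wav` and `write_silent_pcm_wav` are hypothetical and not part of DeepSpeech; the cause of the asker's specific error is an assumption, since the excerpt is truncated.

```python
import struct
import wave


def describe_wav(path):
    """Return (format_tag, channels, sample_rate, bits_per_sample) from the fmt chunk."""
    with open(path, "rb") as f:
        riff, _, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("fmt chunk not found")
            chunk_id, size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                fmt = f.read(size)
                # The first 16 bytes of the fmt chunk have a fixed layout.
                tag, channels, rate, _, _, bits = struct.unpack("<HHIIHH", fmt[:16])
                return tag, channels, rate, bits
            # Skip other chunks; chunk bodies are padded to an even length.
            f.seek(size + (size & 1), 1)


def write_silent_pcm_wav(path, seconds=1, sample_rate=16000):
    """Write a silent 16-bit mono PCM file, the layout assumed here for DeepSpeech input."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * (sample_rate * seconds))
```

A format tag of 1 means PCM. Any other value (for example 3 for IEEE float) suggests the recording should be re-exported as 16-bit PCM before it is passed to the model.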

Jaafar Alshall
- 11
- 1
1
vote
0 answers
How can I integrate Optuna with DeepSpeech training?
I'm trying to integrate Optuna with DeepSpeech in order to optimise some of its hyperparameters. I'm sticking to learning rate for now, just to get a feel for how Optuna works, but I've hit a roadblock and need some help.
I have a function hps_train…

jayathungek
- 11
- 1
1
vote
0 answers
Problem with opening DeepSpeech model: "E/tflite: Could not open '/storage/emulated/0/Download/deepspeech-0.9.3-models.tflite'."
I am trying to run a TFLite model with the DeepSpeech Java API on Android (AVD - Pixel 2, API 30). I encountered a problem while creating the DeepSpeechModel object, and I have no idea what the cause could be.
@Override
protected void…

JD99
- 11
- 1
1
vote
1 answer
Error saying that the .whl file is not supported while trying to train a DeepSpeech model on Google Colab
Commands I used:
!wget https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/ds_ctcdecoder-0.9.3-cp36-cp36m-manylinux1_x86_64.whl
!pip install /content/~path~/ds_ctcdecoder-0.9.3-cp36-cp36m-manylinux1_x86_64.whl
This gives me an…
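One common cause of pip rejecting a wheel as unsupported is a mismatch between the tags in the wheel's filename and the running interpreter: the file above is tagged `cp36`, meaning CPython 3.6. Whether that is the cause here is an assumption, since the error text is truncated. The sketch below parses the tags from a wheel filename and compares the Python tag with the current interpreter. The function names are hypothetical, and the check ignores ABI and platform compatibility.

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # The last three dash-separated fields are always the compatibility tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def current_cpython_tag():
    """Return the CPython tag of the running interpreter, e.g. 'cp36'."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def matches_interpreter(filename):
    """True if the wheel's Python tag names the running CPython version."""
    python_tag, _, _ = wheel_tags(filename)
    # A Python tag may be a dot-separated set such as 'py2.py3'.
    return current_cpython_tag() in python_tag.split(".")
```

Running `matches_interpreter("ds_ctcdecoder-0.9.3-cp36-cp36m-manylinux1_x86_64.whl")` in the notebook shows whether the runtime's Python version is the one this wheel was built for.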

chamod rathnayake
- 775
- 8
- 15
1
vote
0 answers
How to pass chunks of audio to a Mozilla DeepSpeech WebSocket?
While live speech is in progress, I want to split it into short MP3 chunks and send them to a Mozilla DeepSpeech WebSocket for transcription
(the speech is captured with the device microphone)

chamod rathnayake
- 775
- 8
- 15
1
vote
1 answer
Not able to train the DeepSpeech model on Windows
I have tried to train a DeepSpeech model on Windows, as I cannot use Linux, but training fails with this error:
File "E:/deepspeech-german-master/DeepSpeech/training/deepspeech_training/train.py", line 30, in <module>
from…

swati sharma
- 11
- 4
1
vote
0 answers
Python for generating Timestamps for a manually transcribed .wav file
I am trying to automate the generation of timestamps for speech and silences in .wav files.
My Input:
Multiple .wav files with speech in English.
All these .wav files have already been manually transcribed.
My Goal:
To generate timestamps for the…
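A minimal sketch of one way to get speech and silence timestamps, assuming 16-bit PCM samples already loaded as a list of integers: it marks fixed-size frames as speech when their RMS energy exceeds a threshold and merges consecutive speech frames into segments. This is plain energy-based detection, not alignment against the manual transcripts, and the function names and threshold value are hypothetical.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of a sequence of integer samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def speech_segments(samples, sample_rate, frame_ms=30, threshold=500.0):
    """Return a list of (start_seconds, end_seconds) for runs of loud frames."""
    frame_len = int(sample_rate * frame_ms / 1000)
    segments = []
    start = None  # start time of the speech run currently open, if any
    for i in range(0, len(samples), frame_len):
        is_speech = rms(samples[i:i + frame_len]) >= threshold
        t = i / sample_rate
        if is_speech and start is None:
            start = t
        elif not is_speech and start is not None:
            segments.append((start, t))
            start = None
    if start is not None:
        # Audio ended while still inside a speech run.
        segments.append((start, len(samples) / sample_rate))
    return segments
```

The gaps between returned segments are the silences. Boundaries are only as precise as the frame size, and the threshold would need tuning per recording.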

Varun S
- 587
- 4
- 12
- 25
1
vote
1 answer
Build TFLite without ruy for Android
I'm trying to analyze the performance of DeepSpeech (a third-party library that uses TensorFlow and TFLite) on Android devices, and I built it successfully as described in their docs.
After reading the source code, I found out that TensorFlow uses…

user9886
- 13
- 2
1
vote
1 answer
Doesn’t look like a character based (Bytes Are All You Need) model (DeepSpeech)
I have been following the DeepSpeech documentation in order to build my own scorer. After running this block of code
cd data/lm
python3 generate_lm.py --input_txt vocabulary.txt --output_dir .
--top_k 1500 --kenlm_bins…

VladH
- 143
- 7
1
vote
1 answer
Extremely large loss and wrong transcript after training
I am currently implementing DeepSpeech for my language. I have 2 directories: train and test. Train has approximately 15000 wavs and test approximately 3000. The problem I face is that the losses during training are very large, and for the test part loss…

VladH
- 143
- 7