Questions tagged [mozilla-deepspeech]

Mozilla DeepSpeech is a TensorFlow implementation of Baidu's DeepSpeech architecture.

Code: https://github.com/mozilla/DeepSpeech

103 questions
2
votes
1 answer

Bazel build not respecting my git submodules

I'm running a bazel build in a project that is composed of git submodules, with the following structure: /work/ ├── tensorflow/ [git submodule] └── train/ └── DeepSpeech/ [git submodule] └── native_client/ The build command looks like…
2
votes
0 answers

DeepSpeech: Distinguish speakers

I am trying to use DeepSpeech in a call center to transcribe and summarize conversations. In this scenario there is always more than one speaker, usually two. Is there any way to distinguish the speakers while DeepSpeech is recognizing the…
jin chong
1
vote
0 answers

Install DeepSpeech on Mac M1

I need to install and run DeepSpeech on my local machine. Can someone help me install DeepSpeech on my Mac? I have already installed Python 3.10.0, but it seems this version does not work with DeepSpeech.
1
vote
1 answer

How can I use trained DeepSpeech models that are provided on Google Drive?

I want to use the DeepSpeech model provided here: https://github.com/AASHISHAG/deepspeech-german#results. They offer trained models for download, and I want to use one of them ( https://drive.google.com/drive/folders/1L7ILB-TMmzL8IDYi_GW8YixAoYWjDMn1 )…
Dennis
1
vote
0 answers

DeepSpeech real-time speech to text

How can I do real-time speech to text using DeepSpeech and a microphone? I tried running a script I found on GitHub, but when I run it and say nothing for a while, it starts printing random text. import pyaudio import deepspeech import…
1
vote
0 answers

DeepSpeech does not recognize input audio file recorded from PC microphone

I am trying to convert audio to text using DeepSpeech. It works fine with the default audio files from Mozilla/DeepSpeech, but when I record audio from my PC's microphone and feed it to the model, it raises an error ('wave.Error: unknown format:…
1
vote
0 answers

How can I integrate Optuna with DeepSpeech training?

I'm trying to integrate Optuna with DeepSpeech in order to optimise some of its hyperparameters. I'm sticking to learning rate for now, just to get a feel for how Optuna works, but I've hit a roadblock and need some help. I have a function hps_train…
1
vote
0 answers

Problem with opening DeepSpeech model: "E/tflite: Could not open '/storage/emulated/0/Download/deepspeech-0.9.3-models.tflite'."

I am trying to run a TFLite model with the DeepSpeech Java API on Android (AVD - Pixel 2, API 30). I have encountered a problem when creating the DeepSpeechModel object, and I have no idea what could be causing it. @Override protected void…
JD99
1
vote
1 answer

While trying to train a DeepSpeech model on Google Colab, I get an error saying that the .whl file is not supported

Commands I used: !wget https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/ds_ctcdecoder-0.9.3-cp36-cp36m-manylinux1_x86_64.whl !pip install /content/~path~/ds_ctcdecoder-0.9.3-cp36-cp36m-manylinux1_x86_64.whl This gives me an…
1
vote
0 answers

How to pass chunks of audio files to a Mozilla DeepSpeech WebSocket?

While live speech is in progress, I want to split it into chunks of short MP3 files and send them to a Mozilla DeepSpeech WebSocket for transcription (the speech is captured using the device microphone).
1
vote
1 answer

Not able to train the DeepSpeech model on Windows

I have tried to train a DeepSpeech model on Windows, as I cannot use Linux. However, training fails with this error: File "E:/deepspeech-german-master/DeepSpeech/training/deepspeech_training/train.py", line 30, in from…
1
vote
0 answers

Python for generating Timestamps for a manually transcribed .wav file

I am trying to automate the generation of timestamps for speech and silences in .wav files. My Input: Multiple .wav files with speech in English. All these .wav files have already been manually transcribed. My Goal: To generate timestamps for the…
1
vote
1 answer

Build TFLite without ruy for Android

I'm trying to analyze the performance of DeepSpeech (a third-party library that uses TensorFlow and TFLite) on Android devices, and I built it successfully by following their docs. After reading the source code, I found out that TensorFlow uses…
1
vote
1 answer

Doesn’t look like a character based (Bytes Are All You Need) model (DeepSpeech)

I have been following the DeepSpeech documentation in order to build my own scorer. After running this block of code: cd data/lm python3 generate_lm.py --input_txt vocabulary.txt --output_dir . --top_k 1500 --kenlm_bins…
1
vote
1 answer

Extremely large loss and wrong transcript after training

I am currently implementing DeepSpeech for my language. I have two directories: train and test. Train has approximately 15000 WAV files and test approximately 3000. The problem I face is that during training I have large losses, and for the test part loss…
VladH