Questions tagged [speech-to-text]

The translation of spoken words into text. Possible synonyms include automatic speech recognition, ASR, computer speech recognition, speech to text, STT.

2372 questions
0
votes
1 answer

How do I make engine.say() work in my code?

I found an offline speech recognizer and tried to integrate the pyttsx3 module so that each word I speak is also spoken back by pyttsx3. Basically it is speech-to-text-to-speech code, but engine.say() does not seem to work. What should I…
0
votes
0 answers

Wav2Vec2 kernel size bigger than padding

I'm trying to use a pretrained Wav2Vec2 model for speech recognition in my language. Got it from: https://huggingface.co/facebook/wav2vec2-base-10k-voxpopuli-ft-pl Note: I'm working in a Jupyter notebook. First I read the .wav files: from…
flis00
  • 37
  • 6
0
votes
0 answers

Faster speech recognition using Python

I have been building a speech recognition program, and here is my code: `import speech_recognition as sr def listen(): r = sr.Recognizer() with sr.Microphone() as source: print("Listening...") r.pause_threshold = 0.9 audio =…
0
votes
0 answers

AttributeError: 'kid_learning' object has no attribute 'MyText' in Tkinter app while using speech recognition

I'm trying to build a small project with speech-to-text and text-to-speech functionality. The speech-to-text function raises an AttributeError, although it works fine when that block is run separately. Any help would be highly appreciated. Please…
0
votes
0 answers

Java Microsoft Azure speech-to-text from bytes?

I am thinking about making my code more efficient. I am using Discord JDA and the Microsoft Azure speech service. Is it possible to recognize speech directly from bytes, not from a file? I mean, skipping writing bytes to a temporary file and…
Adixe
  • 119
  • 7
0
votes
0 answers

Microsoft Azure speech-to-text truncates the audio

I'm using this guide https://www.loonskai.com/blog/telegram-speech-to-text-bot-with-nodejs#microsoft-azure-speech-recognition to build a bot that sends me back the text from Azure. Everything works fine, but for some audio (I'm seeing that happens…
0
votes
0 answers

Making a message in Android Studio under a condition

I would like to show a message under a condition: if my pronunciation is correct according to Google Assistant, a message should say that my pronunciation is correct. btnPlay.setOnClickListener(new…
0
votes
1 answer

How to convert user voice input into text and store it in an Excel sheet?

Dear all, I wrote a simple program for data entry into Excel using Python. But my task is for this manual entry to be performed from user voice input. Can anybody help me in this regard? import PySimpleGUI as sg import pandas as pd import…
0
votes
0 answers

Can DeepSpeech confidence be used to calculate accuracy?

In the DeepSpeech documentation, the definition of confidence is: Confidence is roughly the sum of the acoustic model logit values for each timestep/character that contributed to the creation of this transcription. But when running it on different audio files,…
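
A minimal sketch of what the quoted definition implies, assuming confidence really is a plain sum over timesteps: a summed score grows in magnitude with the length of the audio, so raw values from clips of different lengths are not directly comparable. The per-timestep values below are made up for illustration and are not DeepSpeech output, and dividing by the number of timesteps is one possible normalization, not something the DeepSpeech documentation prescribes.

```python
# Hypothetical per-timestep scores; these are NOT produced by DeepSpeech.
# The point is only to show how a summed score depends on clip length.


def summed_confidence(scores):
    """Sum of per-timestep scores, mirroring the quoted definition."""
    return sum(scores)


def mean_confidence(scores):
    """Length-normalized score (an assumption, not DeepSpeech's own metric)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)


# Two clips with identical per-timestep quality but different lengths.
short_clip = [-0.5] * 10
long_clip = [-0.5] * 100

# The sums differ by a factor of ten even though each timestep is equally good.
print(summed_confidence(short_clip), summed_confidence(long_clip))

# The per-timestep means are identical.
print(mean_confidence(short_clip), mean_confidence(long_clip))
```

Even a length-normalized score is a model-internal quantity, so it would still need to be checked against reference transcripts before being read as accuracy.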
0
votes
2 answers

Azure Cognitive Services / Speech-to-text: Transcribe compressed PCMU (mu-law) wav files

Using Azure Speech Service, I'm trying to transcribe a bunch of WAV files (compressed in the PCMU aka mu-law format). I came up with the following code based on the articles referenced below. The code sometimes works fine with a few files, but I keep…
0
votes
0 answers

Schedule/manage loaded AI models

I am working with multiple character-specific voices deployed on a Triton instance. The resources are not enough to have all loaded simultaneously. Currently I manually trigger a model load/unload each time a request is received by the service. How…
0
votes
1 answer

Enable keyboard Search key when Speech to Text ends in map searchBar

When activating the map searchbar and typing in a town name the ‘search’ key activates and goes blue, if however you enter data using speech-to-text, the button is grey. If you touch the button the ‘keyboard’ click can be heard but the button does…
Andy
  • 107
  • 1
  • 8
0
votes
0 answers

AddEventListener method is not passing the recognized text through the attachEvent.success()

In Flutter, I am trying to use Microsoft continuous speech-to-text transcription for my app. In the Android Java code, I tried calling the API and got the recognizing and recognized text in the console. But on passing that text to the client side…
0
votes
1 answer

Can speech diarization be integrated with DeepSpeech?

In an online meeting such as Google Meet/Zoom, I want to detect a change of speaker and then transcribe the audio for different speakers. I am using the DeepSpeech model for speech to text. I have fine-tuned the model for Indian-accented English but I want…
0
votes
1 answer

How can I bridge a Rasa chatbot with TTS & STT modules?

I recently started using the RASA framework for developing chatbots. My goal is to create something that, once deployed on a cloud VM, can interface with voice modules so that it is more easily accessible (no typing required from the user). Do…
Zack
  • 181
  • 1
  • 11