Questions tagged [google-speech-api]

With the Google Speech API you can convert speech to text, either from an audio file or from a live stream.

The Google Cloud Speech API (https://cloud.google.com/speech/) is part of the Google Cloud Platform products (https://cloud.google.com/products/) and provides speech-to-text conversion.

With a live stream, words are returned in near real time. A stream is limited to 1 minute and ends earlier when the speaker pauses for about 1 second. With asynchronous file-based recognition, the audio can be up to 80 minutes long. See https://cloud.google.com/speech/limits
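
As a rough illustration of these limits, the sketch below picks a request type from the audio duration. The function name and the returned labels are hypothetical, and the thresholds are simply the figures quoted in this description (1 minute, 80 minutes), so check the limits page for current values. Treating short files as "synchronous" is an assumption, not something stated here.

```python
# Limits as quoted in the tag description; they may have changed since.
SHORT_LIMIT_SECONDS = 60
ASYNC_LIMIT_SECONDS = 80 * 60


def pick_recognition_mode(duration_seconds: float, live: bool = False) -> str:
    """Return a hypothetical label for the request type that fits the audio."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    if live:
        # A live stream longer than the limit has to be restarted.
        if duration_seconds <= SHORT_LIMIT_SECONDS:
            return "streaming"
        return "streaming-with-restarts"
    if duration_seconds <= SHORT_LIMIT_SECONDS:
        return "synchronous"  # assumption: short files fit a single request
    if duration_seconds <= ASYNC_LIMIT_SECONDS:
        return "asynchronous"
    return "split-into-several-files"


print(pick_recognition_mode(45))
print(pick_recognition_mode(30 * 60))
print(pick_recognition_mode(90, live=True))
```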

For more see https://cloud.google.com/speech/docs/

837 questions
0
votes
2 answers

Can I use curl to make a speech:recognize (Google Cloud Speech-to-text) request using a .json from Google Cloud Storage?

I'd like to be able to make a speech:recognize request on and with my own cloud-hosted resources, so I can simply log into the Google Cloud Platform console, run a command in the Cloud Shell, and see the results. Much like…
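
For context, a speech:recognize request body pairs a `config` object with an `audio` object, and the audio can be referenced by a Cloud Storage URI instead of being sent inline. The sketch below only builds such a body. The bucket name, object name, encoding, and sample rate are assumptions for illustration, and it does not settle whether the request .json itself can be read from Cloud Storage.

```python
import json


def build_recognize_request(gcs_uri: str, language_code: str = "en-US") -> str:
    """Build the JSON body for a speech:recognize call on audio stored in GCS."""
    body = {
        "config": {
            "encoding": "FLAC",        # assumption: the audio is FLAC
            "sampleRateHertz": 16000,  # assumption: 16 kHz audio
            "languageCode": language_code,
        },
        # "uri" points at audio in Cloud Storage instead of inline "content".
        "audio": {"uri": gcs_uri},
    }
    return json.dumps(body, indent=2)


# Hypothetical bucket and object names, for illustration only.
print(build_recognize_request("gs://my-bucket/audio.flac"))
```

Saved to a local file, a body like this is what curl would typically send with `-d @request.json` to https://speech.googleapis.com/v1/speech:recognize, together with an authorization header.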
0
votes
1 answer

How to show Word Level Confidence Score in Google Speech API

I have included the Google Speech API in Cloud Functions. I want to get the word level confidence score, so I set 'enableWordConfidence' to true. For some reason, the response doesn't return a confidence score on a word level. I have tried de-DE…
bobski
0
votes
1 answer

How to get entire transcript using google.cloud.speech_v1p1beta1?

Using Google-Speech-to-Text, I only get partial transcription. Input file: from google sample audio file Link to google repo location commercial_mono.wav Here is my code: def transcribe_gcs(gcs_uri): from google.cloud import speech_v1p1beta1 as…
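
The code in this excerpt is cut off, so the actual cause is unknown. One frequent reason for a partial transcript is reading only the first entry of `results`: a response usually carries several results, each covering a consecutive portion of the audio. The sketch below joins the top alternative of every result. It works on the JSON (dict) form of a response rather than on the client library objects, and the sample data is made up.

```python
def full_transcript(response: dict) -> str:
    """Concatenate the top alternative of every result in a recognize response."""
    pieces = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # The first alternative is the most likely transcription.
            pieces.append(alternatives[0].get("transcript", "").strip())
    return " ".join(piece for piece in pieces if piece)


# Made-up response, shaped like the JSON form of a recognize response.
sample = {
    "results": [
        {"alternatives": [{"transcript": "first part of the audio", "confidence": 0.9}]},
        {"alternatives": [{"transcript": " second part of the audio", "confidence": 0.8}]},
    ]
}
print(full_transcript(sample))
```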
0
votes
0 answers

Google Speech Recognition API: returns null result

I use Google speech recognition API. When I'm trying to recognize relatively short words (like "yes" or "no") with duration between 0.25-0.5 seconds, Google API often returns NULL. I tried other input data formats and solution posted here (16-bit…
Sergey
0
votes
1 answer

GoogleSpeechAPI (C#) Detecting language spoken automatically

I've installed GoogleSpeechAPI 1.1.0-beta03 (C#) and tried to implement the new functionality "Detecting language spoken automatically", but the RecognitionConfig class does not have an alternativeLanguageCodes property. Is this not available yet for the C# client…
sabiland
0
votes
1 answer

How to perform real-time speech recognition | Google Cloud Speech-to-Text

I'm trying to transcribe audio from my speakers. I'm piping sound from the speakers to a node.js file (https://askubuntu.com/a/850174): parec -d alsa_output.pci-0000_00_1b.0.analog-stereo.monitor --rate=16000 --channels=1 | node transcribe.js This is my…
0
votes
0 answers

how to play Google speech to text api buffered audio in AVAudioPlayer swift

I am using the Google speech-to-text app to convert speech into text. The speech is buffered in an AudioBufferList and then converted into Data. When I try to play the same Data in AVAudioPlayer using its init method "AVAudioPlayer(contentOf: Data)"…
0
votes
0 answers

Google STT with Pandas - how to transpose columns

I am new to Python, and working on a project with Google Speech to text. Finally figured out how to import results of Google STT (JSON) and format data in csv. BUT.... Google gives you alternative words which is good and bad. The attached code will…
TinkyWinkyMD
0
votes
1 answer

Objective-C Async call for Google Long Running Speech API is not returning Operation status true?

I am having an issue with Google's asynchronous (long-running) speech recognition API: operation.done never returns true. I modified the Objective-C sample program…
0
votes
1 answer

Audio encoding, sample rate, and re-encoding in Google Cloud

Is it possible to look up the audio metadata for a file stored in Google Cloud without having to download it? When building a Google Speech-to-Text API service you pass it a gs://bucket/file.flac, and I know the sox and ffmpeg bash and Python…
libroman2
0
votes
1 answer

Google Speech api output changes every time for the same

The Google Speech API output changes every time for the same audio file. Is there a way to get the same output each time, or to fix the model the transcriber uses?
0
votes
1 answer

Cannot find audio file in google bucket with google speech API

With the Google Speech API (using the python sample code), you need to have your audio files on google cloud when longer than 1 minute. According to some sample code, you can use a path like gs://python-docs-samples-tests/speech/audio.flac. So I…
0
votes
1 answer

(Google Speech API) What is frame size?

The Google Speech to Text documentation recommends using a 100 ms frame size to minimize latency. Any frame size is acceptable. Larger frames are more efficient, but add latency. A 100-millisecond frame size is recommended as a good tradeoff between…
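
To make the 100 ms figure concrete, here is a small sketch that reads "frame size" as the amount of audio carried by each streaming chunk, which is an interpretation rather than something the excerpt states. It assumes raw 16-bit mono PCM, and the helper names are hypothetical.

```python
from typing import Iterator


def frame_size_bytes(sample_rate_hz: int, frame_ms: int = 100,
                     bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Number of bytes in one frame of raw PCM audio."""
    samples_per_frame = sample_rate_hz * frame_ms // 1000
    return samples_per_frame * bytes_per_sample * channels


def iter_frames(pcm: bytes, frame_bytes: int) -> Iterator[bytes]:
    """Yield consecutive frames; the last one may be shorter."""
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


size = frame_size_bytes(16000)   # 1600 samples * 2 bytes = 3200 bytes
one_second = bytes(16000 * 2)    # one second of silence as stand-in audio
frames = list(iter_frames(one_second, size))
print(size, len(frames))
```

At 16 kHz this gives 3200-byte chunks, ten per second of audio. Doubling the frame duration halves the number of requests but delays each chunk by a further 100 ms.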
0
votes
0 answers

How to correctly use StreamingRecognizeRequest from http audio stream

Why am I not getting any responses back (from StreamingRecognizeRequests) in my responseObserver.onResponse() method below? Stuck as to how to proceed further on this one. Code (which is simplified to highlight the issue as a modification to the…
0
votes
1 answer

Cloud Speech API streaming mode recognition more than 1 min

I am trying to do real time speech recognition for more than 1 min using Cloud Speech API but the limit of synchronous speech recognition is just 1 min per request. I have tried running…