How to disable disfluency removal for Google Cloud Speech to Text API

Asked Nov 12 '18 at 15:40

Active Jun 21 '19 at 09:58

Viewed 285 times

I am building an app that captures user audio and analyzes disfluency in a reader's speech, so it is important for me to capture all forms of disfluency.

I noticed that Google's Cloud Speech-to-Text API automatically removes disfluencies from the transcript. For example:

"so uhh, I will probably do that umm probably next week"

Gets transcribed to:

"so I will probably do that probably next week"

Is there a way to keep the uhhs and umms?

asked Nov 12 '18 at 15:40

AspiringMat


Hello. Did you find any solution? – Liam Park Oct 27 '20 at 00:24
@LouisBelmont I reached out to Google for help, but unfortunately it seems that disfluency removal is part of their trained model. – AspiringMat Oct 28 '20 at 06:02
I also did not find anything like that for Google Speech. The closest I found was IBM Watson, which emits a marker for hesitations and disfluencies when the smart formatting option is disabled, but I have not yet been able to test it. – Liam Park Oct 29 '20 at 12:09
The process described in this paper could be useful: https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2836.pdf . The authors use the Google Cloud API to get a transcript, then use IBM Watson coupled with the Gentle forced aligner to detect disfluencies, which are then combined with the Google transcript. – Smokesick Mar 16 '21 at 17:12
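
For anyone trying the combination idea from the comment above, here is a rough sketch of just the merge step, not the paper's actual method. It assumes you already have word-level start times from both recognizers; the `Word` class, the `merge_disfluencies` helper, the `[uh]` label and the timings are all hypothetical, and the `%HESITATION` marker string is an assumption about what the second recognizer emits. No cloud API is called here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the audio


# Assumed marker for a filled pause in the second recognizer's output.
FILLER = "%HESITATION"


def merge_disfluencies(clean: List[Word], marked: List[Word],
                       label: str = "[uh]") -> List[str]:
    """Insert filler labels into the clean transcript by start time."""
    fillers = sorted((w for w in marked if w.text == FILLER),
                     key=lambda w: w.start)
    merged: List[str] = []
    i = 0
    for word in sorted(clean, key=lambda w: w.start):
        # Emit every filler that starts at or before this word.
        while i < len(fillers) and fillers[i].start <= word.start:
            merged.append(label)
            i += 1
        merged.append(word.text)
    # Fillers that come after the last clean word go at the end.
    merged.extend(label for _ in fillers[i:])
    return merged


# Made-up timings for the sentence in the question.
clean = [Word("so", 0.0), Word("I", 0.9), Word("will", 1.1),
         Word("probably", 1.4), Word("do", 1.9), Word("that", 2.1),
         Word("probably", 3.0), Word("next", 3.5), Word("week", 3.8)]
marked = [Word("so", 0.0), Word(FILLER, 0.4), Word("I", 0.9),
          Word("that", 2.1), Word(FILLER, 2.5), Word("probably", 3.0)]

print(" ".join(merge_disfluencies(clean, marked)))
# so [uh] I will probably do that [uh] probably next week
```

Only the filler positions are taken from the second transcript, so the wording stays whatever the first recognizer produced. In practice the two sets of timestamps will not agree exactly, which is presumably why the paper adds a forced aligner.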

0 Answers