I'm using the Google Speech-to-Text API to transcribe streaming audio. I've been supplying keywords to help train it and make the API more accurate, but it's still not great (I'm streaming police radio traffic). Is there a way to create my own model? I'm thinking I could pass in recorded clips along with my own manual transcriptions to help train a custom model.
1 Answer
You can use the "speech adaptation" feature provided by Google. A speech context can be supplied in your recognition request in the following manner:
{"phrases": ["Brooklyn Bridge"], "boost": 20.0}
The boost value increases the probability that a specific phrase will be recognized over other similar-sounding phrases. The higher the boost, the higher the chance of false-positive recognition as well. Boost accepts a wide range of positive values, but most use cases are best served with values between 0 and 20. A binary search approach may help you find the optimal value. For more info: https://cloud.google.com/speech-to-text/docs/context-strength
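To make this concrete, here is a rough sketch of how the speech context could sit inside the config of a REST-style request body, plus a simple bisection over boost values. It is only an illustration: the helper names (build_recognition_config, find_boost, is_too_aggressive) are made up for this example, the encoding and sample rate are assumed values, and whether "boost" is accepted may depend on the API version you call. The sketch builds plain dictionaries and does not call Google's API.

```python
import json


def build_recognition_config(phrases, boost, language_code="en-US",
                             sample_rate_hertz=16000):
    """Build the `config` part of a recognize request as a plain dict.

    Hypothetical helper: encoding and sample rate are assumptions,
    adjust them to match your audio stream.
    """
    if boost <= 0:
        raise ValueError("boost should be a positive number")
    return {
        "encoding": "LINEAR16",
        "sampleRateHertz": sample_rate_hertz,
        "languageCode": language_code,
        "speechContexts": [
            {"phrases": list(phrases), "boost": float(boost)}
        ],
    }


def find_boost(is_too_aggressive, low=0.0, high=20.0, tolerance=0.5):
    """Bisect for the largest boost that is not flagged as too aggressive.

    `is_too_aggressive` is a hypothetical callback you would write yourself,
    e.g. transcribe a few hand-labelled clips with the given boost and return
    True if false positives show up. Assumes that once a boost is too
    aggressive, every higher boost is too.
    """
    while high - low > tolerance:
        mid = (low + high) / 2
        if is_too_aggressive(mid):
            high = mid   # too many false positives, search lower
        else:
            low = mid    # still acceptable, search higher
    return low


config = build_recognition_config(["Brooklyn Bridge"], 20.0)
print(json.dumps(config, indent=2))
```

In practice the callback passed to find_boost would run your own evaluation on clips you have transcribed by hand, so the manual transcriptions mentioned in the question are still useful for tuning even without a custom model.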

Rishabh Gupta