I am using a microphone that records sound in the browser, converts it into a file, and sends the file to a Java server. My Java server then sends the file to the Cloud Speech API and returns the transcription. The problem is that the transcription takes very long (around 3.7 s for 2 s of dialog).
So I would like to speed up the transcription. The first idea is to stream the data (i.e., start transcribing as soon as the recording begins). The problem is that I don't really understand the API. For instance, to transcribe my audio stream directly from the source (browser/microphone) I would need some kind of JS API, but I can't find anything I can use in a browser (we can't use Node like this, can we?).
Otherwise, I need to stream my data from my JS to my Java server (not sure how to do that without corrupting the data...) and then push it through streamingRecognizeFile from there: https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/speech/cloud-client/src/main/java/com/example/speech/Recognize.java
But that method takes a file as input, so how am I supposed to use it? I can't really tell the system whether I have finished recording or not... How will it know it has reached the end of the audio?
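From reading the Recognize.java sample, my mental model is that "streaming" just means sending the audio in small chunks and then closing the request stream when the recording stops, and that closing the stream is what signals the end. Is that right? Here is a self-contained sketch of the chunking pattern as I understand it; the requestObserver / onNext / onCompleted names in the comments are my guesses from the gRPC sample, not code I have actually run:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AudioChunker {
    // Split recorded audio bytes into small chunks, the way I believe the
    // streaming recognize call expects them: one request per chunk.
    public static List<byte[]> chunk(byte[] audio, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0; off < audio.length; off += chunkSize) {
            int end = Math.min(off + chunkSize, audio.length);
            chunks.add(Arrays.copyOfRange(audio, off, end));
            // In the real client this would be something like (my guess from the sample):
            // requestObserver.onNext(StreamingRecognizeRequest.newBuilder()
            //         .setAudioContent(ByteString.copyFrom(audio, off, end - off)).build());
        }
        // When the recording stops, I assume closing the request stream is what
        // tells the API "end of audio": requestObserver.onCompleted();
        return chunks;
    }

    public static void main(String[] args) {
        byte[] fakeAudio = new byte[10_000];          // stand-in for raw PCM from the mic
        List<byte[]> chunks = chunk(fakeAudio, 3200); // ~100 ms of 16 kHz 16-bit mono audio
        System.out.println("chunks: " + chunks.size());
    }
}
```

So instead of waiting for a finished file, I would push each chunk as it arrives from the browser and half-close the stream when the user stops recording. Does that match how the API is meant to be used?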
I would like to build something in the browser just like the Google demo here: https://cloud.google.com/speech/
I think there is something fundamental I don't understand about how to use the streaming API. If someone could explain a bit how I should proceed, that would be awesome.
Thank you.