We are comparing two speech-to-text services so that we can present the pros and cons of each. With the first service we upload a file, check its status via a GET request, and download the transcript once the returned status is done. This lets us 'fire and forget': it frees local resources, and we can re-allocate them when it suits us.
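To make that first workflow concrete, here is a rough sketch of the polling side as we understand it. The names `get_status` and `download_transcript`, and the status strings "done" and "failed", are hypothetical stand-ins for the service's HTTP calls and responses, not a real API.

```python
import time
from typing import Callable


def wait_for_transcript(
    job_id: str,
    get_status: Callable[[str], str],
    download_transcript: Callable[[str], str],
    poll_interval_s: float = 5.0,
    max_polls: int = 120,
) -> str:
    """Poll an already-submitted job until it reports done, then fetch the result.

    get_status and download_transcript are hypothetical callables that would
    wrap the service's GET requests; they are injected so the loop itself
    needs no network access.
    """
    for _ in range(max_polls):
        status = get_status(job_id)
        if status == "done":  # assumed status value
            return download_transcript(job_id)
        if status == "failed":  # assumed status value
            raise RuntimeError(f"job {job_id} failed")
        # Nothing is held open between polls; the client can do other work
        # or even exit and resume later with the same job_id.
        time.sleep(poll_interval_s)
    raise TimeoutError(f"job {job_id} not done after {max_polls} polls")
```

The point of the sketch is that the client only needs the job ID between polls, which is what makes this flow 'fire and forget' for us.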
With the second service, we have set up an Azure continuous recognition process but are not sure what is going on under the hood. It seems we have to keep a connection open the whole time the ASR is processing; when it receives some signal of completion (input exhausted), the connection is closed. We are not sure whether the file is uploaded in chunks, as a continuous stream, or in its entirety. Can this ever be fire and forget?
If someone can shed some light on the process, or even point to documentation with more in-depth information, I'd be much obliged.