Azure Text to Speech API - Limited to 10 Minutes of Audio?

Question

Is there a limit to the amount of text which can be submitted to the TTS (neural) Speech Service endpoints?

All of the requests I'm making from an Azure Function succeed, but the returned audio is cut off at exactly 10 minutes.

Ali Heikal · Accepted Answer · 2019-02-16T22:16:34.687

1

Yes. The old Bing Speech API documentation states that the Speech Service limits the duration of WebSocket connections to the service: a maximum of 10 minutes for an active WebSocket connection and a maximum of 180 seconds for an inactive one.

UPDATE

It is also stated in the new Speech Service documentation that an access token is valid for 10 minutes.
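To illustrate the 10-minute token validity, here is a minimal sketch of a cache that re-fetches the token shortly before it expires. The class name TokenCache, the fetch_token callable, and the one-minute refresh margin are all hypothetical choices for this example, not part of any Azure SDK; fetch_token stands in for whatever call your code uses to obtain a token.

```python
import time
from typing import Callable, Optional

TOKEN_LIFETIME_SECONDS = 10 * 60  # validity period stated in the answer above
REFRESH_MARGIN_SECONDS = 60       # assumption: refresh one minute early


class TokenCache:
    """Caches an access token and re-fetches it before it expires.

    fetch_token is a hypothetical callable supplied by the caller; it
    should perform the actual token request and return the token string.
    """

    def __init__(
        self,
        fetch_token: Callable[[], str],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        # Age of the cached token in seconds.
        age = self._clock() - self._fetched_at
        expired = age >= TOKEN_LIFETIME_SECONDS - REFRESH_MARGIN_SECONDS
        if self._token is None or expired:
            self._token = self._fetch_token()
            self._fetched_at = self._clock()
        return self._token
```

A long-running job would call cache.get() before each request instead of holding on to a single token for its whole lifetime.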

edited Feb 16 '19 at 22:16

answered Feb 16 '19 at 22:03

Ali Heikal


score 0 · Answer 2 · answered Feb 16 '19 at 03:01


If you are using JavaScript, from the docs:

JavaScript service wrapper for Microsoft Speech API. It is an implementation of the Speech Websocket API specifically, which supports long speech recognition up to 10 minutes in length.

answered Feb 16 '19 at 03:01

Sajeetharan


score 0 · Answer 3 · answered Apr 13 '23 at 12:48

TTS documentation says: Asynchronous synthesis of long audio: Use the batch synthesis API (Preview) to asynchronously synthesize text-to-speech files longer than 10 minutes.

Batch synthesis API documentation says: The Batch synthesis API ... can synthesize a large volume of text input (long and short) asynchronously... create synthesized audio longer than 10 minutes.

So I believe this implies that the synchronous TTS API can handle only up to 10 minutes of audio. In my case, synthesizing a long text returned HTTP status code 200 with the response sent via chunked transfer encoding, and after about 10 seconds it failed with System.Net.Http.HttpRequestException: Error while copying content to a stream. ---> System.IO.IOException: The response ended prematurely. I think the TTS backend was generating the audio from the text and, once the audio became longer than 10 minutes, it threw an exception and closed the connection.
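If the batch synthesis API is not an option, one possible workaround is to split the text so that each synchronous request stays well under 10 minutes of audio. The sketch below is only an illustration: the function name split_for_synthesis, the assumed speaking rate of 150 words per minute, and the 8-minute target are all hypothetical, and real durations depend on the voice and any SSML prosody settings.

```python
import re
from typing import List

WORDS_PER_MINUTE = 150        # assumption: rough speaking rate, varies by voice
MAX_MINUTES_PER_REQUEST = 8   # assumption: safety margin below the 10-minute cutoff


def split_for_synthesis(
    text: str,
    max_minutes: float = MAX_MINUTES_PER_REQUEST,
    words_per_minute: int = WORDS_PER_MINUTE,
) -> List[str]:
    """Split text at sentence boundaries into chunks whose estimated
    spoken duration stays under max_minutes.

    A single sentence longer than the limit is kept whole as its own
    chunk; this sketch does not split inside sentences.
    """
    max_words = int(max_minutes * words_per_minute)
    # Break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: List[str] = []
    current: List[str] = []
    current_words = 0
    for sentence in sentences:
        n = len(sentence.split())
        if n == 0:
            continue
        # Start a new chunk if adding this sentence would exceed the budget.
        if current and current_words + n > max_words:
            chunks.append(" ".join(current))
            current, current_words = [], 0
        current.append(sentence)
        current_words += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk would then be sent as a separate request and the resulting audio segments concatenated in order.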
