What is the Standard Audio Format Produced by synthesizeToFile in Android TextToSpeech?

Question

Using the synthesizeToFile method of Android TextToSpeech, how are we to know what file format (WAV, MP3, OGG) and/or attributes (sample rate, bit depth, etc.) the resulting file will have?

I can't find an explicit standard in the documentation... it doesn't even promise any particular file format such as WAV.

Is this simply up to the speech engine to implement however they choose?

What if we want to do something with the result, like calculate the duration of the file? We would have to know the details about the file format in advance. This is made even more unpredictable by the fact that there's no way to know what engine is installed/running on the end user's device.

Is there really no standard for this?

Maybe this can help: https://stackoverflow.com/questions/10487717/what-audio-formats-are-supported-by-tts. Check out the comment made by Femi. – Sanjeev Pandey, Apr 09 '22 at 07:23

Tungken · Answer 1 · 2022-04-08T02:42:02.250


The Android documentation for synthesizeToFile suggests a .wav file name in its description of the filename parameter.

The audio attributes depend on your input source, or you can set them up using Voice. You can read the audio file's information after it has been saved successfully. For example, you can use MediaPlayer to get the format, duration, bitrate...

You can also use AudioTrack to play the raw data by reading the audio buffer. AudioTrack is the standard way to play raw audio bytes.
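Since the format is not guaranteed, one option is to inspect the file itself once synthesis has finished (for example after UtteranceProgressListener.onDone) rather than assume anything. The following is only a sketch under that assumption: WavInfo is a hypothetical helper written in plain Java, not part of the Android SDK, and it only recognizes RIFF/WAVE data. If the engine wrote something else, parse returns null and you would have to fall back to another approach. The sample rate, channel count, and bit depth it reports are also the values you would need in order to configure an AudioTrack for the raw bytes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper (not an Android API): reads the header of a RIFF/WAVE byte stream. */
public class WavInfo {
    public final int audioFormat;   // 1 means uncompressed PCM
    public final int channels;
    public final int sampleRate;    // in Hz
    public final int bitsPerSample;
    public final long dataBytes;    // payload size of the "data" chunk

    private WavInfo(int audioFormat, int channels, int sampleRate, int bitsPerSample, long dataBytes) {
        this.audioFormat = audioFormat;
        this.channels = channels;
        this.sampleRate = sampleRate;
        this.bitsPerSample = bitsPerSample;
        this.dataBytes = dataBytes;
    }

    /** Duration in milliseconds, derived from the data size and the frame size. */
    public long durationMillis() {
        long bytesPerFrame = (long) channels * bitsPerSample / 8;
        if (bytesPerFrame == 0 || sampleRate == 0) return 0;
        return dataBytes * 1000L / (bytesPerFrame * sampleRate);
    }

    private static String tag(byte[] bytes, int offset) {
        return new String(bytes, offset, 4, StandardCharsets.US_ASCII);
    }

    /** Returns null when the bytes are not RIFF/WAVE, e.g. the engine wrote another format. */
    public static WavInfo parse(byte[] bytes) {
        if (bytes.length < 12 || !tag(bytes, 0).equals("RIFF") || !tag(bytes, 8).equals("WAVE")) {
            return null;
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        int format = 0, channels = 0, rate = 0, bits = 0;
        boolean haveFmt = false;
        int pos = 12;
        // Walk the chunk list: each chunk is a 4-byte id, a 4-byte little-endian size, then the body.
        while (pos + 8 <= bytes.length) {
            String id = tag(bytes, pos);
            long size = buf.getInt(pos + 4) & 0xFFFFFFFFL;
            int body = pos + 8;
            if (id.equals("fmt ") && size >= 16 && body + 16 <= bytes.length) {
                format = buf.getShort(body) & 0xFFFF;
                channels = buf.getShort(body + 2) & 0xFFFF;
                rate = buf.getInt(body + 4);
                bits = buf.getShort(body + 14) & 0xFFFF;
                haveFmt = true;
            } else if (id.equals("data")) {
                if (!haveFmt) return null;
                // Clamp in case the writer stored a size larger than the file really holds.
                long available = bytes.length - body;
                return new WavInfo(format, channels, rate, bits, Math.min(size, available));
            }
            long next = body + size + (size & 1); // chunk bodies are padded to an even length
            if (next > bytes.length) break;
            pos = (int) next;
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java WavInfo <path-to-synthesized-file>");
            return;
        }
        WavInfo info = parse(Files.readAllBytes(Paths.get(args[0])));
        if (info == null) {
            System.out.println("Not a RIFF/WAVE file; the format must be detected another way.");
        } else {
            System.out.println(info.sampleRate + " Hz, " + info.channels + " channel(s), "
                    + info.bitsPerSample + " bit, format tag " + info.audioFormat
                    + ", about " + info.durationMillis() + " ms");
        }
    }
}
```

On a device you would pass the bytes of the file given to synthesizeToFile instead of a command-line path. The duration figure is only meaningful when the format tag is 1 (PCM); for any other tag, or a null result, treat the file as an unknown format.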

edited Apr 08 '22 at 02:42

answered Apr 05 '22 at 07:58


I appreciate you taking the time, but I was already aware of that suggested file path in the docs, and it doesn't mean WAV is the standard, especially when the author prefaced it with the words “something like.” Standards are not established from an implication in a code comment. Also, MediaPlayer isn't useful/usable in all cases. Imagine the base case of using raw audio bytes... that's why you need to know the standard in advance. – Nerdy Bunz Apr 05 '22 at 19:25
@NerdyBunz yes, there is no standard for the audio format. You can encode to wav, mp3... If you are using raw audio bytes without encoding, you can use AudioTrack to play the raw data by reading the audio buffer. AudioTrack is the standard way to play raw audio bytes. – Tungken Apr 06 '22 at 02:35
@NerdyBunz please mark the answer as accepted if it helped you. Thanks! – Tungken Apr 08 '22 at 02:43
Sorry, it doesn't. – Nerdy Bunz Apr 09 '22 at 04:30
