I would like to be able to test which text-to-speech voices are available for my iOS app to use with AVSpeechSynthesizer. It is easy to generate a list of the installed voices, but Apple makes some of them off-limits to apps, and I would like to know which ones.
For example, consider the following test code (Swift 5.1):
import AVFoundation
...
func voiceTest() {
    let speechSynthesizer = AVSpeechSynthesizer()
    let voices = AVSpeechSynthesisVoice.speechVoices()
    for voice in voices where voice.language == "en-US" {
        print("\(voice.language) - \(voice.name) - \(voice.quality.rawValue) [\(voice.identifier)]")
        let phrase = "The voice you're now listening to is the one called \(voice.name)."
        let utterance = AVSpeechUtterance(string: phrase)
        utterance.voice = voice
        speechSynthesizer.speak(utterance)
    }
}
When I call voiceTest(), the console output is this:
en-US - Nicky (Enhanced) - 2 [com.apple.ttsbundle.siri_female_en-US_premium]
en-US - Aaron - 1 [com.apple.ttsbundle.siri_male_en-US_compact]
en-US - Fred - 1 [com.apple.speech.synthesis.voice.Fred]
en-US - Nicky - 1 [com.apple.ttsbundle.siri_female_en-US_compact]
en-US - Samantha - 1 [com.apple.ttsbundle.Samantha-compact]
en-US - Alex - 2 [com.apple.speech.voice.Alex]
Some of the voices speak in their own voice, whereas others speak in the default voice instead. In my case, both Nicky (com.apple.ttsbundle.siri_female_en-US_premium) and Alex (com.apple.speech.voice.Alex) are listed as high quality (quality value 2) but, when selected, sound like the low-quality default, Samantha.
I know that Apple has said that the Siri voices are not available for use in third-party apps. When I manually load Samantha (High Quality) on my iPhone via Settings, it appears in the list and I can use it. Perhaps Alex is just the high-quality male Siri voice, even though Aaron would seem to be the low-quality male Siri voice based on its identifier (com.apple.ttsbundle.siri_male_en-US_compact)? Is that why Alex and Nicky are the only two that are unavailable? If so, would having my app specifically exclude those two produce the true list of available voices? It would be nice to have some clarity.
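To make the idea concrete, the exclusion I have in mind would look roughly like the sketch below. It is only a guess built on the console output above: the two identifiers in the set are simply the ones that fell back to the default voice in my test, not a list confirmed by Apple, and the names suspectedUnavailableIdentifiers and usableVoiceIdentifiers are hypothetical helpers of my own.

```swift
// Assumption: these are the only two identifiers that fell back to the
// default voice in my test. Apple has not confirmed this list.
let suspectedUnavailableIdentifiers: Set<String> = [
    "com.apple.ttsbundle.siri_female_en-US_premium",
    "com.apple.speech.voice.Alex",
]

// Hypothetical helper: keeps only the identifiers that are not on the
// exclusion list, preserving their original order.
func usableVoiceIdentifiers(from identifiers: [String]) -> [String] {
    return identifiers.filter { !suspectedUnavailableIdentifiers.contains($0) }
}
```

In the app I would presumably feed it AVSpeechSynthesisVoice.speechVoices().map { $0.identifier } and then look each surviving voice up with AVSpeechSynthesisVoice(identifier:). Whether a hard-coded list like this holds up across devices and iOS versions is exactly what I am unsure about.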