Siri Kit (Speech to text) disabling my TTS (Text to speech) iOS

Question

I'm trying to run Text To Speech (AVSpeechSynthesizer) along with Speech To Text from Siri Kit, but I'm stuck.

My TTS works perfectly until I run the STT code; after that, the TTS doesn't work anymore. I debugged the code and no errors occur during execution, but my text is no longer spoken. My theory is that the STT is somehow disabling the audio output, and that's why the TTS stops producing speech. Note: my TTS stops working, but my STT works perfectly.

Any tips?

Here's my viewController's code:

@IBOutlet weak var microphoneButton: UIButton!

//text to speech
let speechSynthesizer = AVSpeechSynthesizer()

//speech to text
private var speechRecognizer: SFSpeechRecognizer!

private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private var audioEngine = AVAudioEngine()

@IBAction func textToSpeech(_ sender: Any) {

    if let word = wordTextField.text{

        if !speechSynthesizer.isSpeaking {


            //get current dictionary
            let dictionary = fetchSelectedDictionary()

            //get current language
            let language = languagesWithCodes[(dictionary?.language)!]

            let speechUtterance = AVSpeechUtterance(string: word)
            speechUtterance.voice = AVSpeechSynthesisVoice(language: language)
            speechUtterance.rate = 0.4
            //speechUtterance.pitchMultiplier = pitch
            //speechUtterance.volume = volume
            speechSynthesizer.speak(speechUtterance)

        }
        else{
            speechSynthesizer.continueSpeaking()
        }

    }
}

@IBAction func speechToText(_ sender: Any) {

    if audioEngine.isRunning {
        audioEngine.stop()
        recognitionRequest?.endAudio()
        microphoneButton.isEnabled = false
        microphoneButton.setTitle("Start Recording", for: .normal)
    } else {
        startRecording()
        microphoneButton.setTitle("Stop Recording", for: .normal)
    }

}

func startRecording() {

    if recognitionTask != nil {
        recognitionTask?.cancel()
        recognitionTask = nil
    }

    let audioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setCategory(AVAudioSessionCategoryRecord)
        try audioSession.setMode(AVAudioSessionModeMeasurement)
        try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
    } catch {
        print("audioSession properties weren't set because of an error.")
    }

    recognitionRequest = SFSpeechAudioBufferRecognitionRequest()

    guard let inputNode = audioEngine.inputNode else {
        fatalError("Audio engine has no input node")
    }

    guard let recognitionRequest = recognitionRequest else {
        fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
    }

    recognitionRequest.shouldReportPartialResults = true

    recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in

        var isFinal = false

        if result != nil {

            self.wordTextField.text = result?.bestTranscription.formattedString
            isFinal = (result?.isFinal)!
        }

        if error != nil || isFinal {
            self.audioEngine.stop()
            inputNode.removeTap(onBus: 0)

            self.recognitionRequest = nil
            self.recognitionTask = nil

            self.microphoneButton.isEnabled = true
        }
    })

    let recordingFormat = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
        self.recognitionRequest?.append(buffer)
    }

    audioEngine.prepare()

    do {
        try audioEngine.start()
    } catch {
        print("audioEngine couldn't start because of an error.")
    }

    wordTextField.text = "Say something, I'm listening!"
}

}

score 1 · Answer 1 · answered Mar 13 '17 at 10:02

This is probably because your audio session is in the Record category. You have two options. The first is to change try audioSession.setCategory(AVAudioSessionCategoryRecord) to use AVAudioSessionCategoryPlayAndRecord (this will work). A cleaner way is to put the speaking code in a separate function and set the AVAudioSession category to AVAudioSessionCategoryPlayback there.

Hope this helped.
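
A minimal sketch of the second option, written with the newer Swift spellings (.playback rather than AVAudioSessionCategoryPlayback). The helper name speak(_:languageCode:using:) and its parameters are hypothetical, not from the question, and the sketch assumes recording has already stopped before it is called:

```swift
import AVFoundation

// Hypothetical helper: switch the shared audio session to playback,
// then speak. Assumes the audio engine is no longer recording.
func speak(_ text: String, languageCode: String, using synthesizer: AVSpeechSynthesizer) {
    let session = AVAudioSession.sharedInstance()
    do {
        // Playback category so speech output is not suppressed by a record-only session.
        try session.setCategory(.playback, mode: .default, options: [])
        try session.setActive(true)
    } catch {
        print("Could not configure the audio session for playback: \(error)")
    }

    let utterance = AVSpeechUtterance(string: text)
    utterance.voice = AVSpeechSynthesisVoice(language: languageCode)
    utterance.rate = 0.4 // same rate as in the question
    synthesizer.speak(utterance)
}
```

In the question's view controller, textToSpeech(_:) could call something like speak(word, languageCode: language, using: speechSynthesizer) instead of building the utterance inline.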

Yann Massard

brother, you just saved my life! Thank you so much, worked like a charm! – Rafael Paz Mar 14 '17 at 07:49
Brother, just one more thing. Do you know how I can get my iPhone's speaker back to its normal volume? Before I run the speech to text, the volume of my iPhone is high, but after I run the code with the fix you showed me, the volume of my text to speech is very low. Thank you so much for helping me out :) – Rafael Paz Mar 14 '17 at 08:03
Yup, try this: do { try AVAudioSession.sharedInstance().overrideOutputAudioPort(AVAudioSessionPortOverride.speaker) } catch _ { } – Yann Massard Mar 14 '17 at 22:37
Didn't work out for me, brother, but that's alright, I'll do more research on it. Anyway, thank you so much for your help! I really appreciate it – Rafael Paz Mar 16 '17 at 06:32

score 1 · Accepted Answer · answered Feb 03 '18 at 03:16

This line:

try audioSession.setMode(AVAudioSessionModeMeasurement)

is probably the reason. It can throttle the volume so low that it sounds like it is off. Try:

try audioSession.setMode(AVAudioSessionModeDefault)

and see if it works.
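
As a rough sketch, the session setup from startRecording() could look like this with the mode changed, again using the newer Swift spellings. The function name configureSessionForRecognition is hypothetical. Combining the mode change with the play-and-record category from the other answer, and adding the .defaultToSpeaker option, are assumptions of this sketch rather than part of the suggestion above:

```swift
import AVFoundation

// Hypothetical helper replacing the do/catch block at the top of startRecording().
func configureSessionForRecognition() {
    let session = AVAudioSession.sharedInstance()
    do {
        // .default instead of .measurement, which can leave playback very quiet.
        // .defaultToSpeaker is an assumption: with .playAndRecord, output
        // otherwise defaults to the receiver (earpiece) on iPhone.
        try session.setCategory(.playAndRecord, mode: .default, options: [.defaultToSpeaker])
        try session.setActive(true, options: .notifyOthersOnDeactivation)
    } catch {
        print("audioSession properties weren't set because of an error: \(error)")
    }
}
```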

coco

Sorry for replying to this after such a long time. I haven't been on StackOverflow, so I didn't see your answer. Today I tried this solution and IT WORKED!!!! Thank you so much, mate! You saved my ass haha! I accepted this one as the answer! Thank you! – Rafael Paz Mar 12 '18 at 07:46
