
I'm trying to run text-to-speech (AVSpeechSynthesizer) alongside speech-to-text (SFSpeechRecognizer, from the Speech framework), but I'm stuck.

My TTS works perfectly until I run the STT code; after that, the TTS stops working. I stepped through the code in the debugger and no errors occur while it executes, but the text is never spoken. My theory is that the STT is somehow disabling audio output, which is why the TTS no longer produces speech, but that's only a guess. Note: the TTS stops working, but the STT keeps working perfectly.

Any tips?

Here's my view controller's code:

@IBOutlet weak var microphoneButton: UIButton!

//text to speech
let speechSynthesizer = AVSpeechSynthesizer()

//speech to text
private var speechRecognizer: SFSpeechRecognizer!

private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private var audioEngine = AVAudioEngine()

@IBAction func textToSpeech(_ sender: Any) {

    if let word = wordTextField.text{

        if !speechSynthesizer.isSpeaking {


            //get current dictionary
            let dictionary = fetchSelectedDictionary()

            //get current language
            let language = languagesWithCodes[(dictionary?.language)!]

            let speechUtterance = AVSpeechUtterance(string: word)
            speechUtterance.voice = AVSpeechSynthesisVoice(language: language)
            speechUtterance.rate = 0.4
            //speechUtterance.pitchMultiplier = pitch
            //speechUtterance.volume = volume
            speechSynthesizer.speak(speechUtterance)

        }
        else{
            speechSynthesizer.continueSpeaking()
        }

    }
}

@IBAction func speechToText(_ sender: Any) {

    if audioEngine.isRunning {
        audioEngine.stop()
        recognitionRequest?.endAudio()
        microphoneButton.isEnabled = false
        microphoneButton.setTitle("Start Recording", for: .normal)
    } else {
        startRecording()
        microphoneButton.setTitle("Stop Recording", for: .normal)
    }

}

func startRecording() {

    if recognitionTask != nil {
        recognitionTask?.cancel()
        recognitionTask = nil
    }

    let audioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setCategory(AVAudioSessionCategoryRecord)
        try audioSession.setMode(AVAudioSessionModeMeasurement)
        try audioSession.setActive(true, with: .notifyOthersOnDeactivation)
    } catch {
        print("audioSession properties weren't set because of an error.")
    }

    recognitionRequest = SFSpeechAudioBufferRecognitionRequest()

    guard let inputNode = audioEngine.inputNode else {
        fatalError("Audio engine has no input node")
    }

    guard let recognitionRequest = recognitionRequest else {
        fatalError("Unable to create an SFSpeechAudioBufferRecognitionRequest object")
    }

    recognitionRequest.shouldReportPartialResults = true

    recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: { (result, error) in

        var isFinal = false

        if result != nil {

            self.wordTextField.text = result?.bestTranscription.formattedString
            isFinal = (result?.isFinal)!
        }

        if error != nil || isFinal {
            self.audioEngine.stop()
            inputNode.removeTap(onBus: 0)

            self.recognitionRequest = nil
            self.recognitionTask = nil

            self.microphoneButton.isEnabled = true
        }
    })

    let recordingFormat = inputNode.outputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
        self.recognitionRequest?.append(buffer)
    }

    audioEngine.prepare()

    do {
        try audioEngine.start()
    } catch {
        print("audioEngine couldn't start because of an error.")
    }

    wordTextField.text = "Say something, I'm listening!"
}

}

Rafael Paz

2 Answers


This is probably because your audio session is in the Record category, which does not allow playback. You have two solutions. The first is to change try audioSession.setCategory(AVAudioSessionCategoryRecord) to use AVAudioSessionCategoryPlayAndRecord (this will work). A cleaner way is to put the speaking code in a separate function and set the category to AVAudioSessionCategoryPlayback there.
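A minimal sketch of the second option, assuming current Swift naming (`.record`, `.playback`) rather than the Swift 3 constants used in the question. The helper name `configureAudioSession(forRecording:)` is hypothetical and not part of the original code:

```swift
import AVFoundation

/// Hypothetical helper: switches the shared audio session to the
/// category needed for the next operation.
func configureAudioSession(forRecording recording: Bool) {
    let session = AVAudioSession.sharedInstance()
    do {
        if recording {
            // Input only: playback is silenced while this category is active.
            try session.setCategory(.record, mode: .measurement, options: [])
        } else {
            // Output only: lets AVSpeechSynthesizer be heard again.
            try session.setCategory(.playback, mode: .default, options: [])
        }
        try session.setActive(true)
    } catch {
        print("Could not configure the audio session: \(error)")
    }
}
```

One way to use it would be to call `configureAudioSession(forRecording: true)` at the top of `startRecording()` and `configureAudioSession(forRecording: false)` just before `speechSynthesizer.speak(...)`, once the audio engine has stopped.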

Hope this helped.

Yann Massard
  • brother, you just saved my life! Thank you so much, worked like a charm! – Rafael Paz Mar 14 '17 at 07:49
  • Brother, just one more thing. Do you know how I can get my iPhone's speaker back to its normal volume? Before I ran the speech-to-text, the volume was high, but after running the code with your fix applied, the text-to-speech volume is very low. Thank you so much for helping me out :) – Rafael Paz Mar 14 '17 at 08:03
  • Yup, try this: do { try AVAudioSession.sharedInstance().overrideOutputAudioPort(AVAudioSessionPortOverride.speaker) } catch _ { } – Yann Massard Mar 14 '17 at 22:37
  • That didn't work for me, brother, but that's alright, I'll do more research on it. Anyway, thank you so much for your help! I really appreciate it – Rafael Paz Mar 16 '17 at 06:32

This line:

try audioSession.setMode(AVAudioSessionModeMeasurement)

is probably the reason. It can cause the volume to be throttled so low that it sounds like it is off. Try:

try audioSession.setMode(AVAudioSessionModeDefault)

and see if it works.
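Put together with the category change from the other answer, the session setup might look like the sketch below. It assumes current Swift naming; the function name is hypothetical, and the `.defaultToSpeaker` option is an addition not mentioned in this thread. It is included because the play-and-record category routes output to the quieter receiver on iPhone by default.

```swift
import AVFoundation

/// Hypothetical helper: one session configuration that allows both
/// speech recognition (input) and speech synthesis (output).
func activateSessionForSpeechInAndOut() {
    let audioSession = AVAudioSession.sharedInstance()
    do {
        // Default mode instead of measurement mode, as suggested above.
        // .defaultToSpeaker is an assumption, not something from the thread.
        try audioSession.setCategory(.playAndRecord,
                                     mode: .default,
                                     options: [.defaultToSpeaker])
        try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
    } catch {
        print("audioSession properties weren't set because of an error: \(error)")
    }
}
```

Whether the extra option is needed may depend on the device and route, so it is worth testing the mode change on its own first.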

coco
  • Sorry for replying to this after such a long time. I haven't been on Stack Overflow, so I didn't see your answer. Today I tried this solution and IT WORKED!!!! Thank you so much, mate! You saved my ass haha! I accepted this one as the answer! Thank you! – Rafael Paz Mar 12 '18 at 07:46