I'm programming a personal assistant in Python on Windows, and it works badly. Sometimes I get error 13 (permission denied) on the file where the voice is stored. Other times it doesn't recognize my voice at all, and other times it takes a minute or more to recognize it. Looking at the code, what should I improve to make it work better?

import os
import time
import playsound
import speech_recognition as sr
from gtts import gTTS


def speak(text):
    tts = gTTS(text=text, lang="es-ES")
    filename = "voice.mp3"
    tts.save(filename)
    playsound.playsound(filename)


def get_audio():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        audio = r.listen(source)
        said = ""

        try:
            said = r.recognize_google(audio, language="es-ES")
            print(said)
        except Exception as e:
            print("Exception: " + str(e))

    return said

speak("Di algo")
get_audio()

1 Answer

Welcome, Samuel_05! I am also new around here.

To begin with, instead of saving the gTTS output to a file, we can use an io.BytesIO object (an in-memory binary stream that behaves like a file) to hold the TTS data pulled from Google. Unfortunately, the playsound module used in your code doesn't support playing audio from a file-like object. An alternative is pygame, which supports MP3 playback from a file-like object. Using a file-like object should solve your permission denied error, because voice.mp3 no longer has to be overwritten on disk while the player may still be holding it open.
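To make the "file-like object" idea concrete, here is a minimal sketch using only the standard library. The helper name roundtrip and the placeholder bytes are made up for illustration; they stand in for the MP3 data gTTS would write.

```python
from io import BytesIO


def roundtrip(payload: bytes) -> bytes:
    """Write payload to an in-memory buffer and read it back like a file."""
    with BytesIO() as buffer:
        buffer.write(payload)  # a library would call write() the same way
        buffer.seek(0)         # rewind; otherwise read() starts at the end
        return buffer.read()


# Placeholder bytes, not real audio (assumption for the demo).
fake_mp3 = b"ID3 placeholder bytes"
print(roundtrip(fake_mp3) == fake_mp3)
```

Nothing touches the disk here, so there is no file for Windows to lock between one call and the next.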

Code

import time
from io import BytesIO

import speech_recognition as sr
from gtts import gTTS
from pygame import mixer


# Adapted from:
# https://github.com/pndurette/gTTS/issues/26#issuecomment-607573170
def speak(text):
    with BytesIO() as f:
        tts = gTTS(text=text, lang="es-ES")
        tts.write_to_fp(f)  # Write the MP3 speech data into the buffer
        f.seek(0)  # Rewind so the mixer reads from the start
        mixer.music.load(f)
        mixer.music.play()
        # Keep the buffer open until playback ends; sleep instead of busy-waiting
        while mixer.music.get_busy():
            time.sleep(0.1)


def get_audio():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        audio = r.listen(source)
        said = ""

        try:
            said = r.recognize_google(audio, language="es-ES")
            print(said)
        except Exception as e:
            print("Exception: " + str(e))

    return said

mixer.init()
speak("Di algo")
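
For the slow or missed recognition, one thing that may help is calibrating for background noise and putting time limits on listen(). The sketch below is untested against your microphone. The helper name listen_once and its default values are my own assumptions, and the recognizer and source are passed in so the function can be tried without hardware. The calls adjust_for_ambient_noise, listen (with timeout and phrase_time_limit) and recognize_google come from speech_recognition.

```python
def listen_once(recognizer, source, language="es-ES",
                wait_seconds=5, max_phrase_seconds=10):
    """Hypothetical helper: return the recognized text, or "" on failure."""
    # Sample the room briefly so the energy threshold matches the noise level.
    recognizer.adjust_for_ambient_noise(source, duration=0.5)
    try:
        # timeout: give up if no speech starts within wait_seconds.
        # phrase_time_limit: stop recording after max_phrase_seconds.
        audio = recognizer.listen(source, timeout=wait_seconds,
                                  phrase_time_limit=max_phrase_seconds)
        return recognizer.recognize_google(audio, language=language)
    except Exception as e:
        print("Exception: " + str(e))
        return ""


# Usage with the real library (needs a microphone):
# import speech_recognition as sr
# with sr.Microphone() as source:
#     print(listen_once(sr.Recognizer(), source))
```

Without the limits, listen() waits until it decides the phrase has ended, which in a noisy room can take a long time.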