Realistic text to speech with Python that doesn't require internet?

Question

I'm trying to create an artificially intelligent program (nothing really big or special) and I wanted it to have a voice (who wouldn't?). I've looked into espeak, festival, and gTTS. They're nice and usable, but not realistic enough for me to really be proud of, if that makes sense. I've been looking for something more realistic, like this:

from gtts import gTTS

tts = gTTS(text='what to say', lang='en')
tts.save('/path/to/file.mp3')

gTTS works fine. I love it. It's realistic, but it requires Internet access. The issue is that I want my application to be as independent as possible, and I hate depending on Internet access.

Are there any other options?

PS: I'm currently running Linux, so your OS might have a different solution.

If speech recognition would be so easy, it wouldn't need internet. — user1767754, Jan 25 '18 at 08:38
A simple google search yielded this: https://pypi.python.org/pypi/pyttsx3/2.5 — Isma, Jan 25 '18 at 08:40
That, my friend, is the answer I was dreading. The future can't come fast enough :( — Edgecase, Jan 25 '18 at 08:40
I'll try that, Isma. If it works out, I'll answer my own question. Thanks :) — Edgecase, Jan 25 '18 at 08:41
Unfortunately, Isma, pyttsx3 sounds like an exact replica of espeak. Which would be nice for porting it over to another computer, but not for realism. Thanks anyways — Edgecase, Jan 25 '18 at 08:45
I added an answer; just comment under it so others know it is not what you are looking for. Also try to make that clear in your question, otherwise it is difficult to answer ;-) — Isma, Jan 25 '18 at 08:46
For those who don't mind targeting Mac OS only (with its builtin TTS engine), you can check: [Python Text to Speech in Macintosh](https://stackoverflow.com/q/12758591/45249). Actually, pyttsx uses it when run on Mac OS. — mouviciel, Jan 25 '18 at 09:03
That's a good point. Let me edit my question. I'm using Linux — Edgecase, Jan 25 '18 at 09:05
If you are so keen on getting a high-quality offline voice generator, you can train a CNN-RNN sound generator. That would require a dataset, but you can obtain lots of them on the web. — Eli Korvigo, Jan 25 '18 at 09:08
Guess I didn't get the notification for that last comment, so I'm only just now reading it. Thanks, Eli. I'll look into that. PS: Still haven't found a realistic text to speech program that doesn't require internet — Edgecase, Dec 24 '18 at 17:57
I was also unable to find a fast solution for Linux. Maybe this is a compromise: https://pypi.org/project/TTS/ — SL5net, Nov 13 '22 at 11:23

Isma · Accepted Answer · 2023-07-21T07:03:16.087

28

Try pyttsx3 2.5. According to the documentation:

gTTS works perfectly in Python 3, but it needs an internet connection since it relies on Google to get the audio data. Pyttsx, however, is completely offline, works seamlessly, and supports multiple TTS engines.

Works for Python 2 and 3

To install it:

pip install pyttsx3

Using it should be as simple as:

import pyttsx3
engine = pyttsx3.init()
engine.say("I will speak this text")
engine.runAndWait()

Edit 1 - Changing the voice

To get a less robotic voice, you can try changing the voice. First get the available voices:

voices = engine.getProperty('voices')

Then select one of them (here voice is one entry from that list):

engine.setProperty('voice', voice.id)

You can try the different available voices as explained in this question: Changing the voice with PYTTSX module in python.
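
As a minimal sketch of that idea, the helper below picks a voice by a case-insensitive substring of its name or id. The function names pick_voice and speak_with_voice are hypothetical, not part of pyttsx3; the sketch only assumes that each entry returned by getProperty('voices') has id and name attributes. Which voices exist depends on the engine installed on your machine.

```python
def pick_voice(voices, needle):
    """Return the id of the first voice whose name or id contains
    `needle` (case-insensitive), or None if nothing matches."""
    needle = needle.lower()
    for voice in voices:
        # name can be missing on some drivers, so fall back to ''
        haystack = f"{voice.name or ''} {voice.id or ''}".lower()
        if needle in haystack:
            return voice.id
    return None


def speak_with_voice(text, needle):
    """Speak `text` with the first matching voice (hypothetical helper)."""
    import pyttsx3  # imported lazily so pick_voice works without it

    engine = pyttsx3.init()
    voice_id = pick_voice(engine.getProperty('voices'), needle)
    if voice_id is not None:
        engine.setProperty('voice', voice_id)
    # If nothing matched, the engine keeps its default voice.
    engine.say(text)
    engine.runAndWait()
```

For example, speak_with_voice("I will speak this text", "english") would use the first voice whose name or id mentions "english", if there is one.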

Edit 2 - Selecting speech engine

The library supports the following engines:

sapi5 - SAPI5 on Windows
nsss - NSSpeechSynthesizer on Mac OS X
espeak - eSpeak on every other platform

If espeak is not very natural you can try sapi5 if you are on Windows or nsss if you are on Mac OS X.

You can specify the engine in the init method, e.g.:

pyttsx3.init(driverName='sapi5')

More info here: http://pyttsx3.readthedocs.io/en/latest/engine.html
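
If you want to make the choice explicit in code, a rough sketch could map sys.platform to the driver names listed above. The helper names default_driver and make_engine are hypothetical. As far as I know, calling pyttsx3.init() with no arguments already picks the platform default, so this is mainly useful if you want to log or override the choice.

```python
import sys


def default_driver(platform=sys.platform):
    """Map a sys.platform value to one of the pyttsx3 driver names
    listed above (sapi5, nsss, espeak)."""
    if platform.startswith('win'):
        return 'sapi5'   # Windows
    if platform == 'darwin':
        return 'nsss'    # Mac OS X
    return 'espeak'      # every other platform, e.g. Linux


def make_engine():
    """Create an engine with the driver chosen above (hypothetical helper)."""
    import pyttsx3  # imported lazily so default_driver works without it

    return pyttsx3.init(driverName=default_driver())
```

On Linux this still gives you espeak, so it will not make the voice any more natural there.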

edited Jul 21 '23 at 07:03

answered Jan 25 '18 at 08:45

Isma


I tested this and it works, but it sounds like an exact replica of espeak. It would be great for getting the voice to work on another computer that doesn't have espeak installed, but not for sounding realistic. I'm trying to find a voice that doesn't sound robotic. Or at least THAT robotic – Edgecase Jan 25 '18 at 08:48
Did you try changing the voice? I edited my answer. Unfortunately there don't seem to be many alternatives in Python to do that. If you can move to C#, there are other solutions. – Isma Jan 25 '18 at 08:51
I didn't even know you could change the voices with that module. Let me try it out – Edgecase Jan 25 '18 at 08:51
Just tested it, and it doesn't work as expected. Instead of changing the quality of the voice, or even the gender, it changes the accent of the voice. Some of them are a little humorous. But since it's not any less robotic, I won't mark this answer as solved, in the hope that someone has the solution I'm looking for. – Edgecase Jan 25 '18 at 08:59
I added a new edit, try to change the engine to get a more realistic result. – Isma Jan 25 '18 at 09:05
The sapi5 engine would work if I was running Windows, but I'm running Linux unfortunately. – Edgecase Jan 25 '18 at 09:12
If there's not a Linux solution for a less robotic voice, I'll probably just stick with espeak for the moment and mark this answer as the solution. – Edgecase Jan 25 '18 at 09:15
Then you need espeak, I guess... – Isma Jan 25 '18 at 09:15
Espeak it is. It was worth a shot. Thanks for the help :) – Edgecase Jan 25 '18 at 09:16
How do you save an audio file of the speech with pyttsx3? – joejoejoejoe4 Sep 24 '18 at 05:54
I'm not 100% sure about that, but espeak uses the same exact voice. If you're running Linux, you could use the command "espeak -w filename.wav 'what to say'" – Edgecase Dec 24 '18 at 17:55
But you would have to install espeak first, of course. "sudo apt-get install espeak" – Edgecase Dec 24 '18 at 17:55
How can I add a new voice? It gives me only 3 options, and I want to add some new ones. – Lidor Eliyahu Shelef Nov 30 '19 at 20:54
Is `NSSpeechSynthesizer` the same as the one used by the VoiceOver functionality in macos? – xuiqzy Oct 24 '20 at 13:06
@LidorEliyahuShelef To add new voice try https://puneet166.medium.com/how-to-added-more-speakers-and-voices-in-pyttsx3-offline-text-to-speech-812c83d14c13 – Awal Nov 16 '22 at 04:20
Does anyone have any idea how to make the voice less robotic or more human-like? – Awal Nov 16 '22 at 04:23
If espeak is not very natural you can try sapi5 if you are on Windows or nsss if you are on Mac OS X. You can specify the engine in the init method. – Isma Nov 16 '22 at 10:14
You can try https://pypi.org/project/edge-tts/ which offers natural voices! – ferluis May 27 '23 at 00:00
i did not know that you could use `;` in python :) – jp_ Jul 20 '23 at 06:55
