My Python CSV-to-MP3 text-to-speech script is not passing text to the API correctly. How can I fix this?

Question

I've been working on a CSV-to-MP3 generator to create audio resources for learning French verbs. The script takes data from a CSV spreadsheet and uses gTTS to generate French audio for the verbs and English audio for the translations. The clips are then concatenated into a single MP3 file that covers the main verb tenses.

It had been working, but now it fails with the error "No text to send to TTS API". I printed the text strings in the terminal, so the script is able to read the text from the CSV file.

I'd appreciate any suggestions. The script is below. Here is a pastebin of my CSV file, which contains the verb data. It must be encoded as ANSI and not UTF-8 for the TTS to read letters with accents: https://pastebin.com/nDuKHQCM
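
On the encoding point: `open()` without an `encoding` argument uses the platform's default encoding, which on Windows is often cp1252 (what Notepad calls "ANSI"). That may be why only the ANSI file reads accented letters correctly. A minimal sketch, assuming that is the cause; `read_rows` is a hypothetical helper, not part of the script:

```python
import csv

def read_rows(path, encoding="cp1252"):
    """Read a CSV file into a list of rows using an explicit encoding.

    The cp1252 default is an assumption about what "ANSI" means on this machine.
    """
    # newline="" is the setting the csv module documentation recommends for reader input
    with open(path, newline="", encoding=encoding) as csvfile:
        return list(csv.reader(csvfile))
```

If the file were saved as UTF-8 instead, passing `encoding="utf-8-sig"` should read it as well, and also drops a leading byte order mark if one is present.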

from gtts import gTTS
from pydub import AudioSegment
import csv
import os

def generate_audio(text, language):
    if not text:
        return AudioSegment.silent(duration=1000)
    tts = gTTS(text=text, lang=language)
    tts.save("temp.mp3")
    return AudioSegment.from_mp3("temp.mp3")

verb_tenses = []
with open("C:\\Users\\user\\Desktop\\faire.csv") as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        verb_tenses.append(row)

file_name = os.path.splitext(os.path.basename(csvfile.name))[0]
file_name_audio = generate_audio(file_name, "fr")
final_audio = file_name_audio + AudioSegment.silent(duration=2000)

for tense in verb_tenses:
    if len(tense) == 1:
        tense_audio = generate_audio(tense[0], "fr")
        final_audio = final_audio + AudioSegment.silent(duration=1000) + tense_audio + AudioSegment.silent(duration=1000)
    else:
        for i in range(0, len(tense), 2):
            verb = tense[i]
            translation = tense[i + 1]
            verb_audio = generate_audio(verb, "fr")
            translation_audio = generate_audio(translation, "en")
            final_audio = final_audio + verb_audio + translation_audio

final_audio.export("verb_tenses.mp3", format="mp3")
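
One thing worth noting about the inner loop: it reads `tense[i + 1]` for every even index, so a row with an odd number of cells raises an IndexError, and cells that hold only spaces are passed on as text. A minimal sketch of a more tolerant pairing step, assuming rows are laid out as verb, translation, verb, translation; `iter_pairs` is a hypothetical helper:

```python
def iter_pairs(row):
    """Yield (verb, translation) pairs from a row laid out as verb, translation, ..."""
    cells = [cell.strip() for cell in row]
    # Stop one short of the end so an unpaired trailing cell never raises IndexError
    for i in range(0, len(cells) - 1, 2):
        verb, translation = cells[i], cells[i + 1]
        # Skip pairs where either side is blank after stripping whitespace
        if verb and translation:
            yield verb, translation
```

The loop body could then become `for verb, translation in iter_pairs(tense):`. Skipping blank pairs is a design choice; the script as posted inserts a second of silence for an empty cell instead.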

Printing the text strings in the terminal works without any issues. Does the free version of gTTS have a limit on characters? There is mention of a workaround here, but I don't understand how best to apply it to my use case: https://stackoverflow.com/a/71868861/19089358
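
As far as I can tell, gTTS splits long text into smaller chunks on its own, so a character limit seems an unlikely cause. The message "No text to send to TTS API" appears to be raised when nothing is left once gTTS has stripped whitespace and punctuation from the input, and the `if not text` check lets strings such as " " or "-" through. A minimal sketch of a stricter check, assuming that is what is happening; `has_speakable_text` is a hypothetical helper:

```python
def has_speakable_text(text):
    """Return True if text contains at least one letter or digit."""
    # None and "" fall through to an empty iteration, so the result is False
    return any(ch.isalnum() for ch in (text or ""))

# Possible use inside generate_audio, replacing the `if not text` check:
#     if not has_speakable_text(text):
#         return AudioSegment.silent(duration=1000)
```

Printing each cell with `repr()` rather than plain `print()` would also show stray spaces or punctuation-only cells that look empty in the terminal.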

The goal of this script is to generate a full reading of the verbs in the CSV file and their respective translations. Each language is read in its own Google TTS voice: French and English.

As an added bonus, it would be nice to modify the script to process a folder of CSV files, one per verb, and produce a separate MP3 for each.
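
For the folder idea, one approach is to work out the input and output paths first and then run the existing per-file logic once per pair. A minimal sketch, assuming every CSV in the folder holds one verb and each MP3 should sit next to its CSV under the same name; `plan_outputs` is a hypothetical helper:

```python
from pathlib import Path

def plan_outputs(folder):
    """Return (csv_path, mp3_path) pairs for every CSV file in folder."""
    folder = Path(folder)
    # sorted() keeps the processing order stable between runs
    return [(p, p.with_suffix(".mp3")) for p in sorted(folder.glob("*.csv"))]

# Possible use, with the per-file audio building moved into a function of your own:
#     for csv_path, mp3_path in plan_outputs("verbs"):
#         audio = build_audio(csv_path)   # hypothetical: wraps the script's main loop
#         audio.export(str(mp3_path), format="mp3")
```

Here `csv_path.stem` gives the verb name (for example "faire"), which could replace the `os.path.splitext` call when generating the spoken title.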
