How to find if espeak ended the speech?

Question

I want to use espeak in my program. I'd like to know when espeak stops speaking. Are there any flags or functions to check?

Suppose my program looks like this:

Line 1
espeak
Line 2

When I run this code, espeak starts to say "hello, this is espeak", but Line 2 is executed before it finishes, which I don't want. I am looking for a way to pause the program until espeak has finished speaking.

EDIT: This is my complete code. I use pocketsphinx to recognize what the user says, store the result in char* hyp, and pass it to espeak through the speech function.

static ps_decoder_t *ps;
static cmd_ln_t *config;
static FILE *rawfd;

espeak_POSITION_TYPE position_type;
espeak_AUDIO_OUTPUT output;
char *path=NULL;
int Buflength = 1000, Options=0;
void* user_data;
char Voice[] = {"English"};
char text2[30] = {"this is a english test"};
unsigned int Size,position=0, end_position=0, flags=espeakCHARS_AUTO, *unique_identifier;
t_espeak_callback *SynthCallback;
espeak_PARAMETER Parm;
//char* text;

static void initFuncs()
{

    output = AUDIO_OUTPUT_PLAYBACK;
    espeak_Initialize(output, Buflength, path, Options ); 
    espeak_SetVoiceByName(Voice);
    const char *langNativeString = "en";
    espeak_VOICE voice;
    memset(&voice, 0, sizeof(espeak_VOICE));
    voice.languages = langNativeString;
    voice.name = "US";
    voice.variant = 2;
    voice.gender = 1;
    espeak_SetVoiceByProperties(&voice);

}

static void sleep_msec(int32 ms)
{

    struct timeval tmo;

    tmo.tv_sec = 0;
    tmo.tv_usec = ms * 1000;

    select(0, NULL, NULL, NULL, &tmo);

}

static void speech(char* hyp)
{

    Size = strlen(hyp)+1;
    espeak_Synth( hyp, Size, position, position_type, end_position, flags,unique_identifier, user_data );
    espeak_Synchronize( );

}

static void recognize_from_microphone()
{
    ad_rec_t *ad;
    int16 adbuf[2048];
    uint8 utt_started, in_speech;
    int32 k;
    char  *hyp;

    if ((ad = ad_open_dev(cmd_ln_str_r(config, "-adcdev"),(int) cmd_ln_float32_r(config,"-samprate"))) == NULL)
        E_FATAL("Failed to open audio device\n");
    if (ad_start_rec(ad) < 0)
        E_FATAL("Failed to start recording\n");

    if (ps_start_utt(ps) < 0)
        E_FATAL("Failed to start utterance\n");

    utt_started = FALSE;
    E_INFO("Ready....\n");

    for (;;) {

        ad_start_rec(ad);

        if ((k = ad_read(ad, adbuf, 2048)) < 0)
            E_FATAL("Failed to read audio\n");
        ps_process_raw(ps, adbuf, k, FALSE, FALSE);
        in_speech = ps_get_in_speech(ps);
        if (in_speech && !utt_started) {
            utt_started = TRUE;
            E_INFO("Listening...\n");
        }
        if (!in_speech && utt_started) {

            ps_end_utt(ps);
            hyp = (char*)ps_get_hyp(ps, NULL );
            if (hyp != NULL) {

                ad_stop_rec(ad);
                speech(hyp);
                printf("%s\n", hyp); 
                fflush(stdout);
            }

            if (ps_start_utt(ps) < 0)
                E_FATAL("Failed to start utterance\n");
            utt_started = FALSE;
            E_INFO("Ready....\n");

        }

    }//for loop
    ad_close(ad);
}

int main(int argc, char *argv[])
{
    initFuncs();

    config = cmd_ln_init(NULL, ps_args(), TRUE,
                 "-hmm", MODELDIR "/en-us/en-us",
                     "-lm", MODELDIR "/en-us/en-us.lm.bin",
                     "-dict", MODELDIR "/en-us/cmudict-en-us.dict",
                     NULL);
    ps = ps_init(config);
    recognize_from_microphone();

    ps_free(ps);
    cmd_ln_free_r(config);

    return 0;
}

"The MSG_TERMINATED event is the last event. It can inform the calling program to clear the user data related to the message. So if the synthesis must be stopped, the callback function is called for each pending message with the MSG_TERMINATED event. " from http://espeak.sourceforge.net/speak_lib.h — Thomas Sablik, Apr 27 '18 at 12:34
@ThomasSablik: I edited my question and added the full code. — Hasani, Apr 27 '18 at 14:32
I tried to use the line `if( espeak_EVENT_TYPE == espeakEVENT_MSG_TERMINATED)` in my code, but it gives me an error — Hasani, Apr 27 '18 at 14:44
error: expected primary-expression before ‘==’ token if( espeak_EVENT_TYPE == espeakEVENT_MSG_TERMINATED) — Hasani, Apr 27 '18 at 15:59
Can you show me how you tried to use this snippet in your code? — Thomas Sablik, Apr 28 '18 at 11:28
Your `SynthCallback` is not set. This is your callback function. You have to define a callback function and register it with `ESPEAK_API void espeak_SetSynthCallback(t_espeak_callback* SynthCallback);` before any synthesis functions are called. You really should read the documentation. — Thomas Sablik, Apr 28 '18 at 11:37

score 1 · Answer 1 · answered Apr 28 '18 at 13:00

I adapted the espeak part of your code. In this version espeak finishes before Line 2 begins, and the callback functionality is implemented as well. You set a voice by name and then again by properties; that may be a problem. You also work with C-style strings rather than std::string, so you may be calculating the wrong string length. I don't know exactly where the problem in your code is, but the following code fixed it:

#include <string>
#include <iostream>
#include <espeak/speak_lib.h>

espeak_POSITION_TYPE position_type(POS_CHARACTER);
espeak_AUDIO_OUTPUT output(AUDIO_OUTPUT_PLAYBACK);
void* user_data;
std::string voice("English");
std::string text("this is a english test");
unsigned int Size(0);
unsigned int position(0);
unsigned int end_position(0);
unsigned int flags(espeakCHARS_AUTO);
unsigned int* unique_identifier;

static void initFuncs() {
  espeak_Initialize(output, 0, 0, 0);
  espeak_SetVoiceByName(voice.c_str());
}

int SynthCallback(short *wav, int numsamples, espeak_EVENT *events) {
  std::cout << "Callback: ";
  for (unsigned int i(0); events[i].type != espeakEVENT_LIST_TERMINATED; i++) {
    if (i != 0) {
      std::cout << ", ";
    }
    switch (events[i].type) {
      case espeakEVENT_LIST_TERMINATED:
        std::cout << "espeakEVENT_LIST_TERMINATED";
        break;
      case espeakEVENT_WORD:
        std::cout << "espeakEVENT_WORD";
        break;
      case espeakEVENT_SENTENCE:
        std::cout << "espeakEVENT_SENTENCE";
        break;
      case espeakEVENT_MARK:
        std::cout << "espeakEVENT_MARK";
        break;
      case espeakEVENT_PLAY:
        std::cout << "espeakEVENT_PLAY";
        break;
      case espeakEVENT_END:
        std::cout << "espeakEVENT_END";
        break;
      case espeakEVENT_MSG_TERMINATED:
        std::cout << "espeakEVENT_MSG_TERMINATED";
        break;
      case espeakEVENT_PHONEME:
        std::cout << "espeakEVENT_PHONEME";
        break;
      case espeakEVENT_SAMPLERATE:
        std::cout << "espeakEVENT_SAMPLERATE";
        break;
      default:
        break;
    }
  }
  std::cout << std::endl;
  return 0;
}

static void speech(std::string hyp) {
    Size = hyp.length() + 1;  // byte count including the terminating '\0'
    espeak_SetSynthCallback(SynthCallback);
    espeak_Synth(hyp.c_str(), Size, position, position_type, end_position, flags,unique_identifier, user_data );
    espeak_Synchronize( );
}

int main() {
  initFuncs();
  std::cout << "Start" << std::endl;
  speech(text.c_str());
  std::cout << "End" << std::endl;
  return 0;
}

The output is:

Start
Callback: espeakEVENT_SENTENCE
Callback: espeakEVENT_WORD
Callback: espeakEVENT_WORD
Callback: espeakEVENT_WORD
Callback: espeakEVENT_WORD
Callback: espeakEVENT_WORD
Callback: espeakEVENT_END
Callback: espeakEVENT_MSG_TERMINATED
End

The timing of the console output matches the audio output. When you are working in C++, you should use its tools and features, such as std::string, std::cout instead of printf, and smart pointers, to avoid problems like this.
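On the string-length point: as far as I can tell from speak_lib.h, the size argument of espeak_Synth is meant to be equal to or greater than the size of the text data in bytes, and examples commonly pass strlen(text) + 1 so that the terminating null byte is counted. The sketch below only illustrates that byte count for both string styles; `synth_size` is a hypothetical helper name, not part of espeak.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: bytes needed for a C string, including the '\0'.
std::size_t synth_size(const char* text) {
    return std::strlen(text) + 1;
}

// Same for std::string: length() does not count the terminator
// that c_str() appends, so add one byte for it.
std::size_t synth_size(const std::string& text) {
    return text.length() + 1;
}
```

With either overload, the result could be passed as the size argument alongside `hyp` or `hyp.c_str()`.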

Sorry, I had no internet access until now. I am going to read your answer now. Thank you for that, and I hope it works. — Hasani, Apr 29 '18 at 11:28
I tested your code, but I think there is a problem. The sequence I want is this: 1. pocketsphinx recognizes a word and stores it in `hyp`. 2. I pass `hyp` to `espeak` through the `speech()` function. 3. `espeak` speaks `hyp`, and the rest of the code after `speech()` is paused meanwhile. But if I have to call `espeak_SetSynthCallback()` before `espeak_Synth`, I cannot achieve step 3, and the program is not paused until espeak finishes speaking! — Hasani, Apr 29 '18 at 20:19
With `espeak_SetSynthCallback()` you set the callback function for `espeak_Synth`. The rest of your code is started by the callback function when the event is `espeakEVENT_MSG_TERMINATED`. This means that espeak has finished speaking and you can continue your program. — Thomas Sablik, Apr 30 '18 at 07:01
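The idea in the last comment, continuing only once the terminating event has arrived, can be pictured without linking against espeak. The following is a minimal sketch under stated assumptions: `FakeEvent`, `SpeechDone`, `on_event`, `fake_synth` and `speak_and_wait` are all hypothetical stand-ins, with `on_event` playing the role of the synth callback and `FakeEvent::MsgTerminated` standing in for `espeakEVENT_MSG_TERMINATED`. The caller blocks on a condition variable until the callback flags completion.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Stand-in event types; real code would use the event types from speak_lib.h.
enum class FakeEvent { Word, End, MsgTerminated };

// Completion flag shared between the callback and the waiting caller.
struct SpeechDone {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
};

// Plays the role of the synth callback: flags completion on the final event.
void on_event(SpeechDone& state, FakeEvent ev) {
    if (ev == FakeEvent::MsgTerminated) {
        {
            std::lock_guard<std::mutex> lock(state.m);
            state.done = true;
        }
        state.cv.notify_one();
    }
}

// Hypothetical asynchronous synthesizer: returns immediately and
// delivers the events to on_event from a worker thread.
std::thread fake_synth(SpeechDone& state, std::vector<FakeEvent> events) {
    return std::thread([&state, events] {
        for (FakeEvent ev : events) on_event(state, ev);
    });
}

// Starts the "speech" and blocks until MsgTerminated has been seen.
// Note: this waits forever if the event list never contains MsgTerminated.
bool speak_and_wait(std::vector<FakeEvent> events) {
    SpeechDone state;
    std::thread worker = fake_synth(state, std::move(events));
    {
        std::unique_lock<std::mutex> lock(state.m);
        state.cv.wait(lock, [&state] { return state.done; });
    }
    worker.join();
    return state.done;
}
```

Code placed after a call such as `speak_and_wait({FakeEvent::Word, FakeEvent::End, FakeEvent::MsgTerminated})` runs only after the terminating event. In the real program this extra machinery may not be needed: speak_lib.h describes `espeak_Synchronize()` as returning once all data has been spoken, so calling it right after `espeak_Synth` should already pause the caller.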
