Hello, I've been looking for a way to play and record audio on a Linux (preferably Ubuntu) system. I'm currently working on a front-end to a voice recognition toolkit that'll automate a few steps required to adapt a voice model for PocketSphinx and Julius.

Suggestions of alternative means of audio input/output are welcome, as well as a fix to the bug shown below.

Here is the current code I've used so far to play a .WAV file:

#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>
#include <pulse/simple.h>
#include <pulse/error.h>

void Engine::sayText ( const string OutputText ) {
    string audioUri = "temp.wav";
    string requestUri = this->getRequestUri( OPENMARY_PROCESS , OutputText.c_str( ) );
    int error = 0 , audioStream = -1;
    pa_simple *pulseConnection = NULL;
    pa_sample_spec simpleSpecs;
    /* Hard-coded format: assumed, not read from the WAV header */
    simpleSpecs.format = PA_SAMPLE_S16LE;
    simpleSpecs.rate = 44100;
    simpleSpecs.channels = 2;

    eprintf( E_MESSAGE , "Generating audio for '%s' from '%s'..." , OutputText.c_str( ) , requestUri.c_str( ) );
    FILE* audio = this->getHttpFile( requestUri , audioUri );
    if ( audio )
        fclose( audio );
    eprintf( E_MESSAGE , "Generated audio.");

    if ( ( audioStream = open( audioUri.c_str( ) , O_RDONLY ) ) < 0 ) {
        fprintf( stderr , __FILE__": open() failed: %s\n" , strerror( errno ) );
        goto finish;
    }

    pulseConnection = pa_simple_new( NULL , "AudioPush" , PA_STREAM_PLAYBACK , NULL , "openMary C++" , &simpleSpecs , NULL , NULL , &error );
    if ( !pulseConnection ) {
        fprintf( stderr , __FILE__": pa_simple_new() failed: %s\n" , pa_strerror( error ) );
        goto finish;
    }

    for ( int i = 0 ; ; i++ ) {
        uint8_t audioBuffer[1024];
        ssize_t r;
        eprintf( E_MESSAGE , "Buffering %d.." , i );
        /* Read some data (raw file bytes, WAV header included) ... */
        if ( ( r = read( audioStream , audioBuffer , sizeof( audioBuffer ) ) ) <= 0 ) {
            if ( r == 0 ) /* EOF */
                break;

            eprintf( E_ERROR , __FILE__": read() failed: %s\n" , strerror( errno ) );
            goto finish;
        }

        /* ... and play it; pa_simple_write() blocks, so no sleep is needed */
        if ( pa_simple_write( pulseConnection , audioBuffer , ( size_t ) r , &error ) < 0 ) {
            fprintf( stderr , __FILE__": pa_simple_write() failed: %s\n" , pa_strerror( error ) );
            goto finish;
        }
    }

    /* Make sure that every single sample was played */
    if ( pa_simple_drain( pulseConnection , &error ) < 0 )
        fprintf( stderr , __FILE__": pa_simple_drain() failed: %s\n" , pa_strerror( error ) );

finish:
    /* Single cleanup path for both success and failure */
    if ( audioStream >= 0 )
        close( audioStream );
    if ( pulseConnection )
        pa_simple_free( pulseConnection );
}

NOTE: If you want the rest of the code to this file, you can download it here directly from Launchpad.

Update: I tried using GStreamermm, but this doesn't work either:

    Glib::RefPtr<Pipeline> pipeline;
    Glib::RefPtr<Element> sink, filter, source;
    Glib::RefPtr<Gio::File> audioSrc = Gio::File::create_for_path(uri);

    pipeline = Pipeline::create("audio-playback");
    source = ElementFactory::create_element("alsasrc","source");
    filter = ElementFactory::create_element("identity","filter");
    sink = ElementFactory::create_element("alsasink","sink");
    //sink->get_property("file",audioSrc);
    if (!source || !filter || !sink){
        showErrorDialog("Houston!","We got a problem.");
        return;
    }
    pipeline->add(source)->add(filter)->add(sink);
    source->link(filter)->link(sink);

    pipeline->set_state(Gst::STATE_PLAYING);
    showInformation("Close this to stop recording");
    pipeline->set_state(Gst::STATE_PAUSED);
Brent Bradburn
jackyalcine
  • This would be a better question for [stackoverflow](http://stackoverflow.com/). – Michael K Feb 08 '11 at 14:07
  • I'd consider the gstreamer library for playing and recording. But I suppose that Pulse should also have a recording option? – Petriborg Feb 08 '11 at 14:17
  • @Michael I posted it on the programming one, because I assumed it was programming related. @Petriborg Can you demonstrate a means of doing it or a **good** link on how to do so? GStreamer seems to only support OGG and CMUSphinx needs WAV files. – jackyalcine Feb 08 '11 at 14:27
  • Where is the bug? What do you get when you try to compile? What do you want in an alternative? – Tom Feb 08 '11 at 15:13
  • @Tom it's not a programming bug; it's more like missing code. When it's compiled and run (you can try by heading to the Launchpad, downloading and running), instead of hearing 'Welcome to Speech', you get PEEEP! And I want something that works :) – jackyalcine Feb 08 '11 at 19:15

2 Answers

The "Hello World" application in the GStreamer documentation shows how to play an Ogg/Vorbis file. To make this work with WAV files, you can simply replace "oggdemux" with "wavparse" and replace "vorbisdec" with "identity" (the identity plugin does nothing -- it's just a placeholder).
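For context, wavparse is needed because a WAV file is not raw audio: it starts with a RIFF header that states the channel count, sample rate, and sample width, and only then come the samples. Here is a minimal sketch in plain C++ (no GStreamer involved) of reading those fields. `WavFormat` and `parseWavHeader` are names made up for this illustration, and the sketch assumes an uncompressed PCM file that has already been loaded into memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical holder for the fields a player needs from a WAV header.
struct WavFormat {
    uint16_t channels;
    uint32_t sampleRate;
    uint16_t bitsPerSample;
    size_t   dataOffset;   // byte index where the raw samples begin
};

// WAV stores all integers little-endian.
static uint16_t readLE16(const uint8_t* p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

static uint32_t readLE32(const uint8_t* p) {
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Walk the RIFF chunks until the "fmt " and "data" chunks have been seen.
WavFormat parseWavHeader(const std::vector<uint8_t>& bytes) {
    if (bytes.size() < 12 ||
        std::memcmp(bytes.data(), "RIFF", 4) != 0 ||
        std::memcmp(bytes.data() + 8, "WAVE", 4) != 0) {
        throw std::runtime_error("not a RIFF/WAVE file");
    }

    WavFormat fmt = {0, 0, 0, 0};
    bool haveFmt = false;
    size_t pos = 12;   // first chunk follows the 12-byte RIFF preamble

    while (pos + 8 <= bytes.size()) {
        const uint8_t* chunk = bytes.data() + pos;
        const uint32_t size = readLE32(chunk + 4);

        if (std::memcmp(chunk, "fmt ", 4) == 0) {
            if (size < 16 || pos + 8 + 16 > bytes.size()) {
                throw std::runtime_error("truncated fmt chunk");
            }
            fmt.channels      = readLE16(chunk + 10);
            fmt.sampleRate    = readLE32(chunk + 12);
            fmt.bitsPerSample = readLE16(chunk + 22);
            haveFmt = true;
        } else if (std::memcmp(chunk, "data", 4) == 0) {
            if (!haveFmt) {
                throw std::runtime_error("data chunk before fmt chunk");
            }
            fmt.dataOffset = pos + 8;
            return fmt;
        }
        // Chunks are padded to an even number of bytes.
        pos += 8 + static_cast<size_t>(size) + (size & 1u);
    }
    throw std::runtime_error("no data chunk found");
}
```

If you push samples to an audio API yourself instead of using wavparse, values like these are what the playback format has to match, and writing should start at `dataOffset` rather than at byte 0.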

To install development support for GStreamer (on Ubuntu)...

sudo apt-get install libgstreamer0.10-dev

You need the following on the gcc command-line to enable the use of GStreamer libraries...

$(pkg-config --cflags --libs gstreamer-0.10)

By the way, you may find it useful to use "gst-launch" for prototyping GStreamer pipelines before writing the code.

## recording
gst-launch-0.10 autoaudiosrc ! wavenc ! filesink location=temp.wav

## playback
gst-launch-0.10 filesrc location=temp.wav ! wavparse ! autoaudiosink

A feature of GStreamer that may be useful for voice recognition is that it is easy to insert audio quality filters into a pipeline -- so you could, for example, reduce noise that might otherwise be in the recording. A pointer to a list of the GStreamer "good" plugins is here.
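As a rough sketch of how such a filter slots into a pipeline, the helpers below only build gst-launch style description strings; they do not call GStreamer themselves. The function names are hypothetical. I'm also assuming that the `volume` element (with `audioconvert` in front of it) is available from the base plugins, and that the parser accepts double-quoted values with backslash escapes. The resulting string could then be handed to something like gst_parse_launch().

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Wrap a property value in double quotes, escaping embedded quotes and
// backslashes (assumed to be what the pipeline parser expects).
std::string quoteValue(const std::string& value) {
    std::string out = "\"";
    for (char c : value) {
        if (c == '"' || c == '\\') {
            out += '\\';
        }
        out += c;
    }
    out += '"';
    return out;
}

// Recording: microphone -> WAV encoder -> file.
std::string recordingPipeline(const std::string& wavPath) {
    return "autoaudiosrc ! wavenc ! filesink location=" + quoteValue(wavPath);
}

// Playback: file -> WAV parser -> (optional volume filter) -> speakers.
// A volume of exactly 1.0 leaves the filter out entirely.
std::string playbackPipeline(const std::string& wavPath, double volume = 1.0) {
    std::ostringstream desc;
    desc << "filesrc location=" << quoteValue(wavPath) << " ! wavparse ! ";
    if (volume != 1.0) {
        desc << "audioconvert ! volume volume=" << volume << " ! ";
    }
    desc << "autoaudiosink";
    return desc.str();
}
```

Building the description as a string keeps the filter chain easy to change while prototyping; once the pipeline is settled, the same elements can be created and linked explicitly in code.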

Also of interest: "PocketSphinx" (which seems to be related to your project) already has some GStreamer integration. See Using PocketSphinx with GStreamer and Python.

Brent Bradburn
  • I haven't used Python for this application; it's mainly code in C++, as it's the language I know best, but you've been a tremendous help and I'll reply with my status. – jackyalcine Feb 08 '11 at 19:17
  • Also, I'm attempting to implement a super simple way of adapting voice models for PocketSphinx/CMUSphinx. – jackyalcine Feb 09 '11 at 04:48

GStreamer/Pulse/JACK are great. For simple and fast things you might use SoX: http://sox.sourceforge.net/

Daniel Voina
  • Do they have an API? I can't use the command line; I need to be able to control volume as well. – jackyalcine Feb 28 '11 at 21:55
  • @Jacky Alcine: libsox is a library of sound sample file format readers/writers and sound effects processors. It is mainly developed for use by SoX but is useful for any sound application. Yes, SoX can be used as a lib too; just look in the archive for details. – Daniel Voina Mar 01 '11 at 15:17