I want to send text from my client (Angular v12) to the backend through a REST API and get audio back, then play it in the client with new Audio(...) when the user clicks.

My backend looks like this:

const express = require("express");
const cors = require("cors");
const textToSpeech = require('@google-cloud/text-to-speech');
const stream = require("stream");
const app = express();

app.get('/api/tts', async (req, res) => {
  const txt = req.query.txt
  console.log('txt', txt);
  const client = new textToSpeech.TextToSpeechClient();
  const request = {
    input: {text: txt},
    voice: {languageCode: 'en-US', ssmlGender: 'NEUTRAL'},
    audioConfig: {audioEncoding: 'MP3'},
  };
  const [response] = await client.synthesizeSpeech(request);
  const readStream = new stream.PassThrough();

  readStream.end(response.audioContent);
  res.set("Content-disposition", 'attachment; filename=' + 'audio.mp3');
  res.set("Content-Type", "audio/mpeg");

  readStream.pipe(res);
})

Now in my client I just created a button to test, and on click I send an HTTP request like so:

  public textToSpeech(txt: string) {
    let httpParams: HttpParams = new HttpParams()
      .set('txt', txt)
    return this.http.get('//localhost:3030/api/tts', { params: httpParams, responseType: 'text' })

  }

I do get a 200 OK code and a long string as a response.

In my component:

  onButtonClick() {
    this.speechService.textToSpeech('testing')
      .subscribe(res => {
        this.audio = new Audio(res)
        this.audio.play()
      })
  }

but I get the following errors:

GET http://localhost:4200/��D�

Uncaught (in promise) DOMException: The media resource indicated by the src attribute or assigned media provider object was not suitable.
Dominik
Yoni Segev

1 Answer

Okay, so I solved it with a different approach. On the backend, I use fs to write the audio to an MP3 file in the public folder, and on the frontend I set the link to that file as the audio source, like so:

Backend:

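const fs = require('fs');
const util = require('util');
const writeFile = util.promisify(fs.writeFile);

app.get('/api/tts', async (req, res) => {
  const {text} = req.query;
  const client = new textToSpeech.TextToSpeechClient();
  const request = {
    input: {text},
    voice: {languageCode: 'en-US', ssmlGender: 'FEMALE'},
    audioConfig: {audioEncoding: 'MP3'},
  };
  const [response] = await client.synthesizeSpeech(request);
  // Save the synthesized audio under the public folder so the client can load it by URL.
  // (Assumes ./public is exposed as static files, e.g. via express.static.)
  await writeFile(`./public/audio/${text}.mp3`, response.audioContent, 'binary');
  // The response body is empty; the request only triggers creation of the file.
  res.end();
});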

Frontend:

  onButtonClick() {
    this.speechService.textToSpeech('hello')
      .subscribe(() => {
        // The response body is not used; once the request completes, the file exists on the server.
        this.audio = new Audio(`//localhost:3030/audio/hello.mp3`);
        this.audio.play();
      });
  }

The file name is hardcoded right now because I just wanted to test; I'm going to make it dynamic.
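Making the file name dynamic could look roughly like the sketch below. The helper name audioFileUrl and the constant AUDIO_BASE_URL are made up for illustration; the sketch assumes the backend keeps writing files as `<text>.mp3` under the audio folder, and it URL-encodes the text so that spaces and other special characters survive in the link.

```typescript
// Hypothetical base URL, taken from the hardcoded link in the answer.
const AUDIO_BASE_URL = '//localhost:3030/audio';

// Hypothetical helper: build the link to the MP3 that the backend
// wrote for the given text. encodeURIComponent escapes spaces, slashes,
// and other characters that are not safe inside a URL path segment.
function audioFileUrl(text: string): string {
  return `${AUDIO_BASE_URL}/${encodeURIComponent(text)}.mp3`;
}
```

In the component, the subscribe callback would then call new Audio(audioFileUrl(text)) with the same text that was sent to the backend. Text containing characters such as "/" would still need handling on the backend side, since it is used directly as a file name there.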

I don't know if this is the best approach, but I got it to work the way I wanted.
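A possible alternative that avoids writing files, sketched under some assumptions: the original attempt likely failed because responseType: 'text' turns the MP3 bytes into a string, and new Audio(...) treats its argument as a URL (hence the GET to http://localhost:4200/ followed by garbage). If the service instead requests responseType: 'arraybuffer' from the streaming endpoint in the question, the bytes can be wrapped in a Blob and played through an object URL. The helper name mp3BytesToObjectUrl is hypothetical.

```typescript
// Hypothetical helper: wrap raw MP3 bytes in a Blob and return an
// object URL ("blob:...") that can be passed to `new Audio(url)`.
function mp3BytesToObjectUrl(bytes: ArrayBuffer): string {
  // The MIME type matches the Content-Type the backend in the question sends.
  const blob = new Blob([bytes], { type: 'audio/mpeg' });
  return URL.createObjectURL(blob);
}
```

In the subscribe callback this would be used as new Audio(mp3BytesToObjectUrl(res)). Calling URL.revokeObjectURL(url) once the audio is no longer needed releases the memory held by the Blob.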

Yoni Segev
What if we need to go the other way without an MP3 file, i.e. convert voice to text, so that typing starts in the input box as the user speaks? – Deep Kakkar Jul 29 '22 at 08:55