How to set up streamingRecognize in Google Cloud Speech-to-Text V2 with Node.js?

Question

I am trying to set up streamingRecognize() from Google Cloud Speech-to-Text V2 in Node.js to stream audio data, but the initial recognizer request that sets up the stream always fails with the same error:

Error: 3 INVALID_ARGUMENT: Invalid resource field value in the request.
    at callErrorFromStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/call.ts:81:17)
    at Object.onReceiveStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client.ts:701:51)
    at Object.onReceiveStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client-interceptors.ts:416:48)
    at /Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/resolving-call.ts:111:24
    at processTicksAndRejections (node:internal/process/task_queues:77:11)
for call at
    at ServiceClientImpl.makeBidiStreamRequest (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client.ts:685:42)
    at ServiceClientImpl.<anonymous> (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/make-client.ts:189:15)
    at /Users/<filtered>/backend/node_modules/@google-cloud/speech/build/src/v2/speech_client.js:318:29
    at /Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streamingApiCaller.ts:71:19
    at /Users/<filtered>/backend/node_modules/google-gax/src/normalCalls/timeout.ts:54:13
    at StreamProxy.setStream (/Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streaming.ts:204:20)
    at StreamingApiCaller.call (/Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streamingApiCaller.ts:88:12)
    at /Users/<filtered>/backend/node_modules/google-gax/src/createApiCall.ts:118:26
    at processTicksAndRejections (node:internal/process/task_queues:95:5)

{
  code: 3,
  details: 'Invalid resource field value in the request.',
  metadata: Metadata {
    internalRepr: Map(2) {
      'google.rpc.errorinfo-bin' => [Array],
      'grpc-status-details-bin' => [Array]
    },
    options: {}
  },
  statusDetails: [
    ErrorInfo {
      metadata: [Object],
      reason: 'RESOURCE_PROJECT_INVALID',
      domain: 'googleapis.com'
    }
  ],
  reason: 'RESOURCE_PROJECT_INVALID',
  domain: 'googleapis.com',
  errorInfoMetadata: {
    service: 'speech.googleapis.com',
    method: 'google.cloud.speech.v2.Speech.StreamingRecognize'
  }
}

Setting up the stream takes two steps. First, send a recognizer request that tells Google which recognizer to use for the audio that follows; it consists of the path to the recognizer as a string and an optional config object that overrides certain options of the recognizer. Second, send the same kind of request without the config but with an audio Buffer containing the bytes to be transcribed.
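As a minimal sketch of the two messages described above, the following builds them as plain objects without calling the API. The helper names (recognizerPath, buildConfigRequest, buildAudioRequest) and the project ID "my-project" are hypothetical. The camelCase field names (streamingConfig, autoDecodingConfig, languageCodes) are an assumption about how the Node.js client maps the proto fields, and the audio message carrying only the audio field is my reading of the API reference, not something confirmed here. It does not by itself explain the RESOURCE_PROJECT_INVALID error.

```typescript
// Sketch only: plain objects shaped like the V2 StreamingRecognizeRequest.
// All helper names below are hypothetical and not part of @google-cloud/speech.

interface ConfigRequest {
  recognizer: string;
  streamingConfig: {
    config: {
      autoDecodingConfig: Record<string, never>;
      languageCodes: string[];
      model: string;
    };
  };
}

interface AudioRequest {
  audio: Uint8Array;
}

// Builds the resource name in the same format used elsewhere in this question.
function recognizerPath(projectId: string, location: string, recognizerId: string): string {
  return `projects/${projectId}/locations/${location}/recognizers/${recognizerId}`;
}

// Step 1: the first message names the recognizer and carries the config, no audio.
function buildConfigRequest(projectId: string, recognizerId: string = "_"): ConfigRequest {
  return {
    recognizer: recognizerPath(projectId, "global", recognizerId),
    streamingConfig: {
      config: {
        autoDecodingConfig: {}, // let the service detect the audio encoding
        languageCodes: ["en-US"], // same values as the createRecognizer call in this question
        model: "telephony",
      },
    },
  };
}

// Step 2: every later message carries a chunk of audio bytes and no config.
function buildAudioRequest(chunk: Uint8Array): AudioRequest {
  return { audio: chunk };
}

const first = buildConfigRequest("my-project");
const second = buildAudioRequest(new Uint8Array([0, 1, 2, 3]));
console.log(first.recognizer);
console.log(second.audio.length);
```

With a real client, these objects would be written to the stream in order (first, then one audio message per chunk), followed by ending the stream.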

I did not get to sending the audio data since the initial recognizer request always failed.

It would be great if someone could help me with this, since it seems to be a rather simple issue that is probably obvious once you know where it originates.

My guesses as to where I made a mistake:

I misconfigured something in Google Cloud. This does not seem very plausible, since everything except the streaming requests worked.
I built the request object incorrectly. If this is the case, please also provide the request object for sending the audio buffer.

I have read through the Google Cloud Speech to Text V2 docs and tried to implement everything as described. In the end it should return transcribed audio.

Set up a recognizer in the Google Cloud console.
Checked that all necessary APIs were enabled.
Checked that the service account has the correct permissions for authentication.
Checked that authentication works correctly.

I also tried several times to implement streamingRecognize() as follows and with some slight variations:

public async initialize() {
    
    const recognizerName = `projects/${this.projectId}/locations/global/recognizers/_`;
    const transcriptionRequest = {
      recognizer: recognizerName,
      streaming_config: streamingConfig,
    };

    const stream = this.client
      .streamingRecognize()
      .on("data", function (response) {
        console.log(response);
      })
      .on("error", function (response) {
        console.log(response);
      });

    // Write request objects.
    stream.write(transcriptionRequest);
  }

I have also tried several recognizer IDs instead of "_" in recognizerName, and several variants of transcriptionRequest in which I omitted streaming_config or renamed it to just "config". I have triple-checked my projectId and also swapped the project ID for the project number (found on the main page of the Google Cloud console). Nothing worked; I always receive the same error.

Besides that, I have also made a normal createRecognizer request and a normal recognize request using V2, as shown below, and both worked fine:

 // Creates a Recognizer: WORKS
  public async createRecognizer() {
    const recognizerRequest = {
      parent: `projects/${this.projectId}/locations/global`,
      recognizerId: "rclatest",
      recognizer: {
        languageCodes: ["en-US"],
        model: "telephony",
      },
    };

    // createRecognizer returns a long-running operation; wait for it to finish.
    const [operation] = await this.client.createRecognizer(recognizerRequest);
    const [recognizer] = await operation.promise();
    console.log(`Created new recognizer: ${recognizer.name}`);
  }

  // Transcribes Audio: WORKS
  public async transcribeFile() {
    const recognizerName = `projects/${this.projectId}/locations/global/recognizers/${this.recognizerId}`;
    const content = fs.readFileSync(this.audioFilePath).toString("base64");
    const transcriptionRequest = {
      recognizer: recognizerName,
      config: {
        // Automatically detects audio encoding
        autoDecodingConfig: {},
      },
      content: content,
    };

    const response = await this.client.recognize(transcriptionRequest);
    for (const result of response[0].results) {
      console.log(`Transcript: ${result.alternatives[0].transcript}`);
    }
  }
