
I want to use Google's real-time speech recognition API in a Flutter project, written in Dart. I've activated a gcloud account, created the API key (which should be the only necessary authentication method for Google Speech) and written a basic app which ought to send an audio stream to Google Cloud and display the response. I imported the googleapis/speech and googleapis_auth plugins.

But I couldn't figure out how to set it up. They say you have to use gRPC, which makes sense as it should make streaming easy, but the plugin's implementation on GitHub doesn't seem to use it.

So can anyone tell me how to use it - setting up the authentication and transcribing speech?

anarchy

2 Answers


Update:

Here's a working sample:

https://gist.github.com/DazWilkin/34d628b998b4266be818ffb3efd688aa

You need only plug in the values from a service account key.json and you should receive:

{
    alternatives: [{
        confidence: 0.9835046,
        transcript: how old is the Brooklyn Bridge
    }]
}

It is poorly documented :-(

I'm familiar with Google API development but unfamiliar with Dart and with the Google Speech-to-Text API so, apologies in advance.

See: https://github.com/dart-lang/googleapis/tree/master/generated/googleapis

There are two flavors of Google SDK|library: the more common (API Client Libraries) and the newer (Cloud [!] Client Libraries). IIUC, for Dart for Speech you're going to use the API Client Library, and this doesn't use gRPC.
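For a Flutter project, that means taking a dependency on both packages in pubspec.yaml (the version constraints below are deliberately left open — check pub.dev for current ones):

```yaml
dependencies:
  # Generated API Client Library for Speech (REST, no gRPC).
  googleapis: any
  # OAuth2 / service-account helpers used below.
  googleapis_auth: any
```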

I'm going to tweak the sample by gut, so bear with me:

import 'package:googleapis/speech/v1.dart';
import 'package:googleapis_auth/auth_io.dart';

final _credentials = new ServiceAccountCredentials.fromJson(r'''
{
  "private_key_id": ...,
  "private_key": ...,
  "client_email": ...,
  "client_id": ...,
  "type": "service_account"
}
''');

const _SCOPES = const [SpeechApi.CloudPlatformScope];

void main() {
  clientViaServiceAccount(_credentials, _SCOPES).then((http_client) {
    var speech = new SpeechApi(http_client);
    // Untested guess at the generated surface: build a RecognizeRequest
    // pointing at Google's public sample file and call recognize.
    var request = new RecognizeRequest()
      ..config = (new RecognitionConfig()
        ..encoding = 'FLAC'
        ..sampleRateHertz = 16000
        ..languageCode = 'en-US')
      ..audio = (new RecognitionAudio()
        ..uri = 'gs://cloud-samples-data/speech/brooklyn.flac');
    speech.speech
        .recognize(request)
        .then((response) => print(response.toJson()));
  });
}

This requires the creation of a service account with appropriate permissions and a (JSON) key generated for it. Generally, the key file is loaded by the code but, in this example, it's provided as a string literal. The key will provide the content for fromJson. You ought (!) to be able to use Application Default Credentials for testing (easier); see the link below.
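If you haven't created that service account yet, it can be done with the gcloud CLI; the project ID and account name below are placeholders:

```shell
# Placeholders -- substitute your own project ID and account name.
PROJECT="your-project-id"
ACCOUNT="speech-demo"

# Enable the Speech-to-Text API for the project.
gcloud services enable speech.googleapis.com --project "${PROJECT}"

# Create the service account.
gcloud iam service-accounts create "${ACCOUNT}" --project "${PROJECT}"

# Generate a JSON key; its fields are what fromJson consumes.
gcloud iam service-accounts keys create key.json \
  --iam-account "${ACCOUNT}@${PROJECT}.iam.gserviceaccount.com"
```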

Somehow (!) the Dart API will include a method|function that makes this underlying REST call. The call expects some configuration and the audio:

https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize

I suspect it's this recognize method, and that it expects a RecognizeRequest.
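You can sanity-check that REST endpoint outside Dart with curl; YOUR_API_KEY is a placeholder, and the audio URI is Google's public sample file:

```shell
# YOUR_API_KEY is a placeholder for a real API key.
curl -s -X POST \
  "https://speech.googleapis.com/v1/speech:recognize?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "config": {
          "encoding": "FLAC",
          "sampleRateHertz": 16000,
          "languageCode": "en-US"
        },
        "audio": {
          "uri": "gs://cloud-samples-data/speech/brooklyn.flac"
        }
      }'
```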

Sorry I can't be of more help.

If you do get it working, please consider publishing the same so others may benefit.

NB

DazWilkin
  • Okay, if they are not using gRPC for it then it makes a lot more sense. Thank you so far for your effort and quick answer! I will play around with it and if I get it to work I will post some working code here – anarchy Apr 04 '19 at 13:11
  • It seems to me Google is just not providing streaming support in that package... I wonder why, but there is no reference to _streaming_ in the plugin. – anarchy Apr 05 '19 at 08:44
  • IIUC streaming requires gRPC (as you mentioned) and this (API Client) Library uses REST which doesn't support it. Google needs to publish a Library for this service for Dart using gRPC|http2. I'll ask the Dart team. – DazWilkin Apr 05 '19 at 14:24

For all who are still interested in the topic: I have released a Flutter package that supports Google's Speech-to-Text API via gRPC. This also allows the use of streamingRecognize.

You can find it here: https://pub.dev/packages/google_speech
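For what it's worth, a minimal streaming sketch based on the package's README (the key path, WAV file, and audio parameters are assumptions you'd replace with your own):

```dart
import 'dart:io';

import 'package:google_speech/google_speech.dart';

Future<void> main() async {
  // Service-account JSON from the Cloud console (path is a placeholder).
  final serviceAccount = ServiceAccount.fromString(
      File('assets/key.json').readAsStringSync());
  final speechToText = SpeechToText.viaServiceAccount(serviceAccount);

  final config = RecognitionConfig(
      encoding: AudioEncoding.LINEAR16,
      model: RecognitionModel.basic,
      enableAutomaticPunctuation: true,
      sampleRateHertz: 16000,
      languageCode: 'en-US');

  // Any Stream<List<int>> of raw audio works here, e.g. from mic_stream.
  final audioStream = File('assets/test.wav').openRead();

  final responseStream = speechToText.streamingRecognize(
      StreamingRecognitionConfig(config: config, interimResults: true),
      audioStream);

  responseStream.listen((data) {
    print(data.results
        .map((result) => result.alternatives.first.transcript)
        .join());
  });
}
```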

Felix Junghans
  • It's great, I looked at your package. How can I use the microphone for real-time transcription? In my application, people make video calls and I am trying to show subtitles to users in real time. I am using the Agora SDK for the video calls. Is it possible to add live transcription to my application? Thank you – Omer Ciftci May 18 '20 at 00:42
  • Sorry for the late response. I haven't worked with direct microphone input myself, but it should be possible. Using a Flutter PlatformChannel you could send the microphone output as UInt8List to Flutter and then send it to Google via streamingRecognize. Without having a closer look at the package, I think that e.g. https://pub.dev/packages/mic_stream could solve this problem. – Felix Junghans Jun 05 '20 at 13:08
  • This is almost exactly what I needed, but is there any way to auto-recognise the language, instead of giving a fixed language at the beginning? – Nithin Sai Jul 07 '20 at 05:32
  • @FelixJunghans I tried this solution but it's not working. Could you look at this question and give a hint? https://stackoverflow.com/questions/73051997/flutter-google-speach-response-stream-not-working – ialyzaafan Jul 21 '22 at 03:30