
Do Google Chrome extensions support Chrome's Web Speech speech recognition API? I have included some JavaScript to create a speech recognition object, but when I launch my extension, I am not prompted for microphone access.

This is not an issue with my code. I have searched on Google, but I can't find any information on whether Chrome extensions support the Web Speech API. I just want a yes/no answer.

Note: I believe the Web Speech API won't work for local files.

Kara
Michael Zhao

2 Answers


Chrome extensions can already use the Web Speech API, even in the background page and in extension button popups. That this works is not necessarily an intended feature; I have previously explained how and why it works in my answer to "How to use webrtc insde google chrome extension?". That explanation is about WebRTC, but it applies equally to Web Speech. The approach is as follows:

  1. Instantiate a webkitSpeechRecognition instance and start recording.
  2. If a permission error is detected (onerror being triggered with event.error === 'not-allowed'), open an extension page (chrome-extension://[ID]/yourpage.html). This extension page can be opened in a new window, tab or iframe.
  3. From this page, request access to the microphone. getUserMedia and SpeechRecognition both share the (persistent) audio permission, so to detect whether audio recording is allowed, you could use getUserMedia to request the permission without activating speech recognition. For instance:

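    // Ask for the microphone only; no speech recognition is started here.
    navigator.mediaDevices.getUserMedia({
        audio: true
    }).then(function(stream) {
        // MediaStream.stop() is gone from current Chrome; stop each track to release the microphone.
        stream.getTracks().forEach(function(track) { track.stop(); });
        // Now you know that you have audio permission. Do whatever you want...
    }).catch(function() {
        // Aw. No permission (or no microphone available).
    });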
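
Steps 1 and 2 could be sketched roughly as follows. This is only an illustration, not code from the linked answer: the helper name `startRecognitionWithFallback`, its `deps` parameter and the page name `permission.html` are assumptions made for the example. The browser APIs are passed in as arguments so that the logic can be tried outside an extension as well.

```javascript
// Hypothetical helper (name, `deps` parameter and 'permission.html' are assumptions).
// deps.SpeechRecognition: constructor, e.g. webkitSpeechRecognition
// deps.openPermissionPage: function that opens an extension page by file name
function startRecognitionWithFallback(deps) {
    var recognition = new deps.SpeechRecognition();
    recognition.onerror = function (event) {
        if (event.error === 'not-allowed') {
            // Step 2: no microphone permission yet, so open an extension
            // page from which the permission can be requested.
            deps.openPermissionPage('permission.html');
        }
    };
    // Step 1: start recording.
    recognition.start();
    return recognition;
}

// Inside an extension, the real APIs might be wired in like this.
// The guard keeps the snippet harmless where those APIs do not exist.
if (typeof webkitSpeechRecognition !== 'undefined' &&
        typeof chrome !== 'undefined' && chrome.tabs) {
    startRecognitionWithFallback({
        SpeechRecognition: webkitSpeechRecognition,
        openPermissionPage: function (page) {
            chrome.tabs.create({ url: chrome.runtime.getURL(page) });
        }
    });
}
```

Here the page is opened in a new tab; as step 2 says, a new window or an iframe would work too. The opened page would then run the `getUserMedia` request from step 3.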
    
Rob W

Update: Based on Rob W's answer, this answer is now out of date, and the Web Speech API is now usable inside extensions. (Unfortunately, I can't delete this answer unless the OP un-accepts it.)


The answer is: not yet. Pages accessed through chrome-extension: URLs cannot access any media-input APIs, including speechRecognition and getUserMedia. Any attempt to use these APIs will immediately trigger an error callback.

I originally thought speechRecognition could work like the geolocation API: extension popups cannot prompt for geolocation permission, but chrome-extension: pages loaded as full browser pages can prompt for permission just like a normal page. However, media APIs do not behave this way; they fail regardless of whether the page is a popup or a full page.

There is a bug report to fix this and allow developers to specify media-access permissions in the manifest. Once this bug is fixed, extensions will be able to declare a manifest permission that grants them automatic microphone/video access, so the inability to prompt for permission will become a non-issue (and extensions with the appropriate manifest permissions will be able to use the Speech API freely).

apsillers
  • Thanks for the response, but I'm looking specifically for information on Chrome's native Web Speech API. If I don't get any other responses, then I'll accept yours! – Michael Zhao Jul 30 '13 at 20:18
  • @MichaelZhao Could you clarify in what ways my answer could be more specific about Chrome's speech API? That's exactly what my answer talks about (i.e., the [`webkitSpeechRecognition` API](http://updates.html5rocks.com/2013/01/Voice-Driven-Web-Apps-Introduction-to-the-Web-Speech-API); specification [here](https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html)). Is there some other aspect of the API you'd like me to address? Or have I misunderstood exactly which API you mean? – apsillers Jul 30 '13 at 20:33
  • I'm under the impression that the bug report refers to another JS speech API. Extremely helpful nevertheless. – Michael Zhao Jul 31 '13 at 08:21
  • @MichaelZhao For confirmation: 1) There's [another bug report](https://code.google.com/p/chromium/issues/detail?id=171311), marked "Duplicate" of the one linked in my answer, that expressly refers to the "Web Speech API" and links to this thread whose OP is trying to use `webkitSpeechRecognition` in his code. 2) The bug I link to has "[Implement Web Speech API](https://code.google.com/p/chromium/issues/detail?id=116954)" listed as a bug that it is blocking; that blocked bug links to the [`SpeechRecognition` spec](http://www.w3.org/2005/Incubator/htmlspeech/2010/10/google-api-draft.html). – apsillers Jul 31 '13 at 13:23
  • @apsillers I'm trying to confirm the part regarding the `chrome-extension://`. I have it loaded in there and *everything* works up until the line `recognition.start()`, at which point I am not prompted for mic access. Any thoughts? – temporary_user_name Aug 29 '13 at 07:10
  • @Aerovistae While my explanation holds true for the `geolocation` API (a `chrome-extension://` page presented standalone can prompt, but as popup it cannot), it seems that I was wrong about this for speech APIs. Both `getUserMedia` and `speechRecognition` immediately fire error callbacks from extension URLs, even though the UI is capable of showing permission prompts in general (as observed with geolocation). Sorry for the misinformation; I'm updating my answer now. – apsillers Aug 29 '13 at 14:58
  • @MichaelZhao If possible, please un-accept this answer, as I think it is now incorrect. – apsillers Jun 15 '15 at 13:47