Microsoft Translator Speech missing punctuation

Question

I am using the Microsoft Translator Speech WebSocket API for real-time speech recognition and translation. The problem is that the recognised text sometimes has no punctuation (commas, full stops, etc.). The transcribed text looks good otherwise. I also receive an MP3 with the synthesised translation.

It looks completely random: I can send the same audio multiple times, and some responses have punctuation while others do not. I am sending the audio in the correct format and at a near real-time rate, e.g. I send 100 ms chunks every ~100 ms. The recognised language is Spanish.

Is this a common issue or is there some other catch?
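
For reference, the pacing described above can be sketched roughly as follows. This is only an illustration, not the actual client code: it assumes 16 kHz, 16-bit, mono PCM (so 100 ms is 3200 bytes), and `chunk_pcm`, `send_paced` and the `send` callback are hypothetical names standing in for whatever WebSocket client is used.

```python
import time
from typing import Callable, Iterator


def chunk_pcm(pcm: bytes, sample_rate: int = 16000, sample_width: int = 2,
              channels: int = 1, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive chunks holding chunk_ms of audio each (the last may be shorter)."""
    # Assumed format: 16 kHz * 2 bytes * 1 channel * 0.1 s = 3200 bytes per chunk.
    bytes_per_chunk = sample_rate * sample_width * channels * chunk_ms // 1000
    for start in range(0, len(pcm), bytes_per_chunk):
        yield pcm[start:start + bytes_per_chunk]


def send_paced(pcm: bytes, send: Callable[[bytes], None], chunk_ms: int = 100,
               sleep: Callable[[float], None] = time.sleep) -> int:
    """Pass each chunk to send() at roughly real-time rate and return the chunk count."""
    start = time.monotonic()
    count = 0
    for count, chunk in enumerate(chunk_pcm(pcm, chunk_ms=chunk_ms), start=1):
        send(chunk)  # hypothetical callback, e.g. a WebSocket binary send
        # Schedule against the stream start so small delays do not accumulate.
        delay = start + count * chunk_ms / 1000 - time.monotonic()
        if delay > 0:
            sleep(delay)
    return count
```

Scheduling each chunk against the stream start, rather than sleeping a fixed 100 ms after every send, keeps the overall rate close to real time even when individual sends are slow.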

score 1 · Accepted Answer · answered Sep 24 '18 at 12:09

Switching to the Speech Preview API solved the missing punctuation. For now there are only SDKs, and the raw WebSocket API is not yet documented. I have managed to connect to and use the WebSocket API; more info is in another SO question.

shelll

score 0 · Answer 2 · answered Sep 21 '18 at 21:34

There are different response types for partial recognitions and the final recognition. You receive partial recognitions as the speech continues to come in, and one final recognition at the end of the utterance. The partial results may be missing punctuation and casing; the final one will have both. If you want to ignore the responses without casing and punctuation, filter the responses so that you only see the final ones.
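
A minimal sketch of such a filter, assuming each text message is a JSON object whose `type` field is either `"partial"` or `"final"` and whose transcript is in a `recognition` field. Those field names are an assumption here and should be checked against the actual responses; `final_results` is a hypothetical helper name.

```python
import json
from typing import Iterable, List, Union


def final_results(messages: Iterable[Union[str, bytes]]) -> List[dict]:
    """Return only the messages that parse as JSON objects marked as final."""
    finals = []
    for raw in messages:
        try:
            msg = json.loads(raw)
        except (TypeError, ValueError):
            # Not JSON (for example a binary audio frame): skip it.
            continue
        # Assumed schema: {"type": "partial" | "final", "recognition": "..."}
        if isinstance(msg, dict) and msg.get("type") == "final":
            finals.append(msg)
    return finals
```

Used on a stream of received messages, this drops the partial hypotheses and any non-JSON frames, leaving only the final recognitions to inspect for punctuation.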

Chris Wendt

I am receiving final recognitions without punctuation. – shelll Sep 24 '18 at 06:51
