Google Cloud Speech-to-Text and Amazon Transcribe both offer punctuation and word timestamps. Can I get punctuation timestamps? Specifically, I want timestamps for sentence breaks (periods, question marks, exclamation points), e.g., at 0:33 seconds, 1:01, 1:23, 1:49, 2:05, etc.

I suppose I could use Google or AWS to transcribe a file with punctuation, split the transcript into sentences, and then request word timestamps for each sentence. It would be easier (and take about 1/500 of the compute time for a file with 500 sentences) if I could just set a parameter such as getPunctuationTimestamps.
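One workaround that avoids re-running recognition per sentence is to take the word timestamps from a single punctuated transcription and keep the end time of every word that ends a sentence. The sketch below is a minimal illustration, not either vendor's API: the `Word` tuple, `sentence_break_times`, and `format_mss` are hypothetical names, and it assumes each word already carries its trailing punctuation along with start and end times in seconds. If a service reports punctuation as separate items without times, each mark would first need to be attached to the preceding word.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """Hypothetical container for one recognized word (times in seconds)."""
    text: str
    start: float
    end: float


SENTENCE_END = (".", "?", "!")
CLOSERS = "\"')]”’"  # closing quotes/brackets that may follow the punctuation


def sentence_break_times(words: List[Word]) -> List[float]:
    """Return the end time of every word that closes a sentence."""
    return [
        w.end
        for w in words
        if w.text.rstrip(CLOSERS).endswith(SENTENCE_END)
    ]


def format_mss(seconds: float) -> str:
    """Format seconds as m:ss, truncating fractions of a second."""
    whole = int(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


if __name__ == "__main__":
    words = [
        Word("Hello", 0.0, 0.4),
        Word("world.", 0.4, 0.9),
        Word("How", 1.2, 1.4),
        Word("are", 1.4, 1.5),
        Word("you?", 1.5, 1.9),
    ]
    for t in sentence_break_times(words):
        print(format_mss(t), t)
```

This simple check also treats abbreviations such as "Dr." as sentence ends, so a real pipeline would likely need an exception list or a proper sentence splitter.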

IBM Watson Speech-to-Text offers keyword spotting but not punctuation.

Thomas David Kehoe

1 Answer

I tried IBM Watson Speech-to-Text's keyword spotting on a 48-minute episode of Radio Ambulante, a high-quality NPR podcast that has an official transcript. I selected one sentence from the transcript as the "keyword" to spot. Watson took about 48 minutes to transcribe the 48-minute episode. Accuracy was better than 90% for the host and better than 80% for the interviewees. The problem is that this approach needs 100% accuracy to work. For example, a Cuban doctor says "Yo me consideraba, no comunista" ("I didn't consider myself to be a communist"), but Watson heard "consideraba común esto". Watson never found the target sentence.

Thomas David Kehoe