I am using the following Gradio sample code to transcribe my audio:

from transformers import pipeline
p = pipeline("automatic-speech-recognition")

import gradio as gr

def transcribe(audio):
    text = p(audio)["text"]
    return text

gr.Interface(
    fn=transcribe, 
    inputs=gr.Audio(source="microphone", type="filepath"), 
    outputs="text").launch()

However, the user has to start recording, stop recording, and then submit the audio. Can I submit the audio automatically when the user stops recording?

Zing

2 Answers

I found the solution. I am posting it here for others' reference.

import gradio as gr

from transformers import pipeline

p = pipeline("automatic-speech-recognition")

def transcribe(audio):
    text = p(audio)["text"]
    return text

gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(source="microphone", type="filepath"),
    outputs="text",
    live=True,  # re-run transcribe automatically whenever the audio input changes
).launch()

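Adding live=True to gr.Interface serves the purpose: the interface re-runs transcribe automatically whenever the input changes, so the transcription starts as soon as the user stops recording, without clicking Submit.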
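
One thing to watch for with live=True: the function may also be triggered when the recording is cleared, in which case the audio argument would be None rather than a file path. The following is a minimal sketch of a guard for that case, not a definitive implementation. The helper names make_transcriber and build_interface are hypothetical (they are not part of Gradio or transformers), and the gr.Audio arguments are copied unchanged from the code above.

```python
def make_transcriber(recognizer):
    """Wrap a speech-recognition callable so that it tolerates an empty input.

    `recognizer` is assumed to behave like the transformers pipeline above:
    it takes a file path and returns a dict with a "text" key.
    """
    def transcribe(audio):
        # Assumption: with live=True the function can be called with None
        # (for example after the recording is cleared), so return an empty
        # string instead of passing None to the recognizer.
        if audio is None:
            return ""
        return recognizer(audio)["text"]

    return transcribe


def build_interface(recognizer):
    """Build the same live interface as above, using the guarded function."""
    # Imported here so make_transcriber can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=make_transcriber(recognizer),
        inputs=gr.Audio(source="microphone", type="filepath"),
        outputs="text",
        live=True,
    )
```

To use it, pass the pipeline in and launch as before: build_interface(pipeline("automatic-speech-recognition")).launch(). Passing the recognizer as an argument also makes it easy to try the guard with a stand-in function before loading the real model.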

Zing

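You could try an auto-submit option. Something like this should work, provided your Gradio version accepts the auto_submit and auto_submit_duration arguments, which is worth checking against the gr.Interface documentation first:

# Auto-submit after 5 seconds; assumes transcribe() is defined as in the question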
gr.Interface(
    fn=transcribe,
    inputs=gr.Audio(source="microphone", type="filepath"),
    outputs="text",
    auto_submit=True,
    auto_submit_duration=5).launch()
twister_void