I am trying to use Direct Line Speech (DLS) in my custom voice app. The voice app has access to real-time, PCM-encoded audio streams, which I want to send directly to Direct Line Speech so that it allows back-and-forth communication in real time.

In the DLS client sample code (https://github.com/Azure-Samples/Cognitive-Services-Direct-Line-Speech-Client), I see the method ListenOnceAsync() on the Microsoft.CognitiveServices.Speech.Dialog.DialogServiceConnector class, but it looks like it captures audio from the local microphone.

But judging from the reply here (Is new ms botbuilder directline speech good fit for call center scenario?), it seems I can send the audio stream to DLS directly. I can't find any documentation on this. Can someone shed some light on how to achieve it?

bedtym
  • Are you hoping to send audio from a pre-recorded file? – Kyle Delaney Oct 03 '19 at 22:56
  • No, I am trying to connect it to a streaming real time endpoint – bedtym Oct 05 '19 at 15:16
  • @KyleDelaney - I just came across [this question](https://stackoverflow.com/questions/53209752/speech-recognition-with-microsoft-cognitive-speech-api-and-non-microphone-real-t) which is similar to my problem except that I have a real time endpoint and I need continuous back and forth with the Direct Line Speech client. – bedtym Oct 05 '19 at 15:50
  • So to be clear, you have a web app that has web sockets for both incoming and outgoing audio streams and you're using it to forward audio from some other source to Direct Line Speech? – Kyle Delaney Oct 07 '19 at 21:52
  • yes, that's correct. – bedtym Oct 08 '19 at 06:17

1 Answer

I believe your answer lies in the Microsoft.CognitiveServices.Speech.Audio.AudioConfig class. Have a look at this line in the Direct Line Speech client:

this.connector = new DialogServiceConnector(config, AudioConfig.FromDefaultMicrophoneInput());

AudioConfig provides many options besides FromDefaultMicrophoneInput. I suspect you'll want to use one of the three FromStreamInput overloads. If you do that, ListenOnceAsync will read from your stream instead of the microphone.
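
As a rough illustration of the stream-input route, here is a minimal sketch using a push stream. It assumes the audio is 16 kHz, 16-bit, mono PCM, and that `source` is some Stream your voice app exposes; the class name `StreamInputSketch`, the method `ListenFromStreamAsync`, and the `source` parameter are hypothetical names for illustration, and `config` is the same DialogServiceConfig the sample client already builds. It covers a single listening turn only, not a full continuous conversation.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Dialog;

public static class StreamInputSketch
{
    // `config`: the DialogServiceConfig already used by the sample client.
    // `source`: hypothetical Stream yielding raw PCM bytes from the voice app.
    public static async Task ListenFromStreamAsync(DialogServiceConfig config, Stream source)
    {
        // Assumed format: 16 kHz, 16-bit, mono PCM. Adjust to match the real stream.
        var format = AudioStreamFormat.GetWaveFormatPCM(16000, 16, 1);

        using (var pushStream = AudioInputStream.CreatePushStream(format))
        using (var audioConfig = AudioConfig.FromStreamInput(pushStream))
        using (var connector = new DialogServiceConnector(config, audioConfig))
        {
            // Bot replies arrive as activities (JSON), optionally with audio attached.
            connector.ActivityReceived += (sender, e) =>
                Console.WriteLine($"Activity received (has audio: {e.HasAudio})");

            // Start one listening turn, then feed audio into the push stream.
            var listenTask = connector.ListenOnceAsync();

            var buffer = new byte[3200]; // 100 ms of audio at the assumed format
            int read;
            while (!listenTask.IsCompleted &&
                   (read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                pushStream.Write(buffer, read);
            }

            // Signal end of audio so the service does not wait for more input.
            pushStream.Close();

            var result = await listenTask;
            Console.WriteLine($"Recognized: {result.Text}");
        }
    }
}
```

A push stream suits the case where audio arrives on its own schedule (for example from a web socket); the FromStreamInput overloads that take a PullAudioInputStreamCallback are the alternative when the SDK should request data from your code instead.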

Kyle Delaney
  • Thanks Kyle! I'm trying this out. – bedtym Oct 08 '19 at 06:18
  • @bedtym - Don't forget to upvote and accept the answer if it helps you – Kyle Delaney Oct 08 '19 at 16:32
  • Thanks Kyle, I think this does give me some result, but I can't seem to understand how to hook up Direct Line Speech with continuous 'streaming' real-time media. – bedtym Oct 09 '19 at 00:24
  • I've posted a [new question](https://stackoverflow.com/questions/58295798/how-to-hook-real-time-audio-stream-endpoint-to-direct-line-speech-endpoint) which is a continuation of this question - hope you can help me out. – bedtym Oct 09 '19 at 01:11