Hi
I'm trying to create a "Speech to text" app that can transcribe any audio/video file. I've created an app based on this post and it works great for WAV files. But if I use an MP3 file, the call `hr = cpInputStream->BindToFile(wInputFileName.c_str(), SPFM_OPEN_READONLY, &sInputFormat.FormatId(), sInputFormat.WaveFormatExPtr(), SPFEI_ALL_EVENTS);` fails with:

The Parameter is incorrect

The question is: can I use MP3 files as input for SAPI? If so, how do I determine the correct format for the call to `hr = sInputFormat.AssignFormat(SPSF_16kHz16BitStereo)`? `SPSF_16kHz16BitStereo` will certainly not be correct for every file, and I don't think it should be hardcoded.

Sam
  • You can use [MediaFoundation](https://learn.microsoft.com/en-us/windows/win32/medfound/transcode-sample) to transcode into WAV and then pass the `IStream` from the transcode into the [`ISpStream`](https://learn.microsoft.com/en-us/previous-versions/windows/desktop/ms719492(v=vs.85)). See the [Transcode sample](https://github.com/Microsoft/Windows-classic-samples/tree/main/Samples/Win7Samples/multimedia/mediafoundation/Transcode). – Mgetz Mar 01 '22 at 18:23
  • @Mgetz I've been looking at this code sample and every other sample in that repository, and I still cannot figure out how to transcode into an `ISpStream` or any `IStream` descendant, for that matter. Every example writes the WAV to a file, which I'm not allowed to do in my case. – Sam Mar 08 '22 at 18:30
  • So you may have to create an `IStream` sink on your own; I thought the MF examples included a version of that. You can create a memory-backed `IStream` using [`SHCreateMemStream`](https://learn.microsoft.com/en-us/windows/win32/api/shlwapi/nf-shlwapi-shcreatememstream). – Mgetz Mar 08 '22 at 18:41
  • Or you can get an `IStream` from a Media Foundation byte stream directly after the transcode using [`MFCreateStreamOnMFByteStream`](https://learn.microsoft.com/en-us/windows/win32/api/mfidl/nf-mfidl-mfcreatestreamonmfbytestream) and use [`MFCreateWAVEMediaSink`](https://learn.microsoft.com/en-us/windows/win32/api/mfidl/nf-mfidl-mfcreatewavemediasink). Either way, there are ways to do this. – Mgetz Mar 08 '22 at 18:55
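
On the question of which format to pass instead of a hardcoded `SPSF_16kHz16BitStereo`: once the audio exists as WAV data in memory (for example after the transcode suggested in the comments), the `fmt ` chunk of the RIFF header already states the sample rate, channel count and bit depth. The following is a minimal, portable sketch of reading that chunk from a byte buffer. It does not decode MP3 and does not call SAPI or Media Foundation; `WaveFormat` and `parseWavFormat` are hypothetical names introduced only for illustration, with `WaveFormat` mirroring the leading fields of `WAVEFORMATEX`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the leading fields of WAVEFORMATEX, so the
// sketch compiles without the Windows headers.
struct WaveFormat {
    std::uint16_t formatTag;       // 1 = PCM
    std::uint16_t channels;
    std::uint32_t samplesPerSec;
    std::uint32_t avgBytesPerSec;
    std::uint16_t blockAlign;
    std::uint16_t bitsPerSample;
};

// RIFF stores all integers in little-endian byte order.
static std::uint16_t readLe16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t readLe32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walks the RIFF chunks of an in-memory WAV image and copies the "fmt "
// chunk into `out`. Returns false if the buffer is not RIFF/WAVE, is
// truncated, or has no usable "fmt " chunk.
bool parseWavFormat(const std::vector<std::uint8_t>& data, WaveFormat& out) {
    if (data.size() < 12 ||
        std::memcmp(data.data(), "RIFF", 4) != 0 ||
        std::memcmp(data.data() + 8, "WAVE", 4) != 0) {
        return false;
    }
    std::size_t pos = 12;  // first chunk follows the 12-byte RIFF header
    while (data.size() - pos >= 8) {
        const std::uint8_t* chunk = data.data() + pos;
        const std::uint32_t size = readLe32(chunk + 4);
        const std::size_t remaining = data.size() - pos - 8;
        if (size > remaining) {
            return false;  // chunk claims more bytes than the buffer holds
        }
        if (std::memcmp(chunk, "fmt ", 4) == 0) {
            if (size < 16) {
                return false;  // too small to hold the basic format fields
            }
            const std::uint8_t* f = chunk + 8;
            out = WaveFormat{readLe16(f),     readLe16(f + 2),
                             readLe32(f + 4), readLe32(f + 8),
                             readLe16(f + 12), readLe16(f + 14)};
            return true;
        }
        // Skip this chunk; chunk bodies are padded to an even length.
        const std::size_t advance = 8 + static_cast<std::size_t>(size) + (size & 1u);
        if (advance > data.size() - pos) {
            break;
        }
        pos += advance;
    }
    return false;
}
```

On Windows, the parsed values could then be copied into a `WAVEFORMATEX`; the `sInputFormat.WaveFormatExPtr()` call in the question suggests that is the structure SAPI expects. For an MP3 input this only helps after the data has been transcoded to WAV, since the buffer must start with a RIFF/WAVE header.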
