
I'm trying to play back MP3 (and similar) audio files using WASAPI shared mode and a Media Foundation IMFSourceReader on Windows 7. From what I understand, I have to use an IMFTransform between the IMFSourceReader decoding and the WASAPI playback. Everything seems fine except that the calls to SetInputType()/SetOutputType() on the IMFTransform fail.

The relevant snippets of code are:

MFCreateSourceReaderFromURL(...);   //  Various test mp3 files
...

sourceReader->GetCurrentMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, &reader.audioType);
//sourceReader->GetNativeMediaType(MF_SOURCE_READER_FIRST_AUDIO_STREAM, 0, &reader.audioType);
...

audioClient->GetMixFormat(&player.mixFormat);
...

MFCreateMediaType(&player.audioType);
MFInitMediaTypeFromWaveFormatEx(player.audioType, player.mixFormat, sizeof(WAVEFORMATEX) + player.mixFormat->cbSize);
...




hr = CoCreateInstance(CLSID_CResamplerMediaObject, NULL, CLSCTX_INPROC_SERVER, IID_IUnknown, (void**)&unknown);
ASSERT(SUCCEEDED(hr));

hr = unknown->QueryInterface(IID_PPV_ARGS(&resampler.transform));
ASSERT(SUCCEEDED(hr));
unknown->Release();

hr = resampler.transform->SetInputType(0, inType, 0);
ASSERT(hr != DMO_E_INVALIDSTREAMINDEX);
ASSERT(hr != DMO_E_TYPE_NOT_ACCEPTED);
ASSERT(SUCCEEDED(hr));          //  Fails here with hr = 0xc00d36b4

hr = resampler.transform->SetOutputType(0, outType, 0);
ASSERT(hr != DMO_E_INVALIDSTREAMINDEX);
ASSERT(hr != DMO_E_TYPE_NOT_ACCEPTED);
ASSERT(SUCCEEDED(hr));          //  Fails here with hr = 0xc00d6d60

I suspect I am misunderstanding how to negotiate the input/output IMFMediaTypes between the components, and also how to take into account that the IMFTransform needs to operate on uncompressed data.

It seems odd to me that the output type fails, but maybe that is a knock-on effect of the input type failing first. If I try to set the output type first, it also fails.

Roman R.
iam

1 Answer


In recent versions of Windows you would probably prefer to take advantage of stock functionality which is already there for you.

When you configure the Source Reader object, IMFSourceReader::SetCurrentMediaType lets you specify the media type you want your data in. If you set a media type compatible with the WASAPI requirements, the Source Reader automatically adds a transform to convert the data for you.

However...

Audio resampling support was added to the source reader with Windows 8. In versions of Windows prior to Windows 8, the source reader does not support audio resampling. If you need to resample the audio in versions of Windows earlier than Windows 8, you can use the Audio Resampler DSP.

... which means that you may indeed need to manage the MFT yourself. The input media type for the MFT should come from IMFSourceReader::GetCurrentMediaType. To instruct the Source Reader to deliver uncompressed audio, you need to build the media type that a decoder for this kind of stream would decode to. For example, if your file is MP3, you would read the number of channels and the sampling rate and build a compatible PCM media type (or take the system decoder and ask it separately for its output media type, which is an even cleaner way). You would set this uncompressed audio media type using IMFSourceReader::SetCurrentMediaType. This instructs the Source Reader to add the necessary decoder, so that IMFSourceReader::ReadSample gives you decoded data. The same media type is also your input media type for the audio resampler MFT.
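As a rough illustration of "read the number of channels and the sampling rate and build a compatible PCM media type", the sketch below fills in the derived fields of a PCM wave format. PcmFormat is a stand-in that mirrors the field names of WAVEFORMATEX so that it compiles without the Windows SDK, MakePcmFormat is a hypothetical helper, and the 16-bit default is an assumption about what the decoder produces.

```cpp
#include <cstdint>

// Stand-in for the Windows WAVEFORMATEX struct (same field names and order),
// used here only so the sketch compiles without the Windows SDK.
struct PcmFormat {
    std::uint16_t wFormatTag;       // 1 == WAVE_FORMAT_PCM
    std::uint16_t nChannels;
    std::uint32_t nSamplesPerSec;
    std::uint32_t nAvgBytesPerSec;
    std::uint16_t nBlockAlign;
    std::uint16_t wBitsPerSample;
    std::uint16_t cbSize;           // 0 for plain PCM (no extra bytes follow)
};

// Hypothetical helper: derive a PCM format from the channel count and sample
// rate read from the compressed stream. 16 bits per sample is an assumption.
inline PcmFormat MakePcmFormat(std::uint16_t channels,
                               std::uint32_t sampleRate,
                               std::uint16_t bitsPerSample = 16) {
    PcmFormat f{};
    f.wFormatTag      = 1;  // WAVE_FORMAT_PCM
    f.nChannels       = channels;
    f.nSamplesPerSec  = sampleRate;
    f.wBitsPerSample  = bitsPerSample;
    // One block is one sample for every channel.
    f.nBlockAlign     = static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    f.nAvgBytesPerSec = sampleRate * f.nBlockAlign;
    f.cbSize          = 0;
    return f;
}
```

On Windows you would fill a real WAVEFORMATEX the same way and pass it to MFInitMediaTypeFromWaveFormatEx, as the question already does for the WASAPI mix format.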

The output media type for the resampler MFT is derived from the audio format you obtained from WASAPI, converted using the API calls you mentioned at the top of your code snippet.

To look the error codes up you can use this:
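For the two codes in the question specifically: in mferror.h, 0xc00d36b4 is MF_E_INVALIDMEDIATYPE and 0xc00d6d60 is MF_E_TRANSFORM_TYPE_NOT_SET. The second is consistent with SetOutputType() failing simply because no valid input type had been set. A minimal sketch of taking such a value apart by hand follows. SplitHresult and DescribeMfError are hypothetical helper names, and the lookup covers only these two codes.

```cpp
#include <cstdint>
#include <string>

// Fields of a 32-bit HRESULT in the standard layout:
// bit 31 = severity (1 means failure), bits 16..26 = facility, bits 0..15 = code.
struct HresultParts {
    bool          failed;
    std::uint32_t facility;
    std::uint32_t code;
};

inline HresultParts SplitHresult(std::uint32_t hr) {
    return { (hr >> 31) != 0, (hr >> 16) & 0x7FFu, hr & 0xFFFFu };
}

// Hypothetical helper: names only the two error codes seen in the question.
inline std::string DescribeMfError(std::uint32_t hr) {
    switch (hr) {
    case 0xC00D36B4u: return "MF_E_INVALIDMEDIATYPE";
    case 0xC00D6D60u: return "MF_E_TRANSFORM_TYPE_NOT_SET";
    default:          return "unknown (look it up in mferror.h)";
    }
}
```

Both values decode to facility 13 (FACILITY_MF), which is a quick way to tell that an unfamiliar HRESULT comes from Media Foundation rather than from COM or Win32.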

Also, you should generally be able to play audio files with less effort using the Media Foundation Media Session API. Media Session uses the same primitives to build a playback pipeline and takes care of format fitting.

Ah so are you saying I need to create an additional object that is the decoder to fit between the IMFSourceReader and IMFTransform/Resampler?

No. By calling SetCurrentMediaType with a proper media type, you make the Source Reader add a decoder internally, so that it gives you already decompressed data. Starting with Windows 8 it is also capable of converting between PCM flavors, but in Windows 7 you need to take care of this yourself with the Audio Resampler DSP.

You can manage the decoder yourself, but you don't need to, since the Source Reader's decoder does the same thing more reliably.

You might need a separate decoder just to help you find out what PCM media type the decoder would produce, so that you can request it from the Source Reader. MFTEnumEx is the proper API to look the decoder up.

I am not sure how to decide on or create a suitable decoder object? Do I need to enumerate a list of suitable ones somehow rather than assume specific ones?

The mentioned MFTEnum and MFTEnumEx API calls can enumerate decoders, either all available ones or only those matching given criteria.

Another way is to use a partial media type (see the relevant explanation and code snippet here: Tutorial: Decoding Audio). A partial media type signals the desired format and requests that the Media Foundation API supply a primitive that matches it. See the comments below for related discussion links.

Roman R.
  • Ah so are you saying I need to create an additional object that is the decoder to fit between the IMFSourceReader and IMFTransform/Resampler? I wanted to construct it at the lowest level so I could also get access to the uncompressed audio data as it streams through efficiently - which is why I didn't look at the Session API (Plus it is a little confusing what is possible where!). – iam Oct 29 '20 at 16:37
  • I am still confused after reading: https://learn.microsoft.com/en-us/windows/win32/medfound/configuringaudiodecoding. As I am not sure how to decide on or create a suitable decoder object? Do I need to enumerate a list of suitable ones somehow rather than assume specific ones? I can see an example in the docs with a specific CLSID_CWMV9EncMediaObject (but obviously I want to decode). I'm not sure how I would 'take system decoder and ask it separately for output media type'? – iam Oct 29 '20 at 16:48
  • I think this was the step I have been missing: https://learn.microsoft.com/en-us/windows/win32/api/mfapi/nf-mfapi-mftenumex, and maybe some info regards topology connections here: https://learn.microsoft.com/en-us/windows/win32/medfound/adding-a-decoder-to-a-topology – iam Oct 29 '20 at 17:09
  • This has proven quite helpful: https://www.codeproject.com/Articles/501521/How-to-convert-between-most-audio-formats-in-NET – iam Oct 30 '20 at 17:19
  • Using MFTEnumEx() I can find 1 decoder for MFAudioFormat_AAC to PCM/float but nothing for MP3 on my Win7 PC. Is there no default MP3 decoder for Win7? Meanwhile I will try and piece it together around an AAC test and report back to help form a more complete answer – iam Oct 30 '20 at 17:23
  • This gave me the specific answer to my problem: https://www.gamedev.net/articles/programming/general-and-gameplay-programming/decoding-audio-for-xaudio2-with-microsoft-media-foundation-r4280/ – iam Oct 31 '20 at 06:24
  • I needed on detection of a compressed media type in the native type of the source reader, to create a new partial mediaType that had a major/subtype of MFMediaType_Audio/MFAudioFormat_PCM, set that as the SetCurrentMediaType() on the source reader, then get the resultant concrete mediaType back from the source reader with GetCurrentMediaType(), which then gives one that works with SetInputType() on the resampler, and allows SetOutputType() to succeed with the WASAPI media type. That wasn't at all obvious that you could do that. – iam Oct 31 '20 at 06:34
  • If you want to incorporate that into your answer I will mark it as the accepted one. – iam Oct 31 '20 at 06:35
  • This shows an example of setting the partial type that I missed: https://learn.microsoft.com/en-us/windows/win32/medfound/tutorial--decoding-audio – iam Oct 31 '20 at 06:45
  • Partial media type is one of the ways (which I would not use). Windows 7 does have a discoverable decoder for `MFAudioFormat_MP3` and `MFTEnum` can pick it up. – Roman R. Oct 31 '20 at 07:48
  • Oh why wouldn't you use that way? No matter what I pass to MFTEnumEx() on Win7 I cannot get any decoder count for MFAudioFormat_MP3? – iam Oct 31 '20 at 07:56
  • I don't know - you did not show code. After all you get the decoder, so it exists and it has to be discoverable then. – Roman R. Oct 31 '20 at 08:06