4

I would like to use DirectX 12 to load each frame of an H264 file into a texture and render it. There is, however, little to no information on doing this, and the Microsoft website has only limited, superficial documentation.

Media Foundation has plenty of examples and offers hardware-accelerated decoding. Is Media Foundation a wrapper around DirectX, or is it doing something else?

If not, how much less optimised would the Media Foundation equivalent be in comparison to a DX 12 approach?

Essentially, what are the big differences between Media Foundation and DirectX12 Video Decoding?

I am already using DirectX 12 in my engine so this is specifically regarding DX12.

Thanks in advance.

Roman R.
pma07pg
  • In the case of Media Foundation, what it's doing is using transforms (MFTs), like plugins if you will (encoder, decoder, converter, etc.). In general, it will automatically use the hardware-enabled transform provided with the graphics card driver (basically AMD, Intel, or NVIDIA). You can also use them explicitly (or use the Windows software one). These transforms are in general themselves wrappers for MF, using internal vendor binaries. I've not used DirectX 12 video, and from the docs it's not clear what it's using internally – Simon Mourier Nov 11 '20 at 07:56
  • Also in the MF case, you can usually connect the transform with DirectX, so everything can happen in hardware (RGB video -> nv12 -> h264): https://learn.microsoft.com/en-us/windows/win32/medfound/direct3d-aware-mfts and again, there's nothing here in that doc about DirectX12 – Simon Mourier Nov 11 '20 at 07:58
  • Thanks! Yeah that makes sense regarding the transforms. Unfortunately Surfaces don't exist as a concept in DirectX12, meaning the D3D Aware stuff described in the article can't be used. I think for the time being I'll just go for the Media Foundation Hardware transform. There's just not enough to go on for DX12 video really. – pma07pg Nov 11 '20 at 09:17

2 Answers

4

Hardware video decoding comes from the DXVA (DXVA2) API. Its Direct3D 11 evolution is the D3D11 Video Device part of the D3D11 API. Microsoft provides wrappers over hardware-accelerated decoders in the form of Media Foundation API primitives, such as the H.264 Video Decoder. This decoder offers hardware decoding capabilities as well as a fallback to software decoding.

Note that even though Media Foundation is available for UWP development, your options there are limited and you are not offered primitives like the mentioned transform directly. However, if you use higher-level APIs (the Media Foundation Source Reader API in particular), you can leverage hardware-accelerated video decoding in your UWP application.

The Media Foundation implementation offers interoperability with Direct3D 11, in video encoding/decoding in particular, but not with Direct3D 12. You will not be able to use Media Foundation and DirectX 12 together out of the box. One option is to implement Direct3D 11/12 interop to transfer the data between the APIs (or, where applicable, use shared access to the same GPU data).

Alternatively, you will have to step down to the underlying ID3D12VideoDevice::CreateVideoDecoder, which is a further evolution of the mentioned DXVA2 and Direct3D 11 video decoding APIs, with similar usage.

Unfortunately, while Media Foundation is notorious for poor documentation and hard-to-start development, Direct3D 12 video decoding has zero information, and you will have to enjoy the feeling of being a pioneer.

Either way, all the mentioned APIs are relatively thin wrappers over the hardware-assisted video decoding implementation, with the same great performance. I would recommend taking the Media Foundation path and implementing 11/12 interop if/when it becomes necessary.

Roman R.
  • One interesting thing is the "Direct3D 12 Video APIs" chapter is under "Microsoft Media Foundation". I've not tested, but I think IMFDXGIDeviceManager::ResetDevice supports an ID3D12Device although it's not documented officially. – Simon Mourier Nov 11 '20 at 09:52
  • @SimonMourier I think it's unlikely. Support for D3D12 in Media Foundation will need to start with definition of `MF_SA_D3D12_AWARE` similarly to [`MF_SA_D3D11_AWARE`](https://learn.microsoft.com/en-us/windows/win32/medfound/mf-sa-d3d11-aware). We don't see it yet. – Roman R. Nov 11 '20 at 09:59
  • Maybe it just works with MF_SA_D3D_AWARE or MF_SA_D3D11_AWARE: https://i.imgur.com/M2a2DpK.png – Simon Mourier Nov 11 '20 at 10:16
  • MF_SA_D3D_AWARE is for D3D9 and is mostly obsolete. D3D12 should have its own explicit indication, especially since it costs nothing to add. My assumption is that MF+D3D12 is nowhere near public availability. – Roman R. Nov 11 '20 at 10:20
0

You will get a lot of D3D12 errors caused by Media Foundation if you pass a D3D12 device to IMFDXGIDeviceManager::ResetDevice.

The errors can be avoided if you call IMFSourceReader::ReadSample slowly enough. It doesn't matter whether you use this method in sync or async mode. How slow is slow enough depends on the machine running the program: on my machine I use ::Sleep(1) between ReadSample calls for sync mode playing a stream from the network, and ::Sleep(3) for sync mode playing a local mp4 file.

Don't ask who I am. My name is 'the pioneer'.

Giovedi
  • Yes, you get these errors because you need to wait for the command allocator to finish processing commands before calling reset. With DX11 and earlier, this was done for you but with DX12, this power is in your hands! Calling sleep will only work most of the time and is definitely not a production solution. – pma07pg Dec 08 '20 at 14:44
  • Right, I see. Did you figure out how/where to wait for the command executions, since those command allocators are created by Media Foundation internally? – Giovedi Dec 08 '20 at 15:18
  • The command allocator/list stuff is part of the DX12 API for sending commands to the GPU. If you're only using the SourceReader with a DXGIManager then that is abstracted away and called behind the scenes. There is information on how to use DX12 Video Acceleration on the MSFT website but it's not really helpful. – pma07pg Dec 08 '20 at 15:30