
I implemented an encoder in 2 ways.

1) based on the SDK Transcoder example, which uses a topology and a transcoding profile

2) based on IMFSourceReader and IMFSinkWriter, where the source reader delivers the samples to the sink writer for transcoding

I tested both implementations on Windows 8.1 with an Nvidia GPU (Quadro K2200) and an Intel GPU (P4600/P4700).

But bizarrely, only the topology implementation uses the GPU (on both machines).

In 2) I set "MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS" on both, which I guess should not be necessary, because 1) uses the GPU with and without this flag set for the container type.

Is there a trick to enable the GPU with IMFSinkWriter, or is this a bug in Media Foundation?

Passer
  • What are the formats involved (source / sink)? – Jeff May 04 '15 at 18:47
  • To keep it simple, I always run my tests with 2 test scenarios: 1) PAL WMV -> 1920x1080 H264 MP4 8 Mbit, 2) 1920x1080 H264 MP4 4 Mbit -> 1920x1080 H264 MP4 8 Mbit. In the meantime a second problem has appeared: when using the source->sink method, a green line appears along the top edge, and some parts of the picture do not look right. – Passer May 09 '15 at 05:51
  • The green line was a problem with setting the decoder media type to NV12 instead of YUY2. HW decoding/encoding still doesn't work. – Passer May 19 '15 at 09:16

1 Answer


I initially ran into the same issue. You don't mention how you configured the output type of the source reader (and the input type of the sink), but I found that if you let the system handle the conversion (by setting the output type of the reader to RGB32), the performance is horrible and entirely CPU bound. (Error checking omitted for brevity.)

// ask the source reader to convert to RGB32 itself (the slow, CPU-bound path)
hr = videoMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);
hr = videoMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32);
hr = reader->SetCurrentMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, nullptr, videoMediaType.Get());
hr = reader->SetStreamSelection((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, TRUE);

The documentation agrees, indicating that this configuration is useful for getting a single snapshot from the video. If you instead configure the reader to deliver the native media type, performance is excellent, but you now have to transform the format yourself.

// fetch the reader's native (unconverted) media type into the ComPtr
hr = reader->GetNativeMediaType((DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, videoMode->GetIndex(), &videoMediaType);

From here, if you are dealing with a simple color conversion (such as YUY2 or another YUV format from a webcam), there are a few options. I originally tried writing my own converter and pushing it off to the GPU using HLSL with DirectCompute. This works very well, but in your case the format isn't as trivial.
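For illustration, the arithmetic that such a hand-written YUY2 converter performs can be sketched on the CPU in portable C++. This is not the HLSL/DirectCompute version mentioned above, only a minimal reference under stated assumptions: the function name yuy2ToRgb32 is hypothetical, rows are assumed to be tightly packed (no stride padding), the output is written top-down in B, G, R, X byte order, and the integer coefficients are the commonly used BT.601 studio-range approximation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamp an intermediate result to the 0..255 range of one byte.
static std::uint8_t clampToByte(int v) {
    return static_cast<std::uint8_t>(v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Convert one Y/U/V triple to a B, G, R, X pixel (BT.601 studio range,
// integer approximation; an assumption, not taken from the original post).
static void writePixel(std::uint8_t* dst, int y, int u, int v) {
    const int c = y - 16, d = u - 128, e = v - 128;
    dst[0] = clampToByte((298 * c + 516 * d + 128) / 256);           // blue
    dst[1] = clampToByte((298 * c - 100 * d - 208 * e + 128) / 256); // green
    dst[2] = clampToByte((298 * c + 409 * e + 128) / 256);           // red
    dst[3] = 0xFF;                                                   // unused byte
}

// Hypothetical helper: converts a tightly packed YUY2 frame to RGB32.
// YUY2 stores two pixels in four bytes: Y0 U Y1 V (chroma is shared).
std::vector<std::uint8_t> yuy2ToRgb32(const std::uint8_t* src,
                                      std::size_t width, std::size_t height) {
    assert(width % 2 == 0);  // YUY2 always encodes pixels in pairs
    std::vector<std::uint8_t> dst(width * height * 4);
    const std::size_t pairs = width * height / 2;
    for (std::size_t i = 0; i < pairs; ++i) {
        const std::uint8_t* p = src + i * 4;
        std::uint8_t* q = dst.data() + i * 8;
        writePixel(q, p[0], p[1], p[3]);      // first pixel of the pair
        writePixel(q + 4, p[2], p[1], p[3]);  // second pixel, same chroma
    }
    return dst;
}
```

Calling yuy2ToRgb32(frameData, 1920, 1080) on a locked buffer would yield a 1920x1080x4 byte vector. A real pipeline would also need to respect the stride reported for the media type, which this sketch ignores.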

Ultimately, creating and configuring an instance of the color converter (as an IMFTransform) works perfectly.

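// the color converter is used through IMFTransform, so the ComPtr must hold that interface
Microsoft::WRL::ComPtr<IMFTransform> mediaTransform;
hr = ::CoCreateInstance(CLSID_CColorConvertDMO, nullptr, CLSCTX_INPROC_SERVER, __uuidof(IMFTransform), reinterpret_cast<void**>(mediaTransform.GetAddressOf()));

// set the input type of the transform to the NATIVE output type of the reader
// (the output type must also be set with SetOutputType before processing; not shown here)
hr = mediaTransform->SetInputType(0u, videoMediaType.Get(), 0u);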

Create and configure a separate sample and buffer.

IMFSample* transformSample = nullptr;
IMFMediaBuffer* transformBuffer = nullptr;
hr = ::MFCreateSample(&transformSample);
hr = ::MFCreateMemoryBuffer(RGB_MFT_OUTPUT_BUFFER_SIZE, &transformBuffer);
hr = transformSample->AddBuffer(transformBuffer);

// describe the output sample that ProcessOutput will fill
MFT_OUTPUT_DATA_BUFFER* transformDataBuffer;
transformDataBuffer = new MFT_OUTPUT_DATA_BUFFER();
transformDataBuffer->pSample = transformSample;
transformDataBuffer->dwStreamID = 0u;
transformDataBuffer->dwStatus = 0u;
transformDataBuffer->pEvents = nullptr;
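
The RGB_MFT_OUTPUT_BUFFER_SIZE constant is not defined in the snippet above. As a rough sketch of how such a size could be derived for uncompressed frames, the helper below computes the byte count for the three formats mentioned in this thread. The names PixelLayout and frameBytes are hypothetical, and the arithmetic assumes tightly packed rows without stride padding. In real code it is probably safer to take the size from IMFTransform::GetOutputStreamInfo or MFCalculateImageSize instead.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enumeration of the uncompressed layouts discussed here.
enum class PixelLayout { Rgb32, Yuy2, Nv12 };

// Bytes needed for one tightly packed frame (no stride padding assumed).
std::size_t frameBytes(PixelLayout layout, std::size_t width, std::size_t height) {
    switch (layout) {
    case PixelLayout::Rgb32:
        return width * height * 4;      // 4 bytes per pixel
    case PixelLayout::Yuy2:
        assert(width % 2 == 0);         // packed 4:2:2, pixels come in pairs
        return width * height * 2;      // 2 bytes per pixel on average
    case PixelLayout::Nv12:
        assert(width % 2 == 0 && height % 2 == 0);
        return width * height * 3 / 2;  // full Y plane + half-size UV plane
    }
    return 0;  // unreachable for valid enum values
}
```

For a 1920x1080 RGB32 frame this gives 8,294,400 bytes, which is the order of magnitude the output buffer of the color converter would need under these assumptions.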

When receiving samples from the source, hand them off to the transform to be converted.

DWORD outStatus = 0u;
IMFMediaBuffer* mediaBuffer = nullptr;
hr = mediaTransform->ProcessInput(0u, sample, 0u);
hr = mediaTransform->ProcessOutput(0u, 1u, transformDataBuffer, &outStatus);
hr = transformDataBuffer->pSample->GetBufferByIndex(0, &mediaBuffer);

Then, of course, hand the transformed sample off to the sink just as you do today. I am confident that this will work, and you will be a very happy person. For me, CPU utilization went from 20% (original implementation) down to 2% (I am concurrently displaying the video). Good luck. I hope you enjoy your project.

Jeff