
This is a follow-on to this question about using the DX11VideoRenderer sample (a replacement for the EVR that uses DirectX 11 instead of the EVR's DirectX 9).

I've been trying to track down why it uses so much more CPU than the EVR. Task Manager shows me that most of that time is kernel mode.

Using profiling tools, I see that a LOT of time is being spent in numerous calls to NtDelayExecution (aka Sleep). How many calls? ~100,000 over the course of ~12 seconds. Ok, yeah, I'm sending a lot of frames in those 12 seconds, but that's still a lot of calls, every one of which requires a kernel mode transition.

The call stack shows the last call in "my" code is to IDXGISwapChain1::Present(0, 0). The actual call seems to be Sleep(0) and comes from nvwgf2umx.dll (which is why this question is tagged NVidia: hopefully someone there can call up the code and see what the logic is behind such frequent calls).

I couldn't quite figure out why it would need to do /any/ Sleeping during Present. It's not like we wait for vertical retrace anymore, is it? But the other reason to use Sleep has to do with yielding to other threads. Which led me to a serious clue:

If I use D3D11_CREATE_DEVICE_PREVENT_INTERNAL_THREADING_OPTIMIZATIONS, the CPU utilization drops. Along with some other fixes, the DX11 version is now faster and uses less CPU time than the DX9 version (which is what I would hope/expect). Profiling shows that Sleep has dropped from >30% to <1%.
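For reference, a minimal sketch of the change, not the sample's actual code. `BuildDeviceFlags` and `CreateRendererDevice` are hypothetical helper names, and starting from `D3D11_CREATE_DEVICE_VIDEO_SUPPORT` is an assumption about what the renderer already passes. The non-Windows branch only exists so the flag logic can be compiled elsewhere; its constants mirror the values in d3d11.h.

```cpp
#ifdef _WIN32
#include <d3d11.h>
#pragma comment(lib, "d3d11.lib")
#else
// Stand-ins for compiling the flag logic off Windows (values as in d3d11.h).
typedef unsigned int UINT;
const UINT D3D11_CREATE_DEVICE_PREVENT_INTERNAL_THREADING_OPTIMIZATIONS = 0x8;
const UINT D3D11_CREATE_DEVICE_VIDEO_SUPPORT = 0x800;
#endif

// Hypothetical helper: builds the creation flags, optionally asking the
// runtime/driver not to create its internal worker threads.
UINT BuildDeviceFlags(bool preventThreadingOptimizations)
{
    UINT flags = D3D11_CREATE_DEVICE_VIDEO_SUPPORT;  // assumed baseline
    if (preventThreadingOptimizations)
        flags |= D3D11_CREATE_DEVICE_PREVENT_INTERNAL_THREADING_OPTIMIZATIONS;
    return flags;
}

#ifdef _WIN32
// Hypothetical helper: creates a hardware device with the chosen flags.
HRESULT CreateRendererDevice(bool preventThreadingOptimizations,
                             ID3D11Device** device,
                             ID3D11DeviceContext** context)
{
    const D3D_FEATURE_LEVEL levels[] = {
        D3D_FEATURE_LEVEL_11_0,
        D3D_FEATURE_LEVEL_10_1,
        D3D_FEATURE_LEVEL_10_0,
    };
    D3D_FEATURE_LEVEL obtained = D3D_FEATURE_LEVEL_10_0;

    return D3D11CreateDevice(
        nullptr,                    // default adapter
        D3D_DRIVER_TYPE_HARDWARE,
        nullptr,                    // no software rasterizer module
        BuildDeviceFlags(preventThreadingOptimizations),
        levels, ARRAYSIZE(levels),
        D3D11_SDK_VERSION,
        device, &obtained, context);
}
#endif
```

Calling `CreateRendererDevice(false, ...)` and `CreateRendererDevice(true, ...)` in otherwise identical builds would be one way to compare the Sleep counts with and without the flag.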

Unfortunately, this page tells me:

> This flag is not recommended for general use.

Oh.

So, any ideas on how to get decent performance without using debug flags?

Acorn
David Wohlferd
  • Are you using D3D11CreateDevice with D3D_DRIVER_TYPE_HARDWARE? Have you tried D3D_DRIVER_TYPE_WARP? Also, have you tried varying the settings (XVP/DComp)? Or can you test with different hardware? Maybe it's a driver issue. You can also enable the debug layer to see if there are any messages. – Simon Mourier Nov 06 '20 at 14:57
  • @SimonMourier - All good suggestions. Yes, D3D_DRIVER_TYPE_HARDWARE. I tried changing to WARP, but the code seems based around HARDWARE (ie IDXGIAdapter::EnumOutputs won't work and no ID3D11VideoDevice). I tried adding the D3D11_CREATE_DEVICE_DEBUG flag. While it ran, it's not producing -any- (additional) output in the VS output window. Maybe I'm doing something wrong? The default behavior of the code is noDComp. Changing to DComp has no effect. There is a `useXVP` option in the code. Toggling this has no effect. I'll keep trying. I may post to nvidia.com too. – David Wohlferd Nov 06 '20 at 22:16
  • Sleep(0)'s very ineffective at yielding to other threads, especially if you are presenting from the application's main thread. It's usually Above Normal priority and 0 is a special case that does not yield to anything but other Above Normal priority threads. – Kaldaien Jan 29 '23 at 19:24
  • @Kaldaien I know. But the call to Sleep is NOT in my code. It's somewhere in the guts of nvwgf2umx.dll, which is provided by NVidia. Hopefully sometime in the last 2+ years they've fixed this, but I haven't checked lately. – David Wohlferd Jan 30 '23 at 07:55

0 Answers