
I'm trying to capture desktop frames using the Desktop Duplication API and encode them right away with NvPipe, without going through CPU access to the pixels.

Is there any way to use the ID3D11Texture2D data as input for NvPipe, or some other efficient way of doing this? I'm working on a VR solution that requires latency as low as possible, so even 1 ms saved is a big deal.

Edit: After following the recommendations from @Soonts, I've ended up with this code, which doesn't seem to work:

    cudaArray *array;
    // Copy the duplicated desktop frame into the texture that gets registered with CUDA.
    m_DeviceContext->CopySubresourceRegion(CopyBuffer, 0, 0, 0, 0, m_SharedSurf, 0, Box);
    // Register the texture for CUDA interop and request read-only access.
    cudaError_t err = cudaGraphicsD3D11RegisterResource(&_cudaResource, CopyBuffer, cudaGraphicsRegisterFlagsNone);
    err = cudaGraphicsResourceSetMapFlags(_cudaResource, cudaGraphicsMapFlagsReadOnly);
    cudaStream_t cuda_stream;
    cudaStreamCreate(&cuda_stream);
    // Map the resource and fetch the underlying cudaArray.
    err = cudaGraphicsMapResources(1, &_cudaResource, cuda_stream);
    err = cudaGraphicsSubResourceGetMappedArray(&array, _cudaResource, 0, 0);
    // This call crashes with a memory access violation.
    uint64_t compressedSize = NvPipe_Encode(encoder, array, dataPitch, buffer.data(), buffer.size(), width, height, false);

The NvPipe_Encode call results in a memory access violation and does nothing. I don't know which step I'm messing up: I can't find any documentation for these functions and structures online, and putting watches on the variables shows nothing useful beyond their addresses in memory.

Hey'Youssef

1 Answer


I have not tried it, but I think this should be doable with CUDA interop.

Call cudaGraphicsD3D11RegisterResource to register a texture for CUDA interop. I'm not sure you can register the DWM-owned texture you get from DD. If you can't, create another texture (D3D11_USAGE_DEFAULT), register that one, and update it each frame with ID3D11DeviceContext::CopyResource.

Call cudaGraphicsResourceSetMapFlags to specify you want read-only access from CUDA side of the interop.

Call cudaGraphicsMapResources to allow CUDA to access the texture.

Call cudaGraphicsSubResourceGetMappedArray. Now you have the frame data mapped on the CUDA side, which you can feed to NvPipe.
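
Putting the steps together, something along these lines might work (untested sketch; the NvPipe encoder is assumed to have been created for 32-bit RGBA frames of the same width and height). One caveat: cudaGraphicsSubResourceGetMappedArray hands back a cudaArray, which is an opaque handle rather than linear memory, so the sketch copies it into a plain device buffer before calling NvPipe_Encode. Also, DD frames are usually B8G8R8A8 while NvPipe's 32-bit format is RGBA, so the red and blue channels may come out swapped; that's a separate issue from any crash.

    // Untested sketch. One-time setup: create a default-usage copy of the duplicated
    // frame and register it with CUDA. 'desktopDesc' is the D3D11_TEXTURE2D_DESC of
    // the frame texture that Desktop Duplication returns.
    #include <d3d11.h>
    #include <cuda_runtime.h>
    #include <cuda_d3d11_interop.h>
    #include <vector>
    #include <cstdint>
    #include "NvPipe.h"

    ID3D11Texture2D* CreateCopyTexture(ID3D11Device* device, D3D11_TEXTURE2D_DESC desktopDesc)
    {
        desktopDesc.Usage = D3D11_USAGE_DEFAULT;
        desktopDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
        desktopDesc.CPUAccessFlags = 0;
        desktopDesc.MiscFlags = 0;
        ID3D11Texture2D* copyTex = nullptr;
        device->CreateTexture2D(&desktopDesc, nullptr, &copyTex);
        return copyTex;
    }

    cudaGraphicsResource* RegisterTexture(ID3D11Texture2D* copyTex)
    {
        cudaGraphicsResource* res = nullptr;
        if (cudaGraphicsD3D11RegisterResource(&res, copyTex, cudaGraphicsRegisterFlagsNone) != cudaSuccess)
            return nullptr;
        cudaGraphicsResourceSetMapFlags(res, cudaGraphicsMapFlagsReadOnly);
        return res;
    }

    // Per frame: map the registered texture, copy the cudaArray into linear device
    // memory, and hand that pointer to NvPipe (it accepts device or host memory).
    bool EncodeFrame(cudaGraphicsResource* res, NvPipe* encoder,
                     uint32_t width, uint32_t height,
                     std::vector<uint8_t>& compressed, uint64_t& compressedSize)
    {
        if (cudaGraphicsMapResources(1, &res, 0) != cudaSuccess)
            return false;

        cudaArray_t array = nullptr;
        cudaGraphicsSubResourceGetMappedArray(&array, res, 0, 0);

        // NvPipe_Encode wants linear memory; a cudaArray is not. In real code this
        // buffer would be allocated once and reused.
        const size_t pitch = size_t(width) * 4;   // 4 bytes per pixel
        void* linear = nullptr;
        cudaMalloc(&linear, pitch * height);
        cudaMemcpy2DFromArray(linear, pitch, array, 0, 0, pitch, height,
                              cudaMemcpyDeviceToDevice);

        compressedSize = NvPipe_Encode(encoder, linear, pitch,
                                       compressed.data(), compressed.size(),
                                       width, height, false);

        cudaFree(linear);
        cudaGraphicsUnmapResources(1, &res, 0);
        return compressedSize > 0;
    }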

P.S. Another option is using Media Foundation instead of NvPipe. It will work on all GPUs, not just NVIDIA, and on most systems MF also uses hardware encoders. I'm not sure about latency; I've never used MF for anything too latency-sensitive, and I've never used NvPipe at all, so I have no idea how they compare.
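
If you go that way, the starting point is enumerating a hardware H.264 encoder MFT. A minimal, untested sketch (the rest of the encoder setup such as media types and the DXGI device manager for D3D11 textures is not shown):

    // Untested sketch: find hardware H.264 encoder MFTs with Media Foundation.
    // Call MFStartup(MF_VERSION) once before using any MF APIs.
    #include <mfapi.h>
    #include <mftransform.h>
    #pragma comment(lib, "mfplat.lib")
    #pragma comment(lib, "mfuuid.lib")

    HRESULT FindHardwareH264Encoders(IMFActivate*** pppActivate, UINT32* pCount)
    {
        MFT_REGISTER_TYPE_INFO outputType = { MFMediaType_Video, MFVideoFormat_H264 };
        return MFTEnumEx(MFT_CATEGORY_VIDEO_ENCODER,
                         MFT_ENUM_FLAG_HARDWARE | MFT_ENUM_FLAG_SORTANDFILTER,
                         nullptr,        // accept any input type
                         &outputType,    // require H.264 output
                         pppActivate, pCount);
    }

From there you would activate one of the returned IMFActivate objects into an IMFTransform and feed it samples; for GPU textures, MFCreateDXGISurfaceBuffer can wrap an ID3D11Texture2D without a CPU copy.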

Soonts
  • That's very good information, I'm trying it right away. It seems like it will do the trick. I'll try Media Foundation as well, since "works on all GPUs" sounds attractive. Will update you on the progress. Thanks! – Hey'Youssef May 30 '19 at 16:30
  • @Hey'Youssef The FPS of the stream is tricky to get right, regardless of the encoding API. DD doesn't give you 60 Hz frames; it only produces a new frame when Windows renders something. One approach is keeping the old frame in VRAM and submitting frames to the encoder at a fixed rate (see the sketch after these comments). – Soonts May 30 '19 at 16:54
  • I edited my question with my progress; if you could take a look at it, I'd be extremely grateful. – Hey'Youssef May 31 '19 at 21:44
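
A rough, untested sketch of that fixed-rate approach: keep the latest frame in a VRAM copy and hand it to the encoder on a fixed cadence, whether or not DD produced a new frame. Here duplication, context, copyTex, and EncodeLatestFrame are placeholders standing in for the rest of the pipeline described above.

    // Untested sketch of fixed-rate submission with Desktop Duplication.
    #include <chrono>
    #include <thread>
    #include <d3d11.h>
    #include <dxgi1_2.h>

    void EncodeLatestFrame(ID3D11Texture2D* tex);   // the CUDA interop + NvPipe path above

    void CaptureLoop(IDXGIOutputDuplication* duplication, ID3D11DeviceContext* context,
                     ID3D11Texture2D* copyTex, bool& running)
    {
        using clock = std::chrono::steady_clock;
        const auto frameInterval = std::chrono::microseconds(16667);   // ~60 Hz target
        auto nextDeadline = clock::now() + frameInterval;

        while (running)
        {
            // Poll DD for a new frame; a timeout just means the desktop didn't change.
            DXGI_OUTDUPL_FRAME_INFO info = {};
            IDXGIResource* resource = nullptr;
            HRESULT hr = duplication->AcquireNextFrame(0, &info, &resource);
            if (SUCCEEDED(hr))
            {
                ID3D11Texture2D* frameTex = nullptr;
                resource->QueryInterface(IID_PPV_ARGS(&frameTex));
                context->CopyResource(copyTex, frameTex);   // keep the newest frame in VRAM
                frameTex->Release();
                resource->Release();
                duplication->ReleaseFrame();
            }

            // Submit the most recent copy to the encoder at a fixed rate.
            if (clock::now() >= nextDeadline)
            {
                EncodeLatestFrame(copyTex);
                nextDeadline += frameInterval;
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }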