I'm using Desktop Duplication to copy the contents of the screen to a bitmap in memory. I receive the desktop texture, create a staging texture, use CopyResource to copy the desktop texture into the staging texture, and finally call ID3D11DeviceContext::Map to access the staging texture's bits and copy them. This is pretty much the approach described here: https://stackoverflow.com/a/27283837/825318

The problem is that the Map call takes a long time: at large display resolutions such as 4K it can take up to 100 ms per call, which is unacceptably slow because I need to sustain 30 fps.

Is there any way to read the contents of the texture faster? If not, is there any way to supply my own destination pointer so that the system copies the texture data there? Thanks

Isso
  • The [ID3D11DeviceContext::Map](https://msdn.microsoft.com/en-us/library/windows/desktop/ff476457(v=vs.85).aspx) documentation discusses the performance penalty and suggests appropriate strategies at the end of the page. One of them is to use a volatile pointer. –  Nov 25 '16 at 16:12
  • Thanks, however that applies only to write-only surfaces, while my task is to read data from the surface. – Isso Nov 25 '16 at 16:23

2 Answers

Most likely the Map call is waiting for the CopyResource to complete. You can use GPUView to verify this. If that is the case, the recommended solution is to accept some latency in exchange for frame rate. The way to do this is to keep at least 3 staging textures and rotate between them in the following manner:

frame#1 - start copy resource to staging texture #1

frame#2 - start copy resource to staging texture #2

frame#3 - start copy resource to staging texture #3 and map staging texture #1 to access data

frame#4 - start copy resource to staging texture #1 (old contents should already be saved) and map staging texture #2 to access data

This way you can sustain 30 FPS at the cost of ~130 ms of latency, which is acceptable for most applications.
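The rotation above can be sketched in C++ as a small index helper. This is only an illustration: `StagingRing` and its members are hypothetical names, not part of Direct3D, and the actual D3D11 calls appear only as comments so the sketch compiles without the Windows SDK. It assumes exactly three staging textures and one copy per frame.

```cpp
#include <cstddef>

// Hypothetical helper (not a Direct3D type): picks which staging texture to
// copy into and which one to map on each frame, following the rotation above.
struct StagingRing {
    static constexpr std::size_t kCount = 3;  // "at least 3" staging textures
    std::size_t frame = 0;                    // zero-based frame counter

    // Slot that receives this frame's CopyResource.
    std::size_t copySlot() const { return frame % kCount; }

    // False for the first kCount - 1 frames, while the ring is still filling.
    bool canMap() const { return frame >= kCount - 1; }

    // Oldest slot still in flight: the one copied kCount - 1 frames ago.
    std::size_t mapSlot() const { return (frame + 1) % kCount; }

    void advance() { ++frame; }
};

// Per-frame usage on Windows (error handling omitted; `ctx`, `staging` and
// `desktopTexture` are assumed to exist already):
//
//   ctx->CopyResource(staging[ring.copySlot()], desktopTexture);
//   if (ring.canMap()) {
//       D3D11_MAPPED_SUBRESOURCE mapped;
//       ctx->Map(staging[ring.mapSlot()], 0, D3D11_MAP_READ, 0, &mapped);
//       // ... copy rows out using mapped.pData and mapped.RowPitch ...
//       ctx->Unmap(staging[ring.mapSlot()], 0);
//   }
//   ring.advance();
```

Because the mapped slot is always the one whose copy was issued two frames earlier, the GPU has most likely finished with it by the time Map is called, so the call should rarely block.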

moradin
  • Thanks! Unfortunately I need minimum latency, so your solution isn't really solving my issue, however it's a smart approach and GPUView seems to be a very useful utility. Thus I'm awarding the bounty to you. – Isso Dec 02 '16 at 23:43
  • If you want minimum latency then you basically need to synchronize the CPU with the GPU to guarantee that the texture is read as soon as it is copied. This means that the GPU and the CPU would not work in parallel, which is exactly what you were seeing in your original question. If I'm missing something then please let me know, but it looks like you are trying to satisfy two opposing requirements. – moradin Dec 13 '16 at 13:23
  • Could you show me how to synchronize the CPU and GPU in this manner? – ktb92677 Feb 09 '19 at 01:14
  • You just need to do exactly what the original question was describing. Call the Map function for read and it will make sure that all the work is finished on the GPU essentially synchronizing the CPU with it. – moradin Feb 10 '19 at 14:40
  • But like he mentioned, the Map function is taking a really long time for me. Is there any other way to synchronize the CPU and GPU that isn't as time-consuming as the Map function? – ktb92677 Feb 11 '19 at 19:09
  • The synchronization is the thing that takes time. The CPU needs to wait for the GPU to finish the rendering and copying work. You can't synchronize without paying the cost. – moradin Feb 12 '19 at 21:01

I had the same issue, but because I am using DXGI desktop duplication, none of the methods suggested here or by Microsoft apply, as there is no "render thread".

The solution for both cases, however, which is far better than the other suggestions because it keeps latency to the absolute minimum, is to try to map with D3D11_MAP_FLAG_DO_NOT_WAIT and check for DXGI_ERROR_WAS_STILL_DRAWING. You can do this in a tight loop and it won't stall the pipeline. To avoid consuming too much CPU time, put a sleep/delay into the loop.
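A minimal sketch of such a polling loop follows. The helper `mapWhenReady`, the `HResult` alias, the `kOk`/`kStillDrawing` constants, and the timeout parameter are all illustrative assumptions rather than Direct3D API; the constants mirror `S_OK` and `DXGI_ERROR_WAS_STILL_DRAWING` so the code compiles without the Windows SDK. The real Map call is passed in as a callable.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>

// Stand-ins for HRESULT values so the sketch builds without the Windows SDK.
using HResult = std::int32_t;
constexpr HResult kOk = 0;  // S_OK
// DXGI_ERROR_WAS_STILL_DRAWING is 0x887A000A.
constexpr HResult kStillDrawing = static_cast<HResult>(0x887A000Au);

// Hypothetical helper: calls tryMap until it stops reporting "still drawing"
// or the timeout expires, sleeping briefly between attempts.
template <typename TryMap>
HResult mapWhenReady(TryMap tryMap,
                     std::chrono::microseconds pause,
                     std::chrono::milliseconds timeout) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        const HResult hr = tryMap();
        if (hr != kStillDrawing) return hr;  // mapped, or a real error
        if (std::chrono::steady_clock::now() >= deadline) return hr;
        std::this_thread::sleep_for(pause);  // avoid burning a CPU core
    }
}

// Usage on Windows (`ctx`, `staging` and `mapped` are assumed to exist):
//
//   HRESULT hr = mapWhenReady(
//       [&] {
//           return ctx->Map(staging, 0, D3D11_MAP_READ,
//                           D3D11_MAP_FLAG_DO_NOT_WAIT, &mapped);
//       },
//       std::chrono::microseconds(100), std::chrono::milliseconds(50));
```

The pause length is a trade-off: a shorter sleep reduces added latency but uses more CPU time, and the operating system's timer resolution may make very short sleeps longer than requested.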

Geoffrey