I'm getting a DXGI_ERROR_DEVICE_HUNG crash. I can make it go away by removing any one of the following three things:

  • The DirectML work.
  • An ID3D12GraphicsCommandList::CopyResource call whose destination is the readback buffer and whose source is downstream of the output of the DirectML work.
  • The copy from the mapped readback buffer into a vector.

Edit: it turns out that my synchronization was actually fine. I've even stopped frame-buffering the render work, and I was already buffering the readback, so I'm as sure as I can be that the GPU writes to the readback buffer and the CPU reads from it are not happening at the same time.

I've stopped persistently mapping the readback buffers, and now call Map for each read.

All to no avail. No debug messages. No useful DRED info.

Tom Huntington
  • 1
    Is your readback buffer mapped? Because CopyResource arguments cannot be currently mapped. – mateeeeeee Mar 23 '23 at 15:42
  • Thanks @mateeeeeee, it is mapped. The [CopyResource docs](https://learn.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12graphicscommandlist-copyresource) do say resources can't be mapped, but I thought I was instructed to leave resources mapped in DX12. Hmmm – Tom Huntington Mar 23 '23 at 21:30
  • 1
    I would say the advice is keep them mapped if possible, for example, you can keep constant buffers or more generally, upload buffers mapped all the time. You can also keep mapped readback buffers in some scenarios, but in this one no (from what I see) – mateeeeeee Mar 23 '23 at 21:38
  • 1
    Can you post your code somewhere? It's hard to guess like this – mateeeeeee Mar 24 '23 at 09:04
  • @mateeeeeee I'm going to try to produce a MVCE now – Tom Huntington Mar 24 '23 at 19:04
  • I realized I'm reading from persistently mapped `HEAP_TYPE_UPLOAD` when I shouldn't be... I'll fix it on Monday morning – Tom Huntington Mar 25 '23 at 00:14

1 Answer

It turns out that removing the following also stopped the crash:

  • An ID3D12GraphicsCommandList::CopyResource call whose source was a HEAP_TYPE_UPLOAD resource.
  • A function that was accidentally reading from the persistently mapped memory of this HEAP_TYPE_UPLOAD resource.

The docs say this is really bad:

Applications should avoid CPU reads from pointers to resources on UPLOAD heaps, even accidently. CPU reads will work, but are prohibitively slow on many common GPU architectures
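
One way to guard against this kind of accidental read is to never pass the raw mapped pointer around, and to hand out a write-only wrapper instead. The following is only a minimal sketch of that idea, not what my code actually does: the class name `WriteOnlyBytes` and its `write` method are made up for illustration, and the memory it wraps is assumed to be whatever pointer ID3D12Resource::Map returned for the UPLOAD-heap resource. It uses no D3D12 calls, so it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of D3D12): a write-only view over CPU-visible
// memory. In a real program `base` would be the pointer obtained from
// ID3D12Resource::Map on an UPLOAD-heap resource; here it can be any writable
// byte range, so the sketch runs standalone.
class WriteOnlyBytes {
public:
    WriteOnlyBytes(void* base, std::size_t size)
        : base_(static_cast<unsigned char*>(base)), size_(size) {}

    // Copies `count` bytes from `src` into the wrapped memory at `offset`.
    // Returns false and writes nothing if the range does not fit.
    bool write(std::size_t offset, const void* src, std::size_t count) {
        if (offset > size_ || count > size_ - offset) {
            return false;
        }
        std::memcpy(base_ + offset, src, count);
        return true;
    }

    std::size_t size() const { return size_; }

private:
    // The pointer is never exposed, so callers have no way to read through it.
    unsigned char* base_;
    std::size_t size_;
};
```

Because the class has no accessor that returns the pointer or a reference into the memory, code that tries to read from the upload buffer through it simply does not compile. It does not help if the raw pointer from Map is still stored somewhere else.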

Tom Huntington
  • 1
    If CPU reads and writes to a persistently mapped resource in an `UPLOAD` heap are absolutely necessary in an update frame, try inserting a fence signal to wait for the GPU immediately afterwards. This stopped my DXR application from crashing. – Maico De Blasio Mar 29 '23 at 02:37
  • 1
    @MaicoDeBlasio do you understand why? Seems like `UPLOAD` and `READBACK` own cpu side memory with the gpu reading and writing via PCIe bus. So some kind of gridlock occurs owing to advanced cpu bus issues – Tom Huntington Mar 29 '23 at 04:34
  • 1
    Yes I believe that's essentially it. When dealing with `UPLOAD` or `READBACK` heaps, you ultimately have to "pay the bus fare". – Maico De Blasio Mar 29 '23 at 07:42
  • 1
    Also see https://discord.com/channels/590611987420020747/590965902564917258/1090553960093462559 – Tom Huntington Mar 29 '23 at 09:05
  • 1
    just fyi, new agility sdk 1.710 features gpu upload heaps: https://devblogs.microsoft.com/directx/preview-agility-sdk-1-710-0/ – mateeeeeee Apr 01 '23 at 10:55