So after watching the dx12 binding videos and reading through some docs, I'm not 100% sure if I understood correctly how to manage my heaps.

Let me explain what I want to achieve in my application: during initialisation, I'll be filling two heaps, one holding Samplers and the other holding SRVs, CBVs and UAVs. Those heaps will contain all the resources the application will use during its lifetime.

Now starts the interesting part. To build the Root Signatures, I'll be using for the most part Root Descriptor Tables.

As we know, a table holds ranges, a range being a base shader register, a number of descriptors and a few other settings. Let me show you an example:

Root Parameters
0 - root_table 
1 - root_table

0 root_table
CBV b1
CBV b6
SRV t0
SRV t2

1 root_table
Sampler s1
Sampler s4

As shown in the example, the registers in a table do not have to be sequential (as they would be with, for example, b0, b1, b2 and b3), but during command list recording we can only do:

ID3D12DescriptorHeap* heaps[2] = {mCbvSrvUavHeap, mSamplerHeap};
mCmdList->SetDescriptorHeaps(2, heaps);

mCmdList->SetGraphicsRootDescriptorTable(0, mCbvSrvUavHeapGpuHandleStart);
mCmdList->SetGraphicsRootDescriptorTable(1, mSamplerHeapGpuHandleStart);

That means that we will need to have the descriptors properly ordered in mCbvSrvUavHeap and mSamplerHeap.

For example:

mCbvSrvUavHeap 
CBV
CBV
SRV
SRV

This is where the problem is for me. As I said initially, I'll be creating two big heaps with all the resources for the application, but I cannot set those heaps on the command list, as they will contain other descriptors that won't be used!

How can I handle this? Do I need to make a new Heap containing only the descriptors I will be using?

Hope I explained it well!

Nacho
  • They don't need to be contiguous if you are willing to use additional rootsig slots. [Here's](https://github.com/Microsoft/DirectXTK12/blob/master/Src/Shaders/RootSig.fxh) the rootsigs I used in [DirectX Tool Kit for DirectX 12](https://github.com/Microsoft/DirectXTK12). I'm not saying they are optimal, but they get the job done. – Chuck Walbourn Jul 25 '17 at 20:21

1 Answer

You are understanding it wrong. A descriptor heap is not immutable; its contents can change all the time. When you bind a descriptor table, you can bind it at any offset within the heap. Swapping descriptor heaps is a costly operation that you want to avoid at all cost.

The idea is to prepare the descriptors in non-GPU-visible heaps (as many as you like, they are merely CPU-allocated objects) and copy them on demand into the GPU-visible one, ring-buffer style, with CopyDescriptors or CopyDescriptorsSimple.

Let's say your shader uses a table with 2 CBVs and 2 SRVs. They have to be contiguous in the heap, so you allocate an array of 4 from your heap, which gives you a heap offset, copy the needed descriptors there and bind the table with SetGraphicsRootDescriptorTable.
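The allocate, copy and bind steps above can be sketched without the D3D12 headers. This is only a model under stated assumptions: `Descriptor`, `GpuHandle`, `TableAllocator`, `BuildTable` and `OffsetHandle` are hypothetical stand-ins, and the heaps are plain vectors. In real code the copy would be `ID3D12Device::CopyDescriptorsSimple` and the increment would come from `GetDescriptorHandleIncrementSize`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins: a descriptor is modelled as an integer id, and
// GpuHandle mirrors D3D12_GPU_DESCRIPTOR_HANDLE (a struct with a 64-bit `ptr`).
using Descriptor = std::uint32_t;
struct GpuHandle { std::uint64_t ptr; };

// Linear allocator over the slots of the shader-visible heap.
class TableAllocator {
public:
    explicit TableAllocator(std::uint32_t capacity) : capacity_(capacity), next_(0) {}

    // Reserves `count` contiguous slots and returns the index of the first one.
    std::uint32_t Allocate(std::uint32_t count) {
        assert(next_ + count <= capacity_ && "shader-visible heap exhausted");
        const std::uint32_t first = next_;
        next_ += count;
        return first;
    }

private:
    std::uint32_t capacity_;
    std::uint32_t next_;
};

// Gathers descriptors scattered across the CPU-only staging heap into one
// contiguous table in the shader-visible heap. Returns the table's first slot.
inline std::uint32_t BuildTable(const std::vector<Descriptor>& stagingHeap,
                                const std::vector<std::uint32_t>& sourceSlots,
                                std::vector<Descriptor>& gpuHeap,
                                TableAllocator& allocator) {
    const std::uint32_t first =
        allocator.Allocate(static_cast<std::uint32_t>(sourceSlots.size()));
    for (std::size_t i = 0; i < sourceSlots.size(); ++i) {
        // Real code: one CopyDescriptorsSimple call per source descriptor.
        gpuHeap[first + i] = stagingHeap[sourceSlots[i]];
    }
    return first;
}

// Handle to pass to SetGraphicsRootDescriptorTable: heap start plus
// slot index times the per-type descriptor increment.
inline GpuHandle OffsetHandle(GpuHandle heapStart, std::uint32_t slot,
                              std::uint32_t incrementSize) {
    return GpuHandle{ heapStart.ptr + static_cast<std::uint64_t>(slot) * incrementSize };
}
```

The order of `sourceSlots` is what the shader sees: entry 0 lands in the first descriptor of the first range of the table, and so on, regardless of where those descriptors live in the staging heap.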

One thing you will have to be careful about is the lifetime of the descriptors in your heap, as you cannot overwrite them until the GPU is done processing the commands that use them. Lastly, if many shaders share some common tables from similar root signatures, you can save on processing by factoring out the shared updates.
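The lifetime rule can be modelled as a ring allocator that only reclaims slots once a fence value has been reached. This is a sketch under assumptions: `DescriptorRing` and its methods are hypothetical names, and the fence values stand for whatever the application signals on its command queue and later reads back with `ID3D12Fence::GetCompletedValue`.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical ring allocator over the slots of one shader-visible heap.
class DescriptorRing {
public:
    explicit DescriptorRing(std::uint32_t capacity)
        : capacity_(capacity), head_(0), inFlight_(0), frameSlots_(0) {}

    // Reserves `count` contiguous slots. Returns false when doing so would
    // overwrite descriptors the GPU may still be reading.
    bool Allocate(std::uint32_t count, std::uint32_t& firstSlot) {
        if (count > capacity_) return false;
        // A table must not wrap around the end of the heap, so the tail of
        // the heap is skipped (and counted as used) when it is too small.
        const std::uint32_t padding =
            (head_ + count > capacity_) ? capacity_ - head_ : 0;
        if (inFlight_ + padding + count > capacity_) return false;
        if (padding > 0) head_ = 0;
        firstSlot = head_;
        head_ = (head_ + count) % capacity_;
        inFlight_ += padding + count;
        frameSlots_ += padding + count;
        return true;
    }

    // Call after submitting a frame, with the fence value signalled for it.
    void EndFrame(std::uint64_t fenceValue) {
        frames_.push_back(Frame{fenceValue, frameSlots_});
        frameSlots_ = 0;
    }

    // Call with the fence's completed value to release finished frames.
    void Retire(std::uint64_t completedFence) {
        while (!frames_.empty() && frames_.front().fence <= completedFence) {
            assert(inFlight_ >= frames_.front().slots);
            inFlight_ -= frames_.front().slots;
            frames_.pop_front();
        }
    }

private:
    struct Frame { std::uint64_t fence; std::uint32_t slots; };
    std::uint32_t capacity_;
    std::uint32_t head_;       // next slot to hand out
    std::uint32_t inFlight_;   // slots not yet known to be finished on the GPU
    std::uint32_t frameSlots_; // slots handed out since the last EndFrame
    std::deque<Frame> frames_;
};
```

When `Allocate` returns false, the caller either waits on the fence for the oldest frame or grows the heap; which is preferable depends on the application.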

galop1n
  • This is certainly one strategy, but changing descriptor heaps is not necessarily the end of the world depending on what else is happening. Yes it causes a pipeline flush on some hardware, but it may not be all that bad since you have to trade it off against other vendor best practices. – Chuck Walbourn Jul 25 '17 at 20:18
  • For NVIDIA, for example (70% of the PC market): `Make sure to use just one CBV/SRV/UAV/descriptor heap as a ring-buffer for all frames if you want to aim at running parallel asynchronous compute and graphics workloads`. My understanding is that they will barrier between descriptor heaps; it would be like inserting a `WaitForIdle` between draw calls and losing all chance of parallelism. – galop1n Jul 25 '17 at 20:29
  • I had in mind the idea of having a ring buffer behind the scenes so I could do double/triple buffering without having to fence each frame :). Thanks for the explanation, it is really helpful. I think I will do the following: make a big heap (non shader visible) with the SRV_UAV_CBV descriptors and another one for Samplers, and then build a "current" descriptor heap with the descriptors for the next draw/dispatch. Once I'm done with that new heap, I should call ID3D12DescriptorHeap::Release(), right? – Nacho Jul 25 '17 at 22:57
  • Creating and destroying heaps (or any resource) is far more expensive than setting them. Any value this approach has is likely to be lost if you go about creating a new heap object every frame... – Chuck Walbourn Jul 26 '17 at 05:41
  • Ok, so how could I work around that if I'm going to have one big heap but I need to select which descriptors I'm going to use, any advice? – Nacho Jul 26 '17 at 08:52
  • @galop1n Could you check this: [GyazoPic](https://gyazo.com/0e396377031b76f132a27578ed0ea02e) that's what I'm trying to build. Is that OK? – Nacho Jul 26 '17 at 14:47
  • @Nacho Reread my post, this is exactly what I describe (except that your source CPU descriptors are in one big allocated heap). The descriptor table binding is not restricted to the start of a heap, it can be offset. So let's say your shader needs a table with 4 views that are at offsets x, y, z and w: you allocate an array of 4 in your frame heap and put x, y, z and w together. – galop1n Jul 26 '17 at 15:11
  • @galop1n Yep that's basically the same that you said. I'm not sure if I should really also have 3 versions of the "Frame Heap" or that's just me overdoing it. By the "Frame Heap" I mean the actual heap that I will provide to the command list. – Nacho Jul 26 '17 at 15:58
  • @Nacho This is a personal taste at that point, you can have a few heaps, cycling per frame (you should already have that for your allocators right ?), or a larger heap and ring buffer allocate in it. I do a hybrid version personally, a large heap with a portion reserved for static table and the remaining used as a ring buffer. As long as you do not overwrite a descriptor in use, you are good. – galop1n Jul 26 '17 at 16:42
  • @galop1n Alright, thanks for the info. I think I have to read through your comments a few more times and let all the information sink in! Today I made some progress, but I think I messed up some parts of my implementation. Next time I'll try to plan it a bit better. – Nacho Jul 26 '17 at 17:05
  • @galop1n There is still one thing that bothers me: you mentioned in your answer that changing descriptor heaps is expensive, but with this approach I'll be changing the heaps each time I "apply" a material (times the number of draw calls/dispatches with that material). Am I missing something here? – Nacho Jul 27 '17 at 14:14
  • @Nacho The costly API is `SetDescriptorHeaps`. It should not be called more than 2-3 times a frame, and certainly not in the middle of a command list to change heaps. Calls to `CopyDescriptors` and all the `Set[Something]View` functions are pretty fast and fine, and so is the dynamic logic that deals with tables. – galop1n Jul 27 '17 at 16:29
  • @galop1n Well, but I'll be generating tons of new heaps for each effect apply. How could I avoid calling SetDescriptorHeaps many times? – Nacho Jul 27 '17 at 16:39
  • @Nacho Sorry but this goes nowhere. Reread the answer and comments, you do not create new heaps, you create tables in a single heap. Good luck – galop1n Jul 27 '17 at 16:48
  • @galop1n Sorry for not understanding it that fast, I'm mixing different things and ideas and I'm a bit confused right now. Could you please explain how I can offset the tables? SetGraphicsRootDescriptorTable() takes the root index and the GPU pointer. I can offset that pointer, no problem, but I still need to generate a descriptor heap with the descriptors in the correct order. – Nacho Jul 27 '17 at 17:04
  • @Nacho, I would start a chat, but you need >20 reputation for that. Forget creating descriptor heaps all the time; you can do everything by creating 2 large ones, no more: one for the CPU, one for the GPU. `SetDescriptorHeaps` is called only at the start of a command list! When a shader uses a descriptor TABLE, all you need is `SetGraphicsRootDescriptorTable(i,start)` with `auto start = gpuHeap->GetGPUDescriptorHandleForHeapStart(); start.ptr += offset * device->GetDescriptorHandleIncrementSize(type);`. The `Bx Cx Ux` registers in a shader are not important, it is the order in the TABLE that matters. – galop1n Jul 27 '17 at 22:55
  • @galop1n Hi! I think I got it! Basically, I wasn't understanding the "two heaps" part. It's one "non shader visible" heap and one "shader visible" heap; I was thinking that those two heaps were my initial heaps. Thanks again for your help and time, much appreciated. – Nacho Jul 28 '17 at 08:50
  • @galop1n Last question! (Please don't be mad at me.) I want to have my constant buffers in descriptor tables instead of root descriptors. This implies that I have to do the versioning of the data myself. Each CB is mapped and lives in an upload heap, and I also generate a descriptor for it in my "main heap". For consecutive changes to the CB data during the same frame, I've been offsetting the memcpy so that I don't overwrite data that is in use. The point is, I only have 1 descriptor for this CB, but I'll be appending more data for each draw. How can I tell DX12 that the descriptor has to read from offset "x"? – Nacho Jul 28 '17 at 12:45
  • @Nacho As said, the `Create*View` APIs are fast. If you have a dynamic constant buffer, you call `CreateConstantBufferView` directly on the heap location where your shader expects to read it. – galop1n Jul 28 '17 at 15:56
  • @galop1n I tried it and it seems to be working. I'll keep working on it, thanks! – Nacho Jul 28 '17 at 18:40
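
The per-draw constant buffer versioning discussed in the last few comments could look roughly like the sketch below. It is a model, not the thread's actual code: `CbvDesc` stands in for `D3D12_CONSTANT_BUFFER_VIEW_DESC`, and `ConstantUploadCursor` and `gpuBase` (what `GetGPUVirtualAddress()` would return for the upload buffer) are hypothetical names. The one fixed fact is that CBV locations and sizes must be multiples of 256 bytes (`D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT`).

```cpp
#include <cassert>
#include <cstdint>

// Value of D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT.
constexpr std::uint64_t kCbvAlignment = 256;

// Rounds `value` up to the next multiple of `alignment` (a power of two).
inline std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Stand-in for D3D12_CONSTANT_BUFFER_VIEW_DESC.
struct CbvDesc {
    std::uint64_t bufferLocation; // GPU virtual address of this version
    std::uint32_t sizeInBytes;    // padded to a multiple of 256
};

// Hypothetical cursor that hands out one aligned slice of a mapped upload
// buffer per draw, so earlier versions are never overwritten within a frame.
class ConstantUploadCursor {
public:
    ConstantUploadCursor(std::uint64_t gpuBase, std::uint64_t capacity)
        : gpuBase_(gpuBase), capacity_(capacity), offset_(0) {}

    // Returns the view description for a new version of `dataSize` bytes.
    // Real code would memcpy the data at the same offset into the mapped
    // pointer, then call CreateConstantBufferView with this description on
    // the table slot reserved for the draw.
    CbvDesc Suballocate(std::uint32_t dataSize) {
        const std::uint64_t size = AlignUp(dataSize, kCbvAlignment);
        assert(offset_ + size <= capacity_ && "upload buffer exhausted");
        const CbvDesc desc{ gpuBase_ + offset_, static_cast<std::uint32_t>(size) };
        offset_ += size;
        return desc;
    }

    // Call once the GPU has finished the frame that used these slices.
    void Reset() { offset_ = 0; }

private:
    std::uint64_t gpuBase_;
    std::uint64_t capacity_;
    std::uint64_t offset_;
};
```

Each draw then gets its own descriptor in its own table slot, each pointing at a different slice of the same upload buffer, rather than one descriptor being re-pointed.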