
I am trying to get into DirectX 12 synchronization, and I am having a hard time understanding why some programs need multiple fences for GPU/CPU synchronization instead of just one.

In this guide, they have only a single fence and a single fence value (stored per frame, so that the value can be checked for a given frame).

While in this guide they have a fence for each frame and an array of current values for each frame. (This guide doesn't explain its reasoning for having a fence for each frame.)

What is the correct way to do this? (For simplicity just assume it is a single threaded program with 3 back buffers in the swap chain, but if you have the knowledge for a multithreaded program too, that additional information would be helpful). What are the pros and cons of each way if they are both valid approaches?

Also why is it that we use a single fence event instead of one for each frame as well? Like in the below:

    if (fence->GetCompletedValue() < fenceValue)
    {
        ThrowIfFailed(fence->SetEventOnCompletion(fenceValue, fenceEvent)); 
        ::WaitForSingleObject(fenceEvent, static_cast<DWORD>(duration.count()));
    }

(I understand that the signals are appended to the end of the command lists in the command queue.) So are fence events queued up by the Windows OS? Is that why we can reuse the event, or is it because we will only ever be waiting on one event at a time, which is why we only need one?

Thanks for the help in advance. I am trying to get a strong understanding of the basics before I move on to more complex synchronization.

Edit: Found this link talking about some more synchronization: link, in case people are interested. It doesn't answer the question, though.

Edit 2: The following link is about Vulkan, but it also uses a fence per frame in flight and doesn't give a reason either.

yosmo78

1 Answer


#directxtk does it with a single fence, and a fence value for each frame.

I believe you need fence values for each frame, because you want to know when the next frame is ready, not when the previous frame is completely finished (which would let you use a single global fence value). The call to a fence's GetCompletedValue() will tell you if the fence reached the fence value necessary to consider the next frame ready to be rendered.

Since the frames rotate with the back buffers, you want as many fence values as back buffers. But you don't necessarily need multiple fences. You can make it work with a single fence for a given command allocator.

Any time you need something fenced, you get the current fence value which you'd previously stored for the current back buffer in m_fenceValues[m_backBufferIndex]. Then you increment by 1, and tell the GPU via Signal() to please set the fence to this fence value when all current commands are executed. When you want to check that they've been executed, you compare to this fence value. If it's gone to the new number, then the GPU has done what it needed. Store that new fence value back in m_fenceValues[m_backBufferIndex] and you're done.
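The signal-then-compare pattern described above can be sketched without a GPU. This is a CPU-only model, not the D3D12 API: `SimFence`, `GpuFinishOne`, and `IsReached` are hypothetical names standing in for an `ID3D12Fence`, the GPU draining the queue, and a `GetCompletedValue()` comparison. It is meant only to show why one monotonically increasing fence can answer "has work N finished?" for any N.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for a fence plus the Signal commands still queued
// behind GPU work. Illustration only; none of this is the real D3D12 API.
struct SimFence {
    std::uint64_t completed = 0;        // what GetCompletedValue() would report
    std::deque<std::uint64_t> pending;  // Signal values the GPU has not reached yet

    // Models queue->Signal(fence, value): the value is written only once the
    // GPU has executed everything submitted before this call.
    void Signal(std::uint64_t value) {
        // Fence values are expected to increase monotonically.
        assert(value > (pending.empty() ? completed : pending.back()));
        pending.push_back(value);
    }

    // Models the GPU finishing its oldest outstanding batch of work.
    void GpuFinishOne() {
        if (!pending.empty()) {
            completed = pending.front();
            pending.pop_front();
        }
    }

    // Models the check "fence->GetCompletedValue() >= value".
    bool IsReached(std::uint64_t value) const { return completed >= value; }
};
```

Because the completed value only ever grows, checking against an older stored value stays valid after later signals have been queued, which is why the per-frame bookkeeping can live in plain integers rather than in separate fence objects.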

This can be done at any time, within a frame or for moving to the next frame. When you want to move to the next frame, you do the same thing, except that you store the (fence value +1) in the fence value of the current back buffer which, by now, has become the new back buffer.

Here's the relevant code for moving to the next frame:

// Prepare to render the next frame.
void DeviceResources::MoveToNextFrame()
{
    // Schedule a Signal command in the queue.
    const UINT64 currentFenceValue = m_fenceValues[m_backBufferIndex];
    ThrowIfFailed(m_commandQueue->Signal(m_fence.Get(), currentFenceValue));

    // Update the back buffer index.
    m_backBufferIndex = m_swapChain->GetCurrentBackBufferIndex();

    // If the next frame is not ready to be rendered yet, wait until it is ready.
    if (m_fence->GetCompletedValue() < m_fenceValues[m_backBufferIndex])
    {
        ThrowIfFailed(m_fence->SetEventOnCompletion(m_fenceValues[m_backBufferIndex], m_fenceEvent.Get()));
        WaitForSingleObjectEx(m_fenceEvent.Get(), INFINITE, FALSE);
    }

    // Set the fence value for the next frame.
    m_fenceValues[m_backBufferIndex] = currentFenceValue + 1;
}
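To see how the per-frame values rotate, here is a small CPU-only trace of the same bookkeeping. It is a sketch under assumptions that the code above does not show: three back buffers, a back buffer index that advances round-robin, and slot 0 starting at fence value 1 while the others start at 0. `FramePacer` and `lastSignaled` are hypothetical names, and the function returns the value the CPU would have to wait for instead of actually waiting.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBackBufferCount = 3;  // assumed swap chain size

// Hypothetical model of the fence-value bookkeeping in MoveToNextFrame().
struct FramePacer {
    std::array<std::uint64_t, kBackBufferCount> fenceValues{};  // one value per back buffer
    std::size_t backBufferIndex = 0;
    std::uint64_t lastSignaled = 0;  // last value handed to the (modelled) Signal call

    // Assumption: the first frame's slot starts at 1, the rest at 0.
    FramePacer() { fenceValues[backBufferIndex] = 1; }

    // Returns the fence value that must be completed before the next
    // back buffer's resources can be reused (0 means no wait is needed).
    std::uint64_t MoveToNextFrame() {
        const std::uint64_t currentFenceValue = fenceValues[backBufferIndex];
        assert(currentFenceValue > 0);
        lastSignaled = currentFenceValue;  // models Signal(fence, currentFenceValue)

        // Models GetCurrentBackBufferIndex(), assuming round-robin order.
        backBufferIndex = (backBufferIndex + 1) % kBackBufferCount;

        // The value stored here was signalled the last time this slot was used.
        const std::uint64_t mustWaitFor = fenceValues[backBufferIndex];

        // Set the fence value for the next frame.
        fenceValues[backBufferIndex] = currentFenceValue + 1;
        return mustWaitFor;
    }
};
```

Calling `MoveToNextFrame()` four times on a fresh `FramePacer` yields wait targets 0, 0, 1, 2: the first two frames never block, and from then on the CPU waits only for the frame that last used the slot it is about to reuse, not for the most recently submitted frame.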

If you are multithreading with multiple command allocators, you'll want as many fences as you have command allocators.

Rikkles