Do I need to transfer ownership back to the transfer queue on next transfer?

Question

I'm planning on using one of the Vulkan synchronization examples as a reference for how to handle infrequently updated uniform buffers. Specifically, I was looking at this one:

vkBeginCommandBuffer(...);

// Submission guarantees the host write being complete, as per
// https://www.khronos.org/registry/vulkan/specs/1.0/html/vkspec.html#synchronization-submission-host-writes
// So no need for a barrier before the transfer

// Copy the staging buffer contents to the vertex buffer
VkBufferCopy vertexCopyRegion = {
    .srcOffset = stagingMemoryOffset,
    .dstOffset = vertexMemoryOffset,
    .size      = vertexDataSize};

vkCmdCopyBuffer(
    commandBuffer,
    stagingBuffer,
    vertexBuffer,
    1,
    &vertexCopyRegion);


// If the graphics queue and transfer queue are the same queue
if (isUnifiedGraphicsAndTransferQueue)
{
    // If there is a semaphore signal + wait between this being submitted and
    // the vertex buffer being used, then skip this pipeline barrier.

    // Pipeline barrier before using the vertex data
    // Note that this can apply to all buffers uploaded in the same way, so
    // ideally batch all copies before this.
    VkMemoryBarrier memoryBarrier = {
        ...
        .srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,                           
        .dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT};

    vkCmdPipelineBarrier(
        ...
        VK_PIPELINE_STAGE_TRANSFER_BIT,       // srcStageMask
        VK_PIPELINE_STAGE_VERTEX_INPUT_BIT,   // dstStageMask
        ...
        1,                                    // memoryBarrierCount
        &memoryBarrier,                       // pMemoryBarriers
        ...);


    vkEndCommandBuffer(...);

    vkQueueSubmit(unifiedQueue, ...);
}
else
{
    // Pipeline barrier to start a queue ownership transfer after the copy
    VkBufferMemoryBarrier bufferMemoryBarrier = {
        ...
        .srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT,                           
        .dstAccessMask = 0,
        .srcQueueFamilyIndex = transferQueueFamilyIndex,
        .dstQueueFamilyIndex = graphicsQueueFamilyIndex,
        .buffer = vertexBuffer,
        ...};

    vkCmdPipelineBarrier(
        ...
        VK_PIPELINE_STAGE_TRANSFER_BIT,       // srcStageMask
        VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, // dstStageMask
        ...
        1,                                    // bufferMemoryBarrierCount
        &bufferMemoryBarrier,                 // pBufferMemoryBarriers
        ...);


    vkEndCommandBuffer(...);

    // Ensure a semaphore is signalled here which will be waited on by the graphics queue.
    vkQueueSubmit(transferQueue, ...);

    // Record a command buffer for the graphics queue.
    vkBeginCommandBuffer(...);

    // Pipeline barrier before using the vertex buffer, after finalising the ownership transfer
    // (named differently from the release barrier above, which is still in scope)
    VkBufferMemoryBarrier acquireBufferMemoryBarrier = {
        ...
        .srcAccessMask = 0,
        .dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT,
        .srcQueueFamilyIndex = transferQueueFamilyIndex,
        .dstQueueFamilyIndex = graphicsQueueFamilyIndex,
        .buffer = vertexBuffer,
        ...};

    vkCmdPipelineBarrier(
        ...
        VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,    // srcStageMask
        VK_PIPELINE_STAGE_VERTEX_INPUT_BIT,   // dstStageMask
        ...
        1,                                    // bufferMemoryBarrierCount
        &acquireBufferMemoryBarrier,          // pBufferMemoryBarriers
        ...);


    vkEndCommandBuffer(...);

    vkQueueSubmit(graphicsQueue, ...);
}

In this example, I simplify it to mean:

map updated buffer which is host coherent
perform transfer in transfer queue to device local memory
    make sure to put a buffer memory barrier to handle the queue ownership transfer
perform normal draw commands
    make sure to put a buffer memory barrier to handle receiving of buffer in queue ownership

Must I then give ownership back to the transfer queue so that it can copy into that buffer again? None of the examples seem to mention it, but I could have missed it. I can't really see how adding another buffer barrier to the same draw command buffer would work, since it would stall on the next submission even if I had nothing to transfer. Would I instead need to submit another command buffer, just for the queue ownership transfer, before I submit my next transfer operation?

i.e.

//begin with transfer ownership
submit(copy)
submit(ownership to graphics)
submit(draw)
submit(ownership to transfer)
submit(copy)
submit(ownership to graphics)
submit(draw)
submit(draw)
submit(draw)
submit(ownership to transfer)
submit(copy)
submit(ownership to graphics)
submit(draw)

If that is the case, I'm unsure how to handle the semaphore signaling between draw and transfer, and between copy and draw. At the beginning it is easy, but then it gets strange because of multiple in-flight frames, as there would be no dependency between draw submits. Basically, I guess I would need to make whatever draw I submitted most recently signal a semaphore for the transfer of ownership, which would signal the copy, which would signal the transfer of ownership to graphics. If the copy was submitted on a separate thread, I would then check whether it had been submitted, require a wait on the ownership transfer to graphics, and reset the copy-submitted check. But I'm not sure what happens to the next frame, which doesn't have this dependency: could it finish before what would chronologically be the previous frame?

krOoze · Accepted Answer · 2020-02-20T01:26:57.310

5

You can use a Resource on any queue family (without a transfer) as long as you do not mind that the data becomes undefined. You still need a Semaphore to make sure there is no memory hazard.

Ye olde spec:

NOTE

If an application does not need the contents of a resource to remain valid when transferring from one queue family to another, then the ownership transfer should be skipped.

Examples do not mention it, because they are only examples.

As for synchronization (which is a separate problem from QFOT), a semaphore signal which is part of vkQueueSubmit covers everything previously in submission order. So when you submit the copy, you would let it wait on the Semaphore that the last submitted draw has signaled. That means that draw and any previous draw on that queue are finished before the copy can start on the other queue.

Then you signal a semaphore from the copy, and wait on it in the next draw you submit. That means the copy has finished writing before the draw (and any subsequent draw) reads it on the graphics queue.

e.g.:

submit(copy, release ownership from transfer, semaphore signal)
submit(semaphore wait, acquire ownership to graphics, draw)
submit(draw)
submit(draw)
submit(draw)
submit(draw)
submit(draw, semaphore signal)
submit(semaphore wait, copy, release ownership from transfer, semaphore signal)
submit(semaphore wait, acquire ownership to graphics, draw)
submit(draw)
submit(draw)
etc
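
A rough way to see the bookkeeping behind this schedule is to model it in plain C, with no Vulkan calls at all. The sketch below assumes the application knows, when it submits a draw, whether an upload will follow it. `PlannedSubmit`, `plan_submits` and `upload_before` are made-up names for illustration, not anything from the Vulkan API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping types; none of this is Vulkan API. */
typedef enum { QUEUE_TRANSFER, QUEUE_GRAPHICS } QueueKind;

typedef struct {
    QueueKind queue;
    bool wait_semaphore;    /* wait on the semaphore signaled by the other queue */
    bool signal_semaphore;  /* signal a semaphore for the other queue */
    bool release_barrier;   /* copy followed by the "release" barrier */
    bool acquire_barrier;   /* matching "acquire" barrier before the draw */
} PlannedSubmit;

/* upload_before[i] is true when frame i must see new buffer contents.
 * Writes at most 2 * frame_count entries to out and returns how many. */
size_t plan_submits(const bool *upload_before, size_t frame_count,
                    PlannedSubmit *out)
{
    size_t n = 0;
    for (size_t i = 0; i < frame_count; ++i) {
        if (upload_before[i]) {
            PlannedSubmit copy = {0};
            copy.queue = QUEUE_TRANSFER;
            /* The very first copy has no earlier draw to wait for. */
            copy.wait_semaphore = (i > 0);
            copy.signal_semaphore = true;
            copy.release_barrier = true;
            out[n++] = copy;
        }

        PlannedSubmit draw = {0};
        draw.queue = QUEUE_GRAPHICS;
        /* Only the first draw after a copy waits and acquires. */
        draw.wait_semaphore = upload_before[i];
        draw.acquire_barrier = upload_before[i];
        /* Only the last draw before a copy signals; the signal covers
         * every earlier draw in submission order on that queue. */
        draw.signal_semaphore = (i + 1 < frame_count) && upload_before[i + 1];
        out[n++] = draw;
    }
    return n;
}
```

For `upload_before = {true, false, false, true, false}` this plans seven submits in the same pattern as the list above, and the draw just before the second copy is the only draw that signals.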

Though note the above approach practically serializes the two kinds of accesses, so it might be suboptimal. Employing double-buffering (or generally N-buffering) can be better. If you have more buffers, you could start copying into one without worrying that it is already used by something else. That means a copy can happen in parallel with a draw, which would be great.
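
As a minimal sketch of the N-buffering idea, again in plain C with no Vulkan calls: keep N copies of the buffer, remember which draw submission last read each copy, and only write into a copy that no unfinished draw reads. The names (`BufferRing`, `ring_find_writable`, and so on) and the use of a monotonically increasing draw id are assumptions for illustration; in a real application the "completed" value might come from a fence or a timeline semaphore. The copy still has to be ordered before the draw that reads it (semaphore plus the release/acquire barriers); this only removes the wait in the other direction.

```c
#include <stddef.h>
#include <stdint.h>

#define BUFFER_COUNT 2  /* assumption: double-buffering */

/* Hypothetical bookkeeping; none of this is Vulkan API. */
typedef struct {
    /* Id of the last draw submission that reads each copy (0 = never read). */
    uint64_t last_read_by[BUFFER_COUNT];
    /* Copy that newly submitted draws read from. */
    size_t current;
} BufferRing;

/* Record that draw submission draw_id reads the current copy. */
void ring_note_draw(BufferRing *ring, uint64_t draw_id)
{
    ring->last_read_by[ring->current] = draw_id;
}

/* Return a copy other than the current one that no unfinished draw reads,
 * or -1 if every other copy is still in use. completed_draw is the highest
 * draw id the GPU is known to have finished. */
int ring_find_writable(const BufferRing *ring, uint64_t completed_draw)
{
    for (size_t step = 1; step < BUFFER_COUNT; ++step) {
        size_t slot = (ring->current + step) % BUFFER_COUNT;
        if (ring->last_read_by[slot] <= completed_draw) {
            return (int)slot;
        }
    }
    return -1;
}

/* After the copy into slot has been submitted, make later draws read it. */
void ring_publish(BufferRing *ring, size_t slot)
{
    ring->current = slot;
}
```

When `ring_find_writable` returns -1, the application can either wait for more draws to finish or fall back to the serialized scheme above.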

edited Feb 20 '20 at 01:26

answered Feb 19 '20 at 23:03


Ahhh, so because I don't need the resources to remain valid between the draw and the copy, I can skip transferring ownership to the copy in the transfer queue. Does ownership still need to be transferred afterwards, though? In my scenario, doesn't the data still need to stay defined for the draw (imagine it was some data, like a color parameter for an object, only set periodically from the user side, not frequently updated)? – Krupip Feb 19 '20 at 23:53
Yuppity yup. You need to perform a QFOT from the copy to the draw that uses the result of that copy (i.e. do "release" Barrier on the transfer qf, then Semaphore, then matching "acquire" Barrier on the graphics qf). After the draw you can (and should) only "steal" the Buffer back to the transfer qf. That is done by only separating the two accesses by a Semaphore. (The data then has undefined contents, but you do not care if you are overwriting them anyway. The Semaphore still has to be there though, so the draw finishes reading before the copy tries to write the Buffer.) – krOoze Feb 20 '20 at 00:35
How does that work if there are multiple draws in flight? The transfer copy could complete while another already submitted draw is still running, despite a semaphore between that copy and the very next draw command. Wouldn't that cause issues? – Krupip Feb 20 '20 at 00:47
@whn Things should be well-ordered by your app with Semaphores in a way that a memory hazard does not happen. If you copy stuff to the Buffer, and at the same time some draw can read it, then that is mis-synchronized. Either you need to serialize these accesses with sync, or you need to employ some N-buffering. And the only way to sync two different Queues is with a Semaphore (or more brutally with a Fence). – krOoze Feb 20 '20 at 01:05
Okay, so basically if I wanted to change it while they were running I'd have to use some sort of double buffering scheme, otherwise, I insert it into the semaphore chain (chain linked to swap chain images which are linked via semaphores to my actual rendering) right (and obviously do the rest of the stuff you said, don't transfer to copy, insert a memory barrier to transfer from the copy on both ends, synchronize semaphore before and after)? – Krupip Feb 20 '20 at 02:29
@krOoze Can you perform this stealing method (or something more relaxed than a full QFOT) when an image layout transition is also required? – plasmacel Nov 29 '20 at 02:58
@plasmacel Yes. The "stealing" is simply omitting the two QFOT barriers, which results in the data being undefined on the new queue. Then on the new queue you would simply start using the resource; i.e. in your case you would do a regular barrier to change the layout from `VK_IMAGE_LAYOUT_UNDEFINED` to whatever you want. – krOoze Nov 29 '20 at 08:08
@krOoze Could you take a look at https://stackoverflow.com/q/69439966/2430597? Somewhat related to this. – plasmacel Oct 04 '21 at 19:00
