
I was wondering: if an attachment is used as both an input attachment and a color/depth-stencil attachment, is a draw call allowed to read from the input attachment and then write to the same color/depth-stencil attachment? If the next draw call does the same thing, the spec says I need a vkCmdPipelineBarrier for it to fetch correct results, but I am not sure about the case within a single draw call.

Another question: can I use an input attachment in the first subpass? For example, can I bind the depth texture generated by a pre-z pass as both the depth attachment and an input attachment?

painkiller

2 Answers


It is possible to perform a read/modify/write (RMW) for the same image through color/input attachments in a shader, so long as:

  1. You ensure that exactly one fragment shader will perform the RMW for a particular output value in the color attachment. This basically boils down to "no overdraw".

  2. If you need to have overdraw (i.e., multiple fragment shader invocations doing repeated RMW operations on the same input/output location), then between each set of overdrawing operations within a subpass, you must have a pipeline barrier. So you have to break up your rendering commands into small chunks. Note that for the barrier to work, you have to have a subpass self-dependency as part of this subpass's dependency graph, and the barrier needs to invoke it. Also, your self-dependency ought to be per-region (VK_DEPENDENCY_BY_REGION_BIT), since you only care about the dependency between individual locations on the screen. You can't random-access input attachments, after all.

You can use any attachment as an input attachment on any subpass, so long as it makes sense to do so. If your loadOp said that you don't want to load data, then obviously it doesn't make sense to read from an image that has undefined values.

Nicol Bolas
  • For a tile-based GPU the requirement is stronger than "exactly one fragment will perform RMW per output value". Because you have no guarantees about the order in which tiles are processed, you also have a requirement that the read input coordinate == written output coordinate. Any RMW that crosses outside of the current tile boundary is going to get undefined results. – solidpixel Dec 18 '19 at 18:24
  • @solidpixel: That's the requirement in OpenGL. Vulkan does not have that requirement because, *by definition* it is impossible to read from an input attachment from some other fragment. You simply cannot do it. That's (part of) why input attachments are a different type of thing from just a uniform. – Nicol Bolas Dec 18 '19 at 18:25
  • @solidpixel: Specifically: "The (u,v) coordinates used for a SubpassData must be the <id> of a constant vector (0,0)" [from Vulkan](https://www.khronos.org/registry/vulkan/specs/1.1-khr-extensions/html/chap36.html#spirvenv) and "When the Image Dim operand is SubpassData, Coordinate is relative to the current fragment location," from [SPIR-V in OpImageRead](https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpImageRead). Therefore, in Vulkan, it is not possible to read from an input attachment *except* from the current fragment's position. – Nicol Bolas Dec 18 '19 at 18:33
  • Good catch - I was most definitely thinking of OpenGL ES and reading a framebuffer attachment also as a texture here, sorry. – solidpixel Dec 19 '19 at 15:55

Using an attachment as both input attachment and color or depth/stencil attachment is known as a feedback loop, and essentially you get undefined results if you both read and write to the same parts of it without a pipeline barrier in between. Since you can't have a pipeline barrier within a draw call, you're out of luck.

You can use feedback loops in a well-defined way if all accesses are reads (e.g. depth test enabled but depth writes disabled) or for color attachments if the reads and writes access disjoint components (using color write mask).

For your second question, yes, an input attachment doesn't have to have been written earlier in the same renderpass. Though in your example, it might be best to do the z pre-pass in a first subpass and then use it as input attachment and read-only depth test in the second subpass. On tiled architectures, this might save bandwidth since the depth buffer would never have to be written to memory.

Jesse Hall
  • Thanks so much for the answer! For the single draw call case, what if there is no overdraw for any fragment, as with a full-screen quad? Each fragment then reads and writes the attachment only once, which feels safe to do. – painkiller Dec 18 '19 at 06:24
  • Yes, that works -- you don't have read-after-write on the same data location in that case. – Jesse Hall Dec 18 '19 at 15:42
  • On the topic of tile-based GPUs, I'd also note that most tile-based GPUs have some form of hidden surface removal. If the use of a pre-pass is just for reducing overdraw cost, I'd go as far as saying "don't use a prepass", or at least benchmark it to make sure it actually helps. For most content I've seen on mobile the cost of the additional vertices needed for a prepass outweighs the saving of the reduced overdraw. – solidpixel Dec 18 '19 at 18:22
  • @JesseHall: "*Yes, that works*" Then maybe you should put that in your answer. As it stands, you're saying "no you can't do that", when the answer really is "you can, within certain restrictions." And the restrictions you cite aren't them. – Nicol Bolas Dec 19 '19 at 15:57