i'm rendering single-pixel points into a uint32 texture with a compute shader. the texture is a 3d texture: x and y are viewport coordinates, and z holds the depth information at coordinate 0 and additional attributes at coordinate 1. so two manually built rendertargets, if you will. the code looks like this:
// layer 0 of renderBuffer holds the depth, layer 1 holds the attributes
layout (r32ui, binding = 0) coherent volatile uniform uimage3D renderBuffer;
layout (rgba32f, binding = 1) restrict readonly uniform imageBuffer pointBuffer;

for (int j = 0; j < numPoints / gl_WorkGroupSize.x + 1; j++)
{
    vec4 point = imageLoad(pointBuffer, ...);
    // ... transform point ...

    // depth test: atomic min on layer 0
    // (point.depth / point.attributes are shorthand for the uint values actually passed)
    uint originalDepth = imageAtomicMin(renderBuffer, ivec3(imageCoords, 0), point.depth);
    if (originalDepth >= point.depth)
    {
        // write happened, store the attributes on layer 1
        imageStore(renderBuffer, ivec3(imageCoords, 1), point.attributes);
    }
}
while the depth values are correct, i have a few pixels where the attributes flicker between two values.
the order of points in the pointBuffer is random (but i've verified that the set of all points is always the same), so my first thought was that two equal depth values might change the output depending on which one comes first. to rule that out, i changed it so that, if originalDepth == point.depth, the attribute write uses imageAtomicMax instead, so that the same one of the two alternative attribute values always wins, but that changed nothing.
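in sketch form, that version looked roughly like this (same shorthand as above, not my exact code):

uint originalDepth = imageAtomicMin(renderBuffer, ivec3(imageCoords, 0), point.depth);
if (originalDepth > point.depth)
{
    // strictly smaller depth: this point won the depth test, store its attributes
    imageStore(renderBuffer, ivec3(imageCoords, 1), point.attributes);
}
else if (originalDepth == point.depth)
{
    // equal depth: break the tie deterministically, the larger attribute value wins
    imageAtomicMax(renderBuffer, ivec3(imageCoords, 1), point.attributes);
}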
i scattered barrier() and memoryBarrier() all over the place, but that changed nothing. i also removed all diverging control flow for this; that changed nothing either.
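e.g. one of the placements i tried, again just as a sketch:

for (int j = 0; j < numPoints / gl_WorkGroupSize.x + 1; j++)
{
    // ... load, transform, atomic min and conditional attribute store as above ...
    memoryBarrier();  // make the image writes of this iteration visible
    barrier();        // sync the work group before the next iteration
}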
reducing the local work size to 32 removes 90% of the flickering, but some still remains.
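that is, the work group declaration becomes something like this (assuming a 1d work group here; the original size isn't shown above):

layout (local_size_x = 32, local_size_y = 1, local_size_z = 1) in;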
any ideas would be greatly appreciated.
edit: before you ask why i do this manually instead of using normal rasterization and fragment shaders: the reason is performance. the rasterizer doesn't help since i'm rendering single-pixel points, shared memory sped things up considerably, and i render each point multiple times, which previously required a geometry shader, and that was slow.