I'm trying to transform a texture to the frequency domain with a compute shader in Unity (CG/HLSL), i.e. I want to read pixel values from a texture and output an array of basis function coefficients. How would I go about that? I am really new to compute shaders, so I'm a bit lost. I understand the reason for the race condition and how compute shaders divide the workload, but is there any way to deal with it? In general, the documentation on buffers and related topics is hard to follow for someone without a background in the matter.
The error I am getting:
Shader error in 'Compute.compute': race condition writing to shared resource detected, consider making this write conditional. at kernel testBuffer at Compute.compute(xxx) (on d3d11)
A simplified example would be to sum all the pixel values; my current approach is as follows. I am using structured buffers because I don't know how else I could retrieve the data, or keep it on the GPU for global shader access afterwards.
struct valueStruct {
    float4 values[someSize];
};
RWStructuredBuffer<valueStruct> valueBuffer;
// same behaviour if using RWStructuredBuffer<float3> valueBuffer;
// if using 'StructuredBuffer<float3> valueBuffer;' i get the error:
// Shader error in 'Compute.compute': l-value specifies const object at kernel testBuffer at Compute.compute(xxx) (on d3d11)
Texture2D<float4> Source;
[numthreads(8, 8, 1)]
void testBuffer(uint3 id : SV_DispatchThreadID) {
    valueBuffer[0].values[0] += Source[id.xy]; // in theory the values
    valueBuffer[0].values[1] += Source[id.xy]; // would be different
    valueBuffer[0].values[2] += Source[id.xy]; // but it doesn't really
    valueBuffer[0].values[3] += Source[id.xy]; // matter for this, so
    valueBuffer[0].values[4] += Source[id.xy]; // they are just Source[id.xy]
    //.....
}
The race condition error is not thrown if I unfold the buffer into single values like
float3 value0;
float3 value1;
float3 value2;
float3 value3;
float3 value4;
float3 value5;
float3 value6;
float3 value7;
float3 value8;
[numthreads(8, 8, 1)]
void testBuffer(uint3 id : SV_DispatchThreadID) {
    value0 += Source[id.xy]; // in theory the values
    value1 += Source[id.xy]; // would be different
    value2 += Source[id.xy]; // but it doesn't really
    value3 += Source[id.xy]; // matter for this, so
    value4 += Source[id.xy]; // they are just Source[id.xy]
}
and don't use a StructuredBuffer, but in that case I don't know how to retrieve the data after the kernel dispatch. Is this down to the read part of the RWStructuredBuffer? If so, what would be an equivalent buffer that I can only write to, since I don't really read the data? Or does the "+=" operator cause a race condition no matter what?
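To make the "+=" part of the question concrete: a plain "+=" on a shared element is a separate read, add and write, so threads can overwrite each other's results. One commonly suggested alternative is an atomic add. The sketch below is only an illustration under assumptions: the names sumAtomic, fixedSums and FIXED_SCALE are made up for this example, the channel values are assumed to lie in [0, 1], and because InterlockedAdd in standard HLSL only accepts int/uint, the floats are converted to fixed point first.

```hlsl
// Sketch only: accumulate into one shared element with atomics.
// Assumed names (not from the code above): sumAtomic, fixedSums, FIXED_SCALE.
#pragma kernel sumAtomic

Texture2D<float4> Source;
RWStructuredBuffer<uint> fixedSums; // 4 elements (r, g, b, a), cleared to 0 from C#

#define FIXED_SCALE 256.0 // fixed-point precision, chosen arbitrarily

[numthreads(8, 8, 1)]
void sumAtomic(uint3 id : SV_DispatchThreadID)
{
    // Convert to fixed point; assumes channel values in [0, 1].
    uint4 v = (uint4)(saturate(Source[id.xy]) * FIXED_SCALE);

    // Each add is an indivisible read-modify-write, unlike a plain "+=".
    InterlockedAdd(fixedSums[0], v.r);
    InterlockedAdd(fixedSums[1], v.g);
    InterlockedAdd(fixedSums[2], v.b);
    InterlockedAdd(fixedSums[3], v.a);
}
```

Dividing the results by FIXED_SCALE on the CPU would recover approximate float sums. Note that a 32-bit uint can overflow for large textures at this scale, so this is a starting point rather than a finished solution.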
From Google I found that a solution might be to use GroupMemoryBarrierWithGroupSync(), but I have no idea what it is (not to mention how it works), and in general the search results are going a little over my head at the moment.
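For reference, GroupMemoryBarrierWithGroupSync() usually appears in a per-group reduction: every thread writes to its own slot in groupshared memory, the barrier makes all threads of the group wait for each other, and a single thread per group then writes the result. A minimal sketch, assuming an 8x8 group; the names sumPerGroup, partialSums, groupsPerRow and scratch are hypothetical and would have to be set up from the C# side.

```hlsl
// Sketch only: sums Source per thread group, one partial sum per group.
// Assumed names (not from the code above): sumPerGroup, partialSums,
// groupsPerRow, scratch.
#pragma kernel sumPerGroup

Texture2D<float4> Source;
RWStructuredBuffer<float4> partialSums; // one element per thread group
uint groupsPerRow;                      // set from C# to the x dispatch count

groupshared float4 scratch[64];         // 8 * 8 threads per group

[numthreads(8, 8, 1)]
void sumPerGroup(uint3 id : SV_DispatchThreadID,
                 uint3 groupId : SV_GroupID,
                 uint localIndex : SV_GroupIndex)
{
    // Each thread writes only to its own slot, so nothing is contested here.
    scratch[localIndex] = Source[id.xy];
    GroupMemoryBarrierWithGroupSync(); // wait until all 64 slots are filled

    // Tree reduction: halve the number of active threads each step.
    for (uint stride = 32; stride > 0; stride >>= 1)
    {
        if (localIndex < stride)
            scratch[localIndex] += scratch[localIndex + stride];
        GroupMemoryBarrierWithGroupSync();
    }

    // Only one thread per group writes, and each group has its own element.
    if (localIndex == 0)
        partialSums[groupId.y * groupsPerRow + groupId.x] = scratch[0];
}
```

The partial sums would still need to be added up, either in a second dispatch or on the CPU after reading the buffer back with ComputeBuffer.GetData.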
Could anyone provide an example of how to solve this issue? Otherwise, I appreciate any pointers.