I have the following function to create my UAV and initialize a buffer for it:
template <typename T>
static inline void CreateInitializedUnorderedAccessView(
    _In_ ID3D11Device* pDevice,
    UINT count,
    _In_ D3D11_SUBRESOURCE_DATA* pData,
    _Out_ ID3D11Buffer** ppBuffer,
    _Out_ ID3D11UnorderedAccessView** ppUav)
{
    D3D11_BUFFER_DESC bufferDesc =
    {
        sizeof(T) * count,                     // ByteWidth
        D3D11_USAGE_DEFAULT,                   // Usage
        D3D11_BIND_UNORDERED_ACCESS,           // BindFlags
        0,                                     // CPUAccessFlags
        D3D11_RESOURCE_MISC_BUFFER_STRUCTURED, // MiscFlags
        sizeof(T),                             // StructureByteStride
    };
    CHR(pDevice->CreateBuffer(
        &bufferDesc,
        pData,
        ppBuffer));
    D3D11_UNORDERED_ACCESS_VIEW_DESC uaView =
    {
        DXGI_FORMAT_UNKNOWN,        // Format
        D3D11_UAV_DIMENSION_BUFFER, // ViewDimension
    };
    uaView.Buffer.FirstElement = 0;
    uaView.Buffer.Flags = 0;
    uaView.Buffer.NumElements = count;
    CHR(pDevice->CreateUnorderedAccessView(
        *ppBuffer,
        &uaView,
        ppUav));
}
Ignore CHR; that's a macro for handling exceptions. T is usually a 128-bit type, e.g. __m128i or __n128.
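For concreteness, here is the size arithmetic the helper performs, pulled out into a standalone sketch. Element128, StructuredBufferSizes, ComputeSizes and InitialDataCovers are names made up for illustration and are not part of my actual code. The sketch also assumes, as far as I understand CreateBuffer, that the initial data behind pSysMem has to cover ByteWidth bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the 128-bit element type (simd_4, __m128i, ...).
struct Element128
{
    std::uint32_t lanes[4];
};
static_assert(sizeof(Element128) == 16, "element is expected to be 128 bits wide");

// The two size fields the helper writes into D3D11_BUFFER_DESC.
struct StructuredBufferSizes
{
    std::uint32_t byteWidth;           // sizeof(T) * count
    std::uint32_t structureByteStride; // sizeof(T)
};

template <typename T>
constexpr StructuredBufferSizes ComputeSizes(std::uint32_t count)
{
    return { static_cast<std::uint32_t>(sizeof(T)) * count,
             static_cast<std::uint32_t>(sizeof(T)) };
}

// Assumption: the memory behind pSysMem must hold at least byteWidth bytes.
constexpr bool InitialDataCovers(std::size_t initialDataBytes,
                                 StructuredBufferSizes sizes)
{
    return initialDataBytes >= sizes.byteWidth;
}
```

With a count of 1 this gives a ByteWidth of 16 and a stride of 16; with a count of 4 the ByteWidth grows to 64 while the stride stays at 16.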
When I call this function:
::CreateInitializedUnorderedAccessView<simd_4>(
    this->device.Get(),
    1, //this->dataStruct.SysMemPitch / sizeof(simd_4),
    &this->dataStruct,
    this->outputBuffer.GetAddressOf(),
    this->outputView.GetAddressOf());
My shader does exactly what I expect: it writes to the first 128 bits of the UAV's buffer, which I verify later.
However, when I call CreateInitializedUnorderedAccessView with any count larger than 1 (e.g. 2, 3, 4, 16, or, ideally, this->dataStruct.SysMemPitch / sizeof(simd_4)), the UAV's buffer isn't written to at all. Given how I wrote the shader, that should be impossible. I'm not certain of that, though, so I'll include some of the relevant HLSL assembly here:
cs_5_0
dcl_globalFlags refactoringAllowed
dcl_constantbuffer cb0[2], immediateIndexed
dcl_constantbuffer cb1[1], immediateIndexed
dcl_uav_structured u0, 16
dcl_input vThreadID.xy
dcl_temps 25
dcl_thread_group 64, 1, 1
imad r24.w, vThreadID.y, l(0x003fffc0), vThreadID.x
if_z r24.w
  mov r0.xyzw, l(0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff)
  store_structured u0.xyzw, l(0), l(0), r0.xyzw
endif
...
This should, after the shader runs, set each of the first 128 bits in the UAV to 1. Instead, they are all 0 whenever I pass a count other than 1 to my CreateInitializedUnorderedAccessView function. What could be causing this?
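To make my reading of the guard explicit, here is a CPU-side model of the imad / if_z pair at the top of the listing. It is only a sketch: GuardValue and WritesElementZero are illustrative names, and it assumes imad keeps the low 32 bits of the result, which is how unsigned arithmetic wraps in C++.

```cpp
#include <cassert>
#include <cstdint>

// Models: imad r24.w, vThreadID.y, l(0x003fffc0), vThreadID.x
// Assumption: the result wraps modulo 2^32.
constexpr std::uint32_t GuardValue(std::uint32_t threadIdX, std::uint32_t threadIdY)
{
    return threadIdY * 0x003fffc0u + threadIdX;
}

// Models: if_z r24.w
// True when the thread would execute the store_structured to element 0.
constexpr bool WritesElementZero(std::uint32_t threadIdX, std::uint32_t threadIdY)
{
    return GuardValue(threadIdX, threadIdY) == 0u;
}

static_assert(WritesElementZero(0, 0), "thread (0, 0) takes the branch");
static_assert(!WritesElementZero(1, 0), "thread (1, 0) skips the branch");
static_assert(!WritesElementZero(0, 1), "thread (0, 1) skips the branch");
```

If this model is right, thread (0, 0) takes the branch no matter how many elements the buffer holds, which is why I expected the write to element 0 to happen for every count.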