
I am trying to write the simplest possible compute shader in DirectX12 so that I can have a starting point for a real project. However, it seems like no matter what I do I am unable to get my GPU to process "1+1" and see the output. As there is almost no documentation on compute shaders, I figured my only option now is to query StackOverflow.

I wrote the following code starting from the D3D12nBodyGravity sample project. First I copied over as much of the code verbatim as possible and fixed "small" things; then, once it all compiled and ran, I started trimming it down to the basics. I am using Visual Studio 2019.

myClass.cpp:

#include "pch.h"
#include "myClass.h"
#include <d3dcompiler.h> // D3DReadFileToBlob
#include "Common\DirectXHelper.h" // NAME_D3D12_OBJECT

#include "Common\Device.h"
#include <iostream>

// InterlockedCompareExchange returns the object's value if the 
// comparison fails.  If it is already 0, then its value won't 
// change and 0 will be returned.
#define InterlockedGetValue(object) InterlockedCompareExchange(object, 0, 0)

myClass::myClass() 
: m_frameIndex(0)
, m_UavDescriptorSize(0)
, m_renderContextFenceValue(0)
, m_frameFenceValues{} {
    std::cout << "Initializing myClass" << std::endl;
    
    m_FenceValue = 0;
    //std::cout << "Calling DXGIDeclareAdapterRemovalSupport()" << std::endl;
    //DX::ThrowIfFailed(DXGIDeclareAdapterRemovalSupport());



    // Identify the device
    std::cout << "Identifying the device" << std::endl;
    auto m_device = Device::Get().GetDevice();

    std::cout << "Loading the rendering pipeline dependencies" << std::endl;
    // Load the rendering pipeline dependencies.
    {
        std::cout << "   Creating the root signatures" << std::endl;
        // Create the root signatures.
        {
            CD3DX12_ROOT_PARAMETER rootParameter;
            rootParameter.InitAsUnorderedAccessView(0);

            Microsoft::WRL::ComPtr<ID3DBlob> signature;
            Microsoft::WRL::ComPtr<ID3DBlob> error;
            
            CD3DX12_ROOT_SIGNATURE_DESC computeRootSignatureDesc(1, &rootParameter, 0, nullptr);
            DX::ThrowIfFailed(D3D12SerializeRootSignature(&computeRootSignatureDesc, D3D_ROOT_SIGNATURE_VERSION_1, &signature, &error));
            
            DX::ThrowIfFailed(m_device->CreateRootSignature(0, signature->GetBufferPointer(), signature->GetBufferSize(), IID_PPV_ARGS(&m_computeRootSignature)));
        }


        // Describe and create the command queue.
        std::cout << "   Describing and creating the command queue" << std::endl;
        D3D12_COMMAND_QUEUE_DESC queueDesc = {};
        queueDesc.Flags = D3D12_COMMAND_QUEUE_FLAG_NONE;
        queueDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;

        DX::ThrowIfFailed(m_device->CreateCommandQueue(&queueDesc, IID_PPV_ARGS(&m_commandQueue)));
        NAME_D3D12_OBJECT(m_commandQueue);

        std::cout << "   Creating descriptor heaps" << std::endl;
        // Create descriptor heaps.
        {
            // Describe and create a shader resource view (SRV) and unordered
            // access view (UAV) descriptor heap.
            D3D12_DESCRIPTOR_HEAP_DESC UavHeapDesc = {};
            UavHeapDesc.NumDescriptors = DescriptorCount;
            UavHeapDesc.Type = D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV;
            UavHeapDesc.Flags = D3D12_DESCRIPTOR_HEAP_FLAG_SHADER_VISIBLE;
            DX::ThrowIfFailed(m_device->CreateDescriptorHeap(&UavHeapDesc, IID_PPV_ARGS(&m_UavHeap)));
            NAME_D3D12_OBJECT(m_UavHeap);

            m_UavDescriptorSize = m_device->GetDescriptorHandleIncrementSize(D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV);
        }

        std::cout << "   Creating a command allocator for each frame" << std::endl;
        // Create a command allocator for each frame.
        for (UINT n = 0; n < FrameCount; n++) {
            DX::ThrowIfFailed(m_device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT, IID_PPV_ARGS(&m_commandAllocators[n])));
        }
    } // Load the rendering pipeline dependencies.


    std::cout << "Loading the sample assets" << std::endl;
    // Load the sample assets.
    {
        std::cout << "   Creating the pipeline states, including compiling and loading shaders" << std::endl;
        // Create the pipeline states, which includes compiling and loading shaders.
        {
            Microsoft::WRL::ComPtr<ID3DBlob> computeShader;

#if defined(_DEBUG)
            // Enable better shader debugging with the graphics debugging tools.
            UINT compileFlags = D3DCOMPILE_DEBUG | D3DCOMPILE_SKIP_OPTIMIZATION;
#else
            UINT compileFlags = 0;
#endif

            // Load and compile the compute shader.
            DX::ThrowIfFailed(D3DReadFileToBlob(L"ComputeShader.cso", &computeShader));
            auto convert_blob_to_byte = [](Microsoft::WRL::ComPtr<ID3DBlob> blob) {
                auto* p = reinterpret_cast<unsigned char*>(blob->GetBufferPointer());
                auto n = blob->GetBufferSize();
                std::vector<unsigned char> buff;
                buff.reserve(n);
                std::copy(p, p + n, std::back_inserter(buff));
                return buff;
            };
            std::vector<BYTE> m_computeShader = convert_blob_to_byte(computeShader);

            // Describe and create the compute pipeline state object (PSO).
            D3D12_COMPUTE_PIPELINE_STATE_DESC computePsoDesc = {};
            computePsoDesc.pRootSignature = m_computeRootSignature.Get();
            computePsoDesc.CS = CD3DX12_SHADER_BYTECODE(computeShader.Get());

            DX::ThrowIfFailed(m_device->CreateComputePipelineState(&computePsoDesc, IID_PPV_ARGS(&m_computeState)));
            NAME_D3D12_OBJECT(m_computeState);
        }

        std::cout << "   Creating the command list" << std::endl;
        // Create the command list.
        DX::ThrowIfFailed(m_device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, m_commandAllocators[m_frameIndex].Get(), m_computeState.Get(), IID_PPV_ARGS(&m_commandList)));
        NAME_D3D12_OBJECT(m_commandList);

        std::cout << "   Initializing the data in the buffers" << std::endl;
        // Initialize the data in the buffers.
        {
            data.resize(2);
            for (unsigned int i = 0; i < data.size(); i++) {
                data[i] = 0.0f;
            }
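            // data.size() returns size_t; cast explicitly to avoid an implicit narrowing conversion.
            const UINT dataSize = static_cast<UINT>(data.size() * sizeof(data[0]));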

            D3D12_HEAP_PROPERTIES defaultHeapProperties = CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_DEFAULT);
            D3D12_HEAP_PROPERTIES uploadHeapProperties = CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_UPLOAD);
            D3D12_HEAP_PROPERTIES readbackHeapProperties = CD3DX12_HEAP_PROPERTIES(D3D12_HEAP_TYPE_READBACK);
            D3D12_RESOURCE_DESC bufferDesc = CD3DX12_RESOURCE_DESC::Buffer(dataSize, D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS);
            D3D12_RESOURCE_DESC uploadBufferDesc = CD3DX12_RESOURCE_DESC::Buffer(dataSize);
            readbackBufferDesc = CD3DX12_RESOURCE_DESC::Buffer(dataSize);


            DX::ThrowIfFailed(m_device->CreateCommittedResource(
                &defaultHeapProperties,
                D3D12_HEAP_FLAG_NONE,
                &bufferDesc,
                D3D12_RESOURCE_STATE_COPY_DEST,
                nullptr,
                IID_PPV_ARGS(&m_dataBuffer)));
            m_dataBuffer.Get()->SetName(L"m_dataBuffer");
                
            DX::ThrowIfFailed(m_device->CreateCommittedResource(
                &uploadHeapProperties,
                D3D12_HEAP_FLAG_NONE,
                &uploadBufferDesc,
                D3D12_RESOURCE_STATE_GENERIC_READ,
                nullptr,
                IID_PPV_ARGS(&m_dataBufferUpload)));
            m_dataBufferUpload.Get()->SetName(L"m_dataBufferUpload");
                
            DX::ThrowIfFailed(m_device->CreateCommittedResource(
                &readbackHeapProperties,
                D3D12_HEAP_FLAG_NONE,
                &readbackBufferDesc,
                D3D12_RESOURCE_STATE_COPY_DEST,
                nullptr,
                IID_PPV_ARGS(&m_dataBufferReadback)));
            m_dataBufferReadback.Get()->SetName(L"m_dataBufferReadback");
                
            NAME_D3D12_OBJECT(m_dataBuffer);

            dataSubResource = {};
            dataSubResource.pData = &data[0];
            dataSubResource.RowPitch = dataSize;
            dataSubResource.SlicePitch = dataSubResource.RowPitch;

            UpdateSubresources<1>(m_commandList.Get(), m_dataBuffer.Get(), m_dataBufferUpload.Get(), 0, 0, 1, &dataSubResource);
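            // Name the barrier so that its address is not taken from a temporary (non-standard C++).
            const auto copyDestToCommon = CD3DX12_RESOURCE_BARRIER::Transition(m_dataBuffer.Get(), D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_COMMON);
            m_commandList->ResourceBarrier(1, &copyDestToCommon);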

            m_commandList->CopyResource(m_dataBufferReadback.Get(), m_dataBufferUpload.Get());

            D3D12_UNORDERED_ACCESS_VIEW_DESC uavDesc = {};
            uavDesc.Format = DXGI_FORMAT_UNKNOWN;
            uavDesc.ViewDimension = D3D12_UAV_DIMENSION_BUFFER;
            uavDesc.Buffer.FirstElement = 0;
            // Cover every element of the buffer rather than only the first one.
            uavDesc.Buffer.NumElements = static_cast<UINT>(data.size());
            uavDesc.Buffer.StructureByteStride = sizeof(data[0]);
            uavDesc.Buffer.CounterOffsetInBytes = 0;
            uavDesc.Buffer.Flags = D3D12_BUFFER_UAV_FLAG_NONE;

            CD3DX12_CPU_DESCRIPTOR_HANDLE uavHandle0(m_UavHeap->GetCPUDescriptorHandleForHeapStart(), Uav, m_UavDescriptorSize);
            m_device->CreateUnorderedAccessView(m_dataBuffer.Get(), nullptr, &uavDesc, uavHandle0);
        } // Initialize the data in the buffers.

        std::cout << "   Closing the command list and executing it to begin the initial GPU setup" << std::endl;
        // Close the command list and execute it to begin the initial GPU setup.
        DX::ThrowIfFailed(m_commandList->Close());
        ID3D12CommandList* ppCommandLists[] = { m_commandList.Get() };
        m_commandQueue->ExecuteCommandLists(_countof(ppCommandLists), ppCommandLists);

        std::cout << "   Creating synchronization objects and wait until assets have been uploaded to the GPU" << std::endl;
        // Create synchronization objects and wait until assets have been uploaded to the GPU.
        {
            DX::ThrowIfFailed(m_device->CreateFence(m_renderContextFenceValue, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&m_renderContextFence)));
            m_renderContextFenceValue++;

            m_renderContextFenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
            if (m_renderContextFenceEvent == nullptr) {
                DX::ThrowIfFailed(HRESULT_FROM_WIN32(GetLastError()));
            }

            // Add a signal command to the queue.
            DX::ThrowIfFailed(m_commandQueue->Signal(m_renderContextFence.Get(), m_renderContextFenceValue));

            // Instruct the fence to set the event object when the signal command completes.
            DX::ThrowIfFailed(m_renderContextFence->SetEventOnCompletion(m_renderContextFenceValue, m_renderContextFenceEvent));
            m_renderContextFenceValue++;

            // Wait until the signal command has been processed.
            WaitForSingleObject(m_renderContextFenceEvent, INFINITE);
        }
    } // Load the sample assets.

    std::cout << "Creating compute resources" << std::endl;
    {
        // Create compute resources.
        D3D12_COMMAND_QUEUE_DESC queueDesc = { D3D12_COMMAND_LIST_TYPE_COMPUTE, 0, D3D12_COMMAND_QUEUE_FLAG_NONE };
        DX::ThrowIfFailed(m_device->CreateCommandQueue(&queueDesc, IID_PPV_ARGS(&m_computeCommandQueue)));
        DX::ThrowIfFailed(m_device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_COMPUTE, IID_PPV_ARGS(&m_computeAllocator)));
        DX::ThrowIfFailed(m_device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_COMPUTE, m_computeAllocator.Get(), nullptr, IID_PPV_ARGS(&m_computeCommandList)));
        DX::ThrowIfFailed(m_device->CreateFence(0, D3D12_FENCE_FLAG_SHARED, IID_PPV_ARGS(&m_Fence)));

        m_FenceEvent = CreateEvent(nullptr, FALSE, FALSE, nullptr);
        if (m_FenceEvent == nullptr) {
            DX::ThrowIfFailed(HRESULT_FROM_WIN32(GetLastError()));
        }
    }

    std::cout << "Calculating" << std::endl;
    Calculate();
    std::cout << "Finished" << std::endl;
}


void myClass::Calculate() {
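    // Name the barrier so that its address is not taken from a temporary (non-standard C++).
    const auto commonToUav = CD3DX12_RESOURCE_BARRIER::Transition(m_dataBuffer.Get(), D3D12_RESOURCE_STATE_COMMON, D3D12_RESOURCE_STATE_UNORDERED_ACCESS);
    m_computeCommandList->ResourceBarrier(1, &commonToUav);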

    m_computeCommandList.Get()->SetPipelineState(m_computeState.Get());
    m_computeCommandList.Get()->SetComputeRootSignature(m_computeRootSignature.Get());

    ID3D12DescriptorHeap* ppHeaps[] = { m_UavHeap.Get() };
    m_computeCommandList.Get()->SetDescriptorHeaps(_countof(ppHeaps), ppHeaps);

    CD3DX12_GPU_DESCRIPTOR_HANDLE uavHandle(m_UavHeap->GetGPUDescriptorHandleForHeapStart(), Uav, m_UavDescriptorSize);
    
    m_computeCommandList.Get()->SetComputeRootUnorderedAccessView(ComputeRootUAVTable, m_dataBuffer->GetGPUVirtualAddress());

    m_computeCommandList.Get()->Dispatch(1, 1, 1);
    
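    // Name the barrier so that its address is not taken from a temporary (non-standard C++).
    const auto uavToCommon = CD3DX12_RESOURCE_BARRIER::Transition(m_dataBuffer.Get(), D3D12_RESOURCE_STATE_UNORDERED_ACCESS, D3D12_RESOURCE_STATE_COMMON);
    m_computeCommandList->ResourceBarrier(1, &uavToCommon);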


    // Close and execute the command list.
    DX::ThrowIfFailed(m_computeCommandList.Get()->Close());
    ID3D12CommandList* commandLists[] = { m_computeCommandList.Get() };
    m_computeCommandQueue->ExecuteCommandLists(1, commandLists);

    // Wait for the compute shader to complete the calculation.
    UINT64 FenceValue = InterlockedIncrement(&m_FenceValue);
    DX::ThrowIfFailed(m_computeCommandQueue.Get()->Signal(m_Fence.Get(), FenceValue));
    DX::ThrowIfFailed(m_Fence.Get()->SetEventOnCompletion(FenceValue, m_FenceEvent));
    WaitForSingleObject(m_FenceEvent, INFINITE);

    std::cout << "FenceValue = " << FenceValue << " " << m_FenceValue << " " << m_Fence.Get()->GetCompletedValue() << std::endl;

    // Check the output!
    float* dataptr = nullptr;
    D3D12_RANGE range = { 0, readbackBufferDesc.Width };
    DX::ThrowIfFailed(m_dataBufferReadback->Map(0, &range, (void**)&dataptr));
    // Use an unsigned index to match the unsigned element count.
    for (size_t i = 0; i < readbackBufferDesc.Width / sizeof(data[0]); i++)
        printf("uav[%zu] = %.2f\n", i, dataptr[i]);
    m_dataBufferReadback->Unmap(0, nullptr);
    for (unsigned int i = 0; i < data.size(); i++) {
        std::cout << "data[" << i << "] = " << data[i] << std::endl;
    }
    
}

myClass.h:

#pragma once
#include "Common\Device.h"
#include <iostream>
#include <vector> // std::vector

// We have to write all of this as its own class, otherwise we cannot
// use the "this" pointer when we create compute resources. We need to
// do that because this code targets multithreading.

class myClass {
public:
    myClass();

private:
    // Two buffers full of data are used. The compute thread alternates 
    // writing to each of them. The render thread renders using the 
    // buffer that is not currently in use by the compute shader.
    //struct Data {
    //  float c;
    //};
    //std::vector<Data> data;
    std::vector<float> data;

    // For the compute pipeline, the CBV is a struct containing some 
    // constants used in the compute shader.
    struct ConstantBufferCS {
        float a;
        float b;
    };

    D3D12_SUBRESOURCE_DATA dataSubResource;

    static const UINT FrameCount = 1;
    //static const UINT ThreadCount = 1;

    UINT m_heightInstances;
    UINT m_widthInstances;

    UINT m_frameIndex;
    Microsoft::WRL::ComPtr<ID3D12RootSignature> m_rootSignature;
    Microsoft::WRL::ComPtr<ID3D12RootSignature> m_computeRootSignature;
    Microsoft::WRL::ComPtr<ID3D12CommandQueue> m_commandQueue;
    Microsoft::WRL::ComPtr<ID3D12DescriptorHeap> m_UavHeap;
    Microsoft::WRL::ComPtr<ID3D12CommandAllocator> m_commandAllocators[FrameCount];
    Microsoft::WRL::ComPtr<ID3D12PipelineState> m_computeState;
    Microsoft::WRL::ComPtr<ID3D12GraphicsCommandList> m_commandList;
    Microsoft::WRL::ComPtr<ID3D12Resource> m_constantBufferCS;
    UINT64 m_renderContextFenceValue;
    HANDLE m_renderContextFenceEvent;
    UINT64 m_frameFenceValues[FrameCount];
    UINT m_UavDescriptorSize;

    ConstantBufferCS constantBufferCS;
    Microsoft::WRL::ComPtr<ID3D12Resource> constantBufferCSUpload;

    Microsoft::WRL::ComPtr<ID3D12Fence> m_renderContextFence;

    Microsoft::WRL::ComPtr<ID3D12Resource> m_dataBuffer;
    Microsoft::WRL::ComPtr<ID3D12Resource> m_dataBufferUpload;
    Microsoft::WRL::ComPtr<ID3D12Resource> m_dataBufferReadback;

    // Compute objects.
    Microsoft::WRL::ComPtr<ID3D12CommandAllocator> m_computeAllocator;
    Microsoft::WRL::ComPtr<ID3D12CommandQueue> m_computeCommandQueue;
    Microsoft::WRL::ComPtr<ID3D12GraphicsCommandList> m_computeCommandList;
    Microsoft::WRL::ComPtr<ID3D12Fence> m_Fence;
    volatile HANDLE m_FenceEvent;

    D3D12_RESOURCE_DESC readbackBufferDesc;

    // State
    UINT64 volatile m_FenceValue;
    /*
    struct ThreadData {
        myClass* pContext;
        UINT threadIndex;
    };
    ThreadData m_threadData;
    HANDLE m_threadHandles;
    */
    void Calculate();

    // Indices of shader resources in the descriptor heap.
    enum DescriptorHeapIndex : UINT32 {
        Uav = 0,
        DescriptorCount = 1
    };

    enum ComputeRootParameters : UINT32 {
        //ComputeRootCBV = 0,
        ComputeRootUAVTable = 0,
        ComputeRootParametersCount
    };

};

Device.cpp:

#include "pch.h"
#include "Device.h"
#include "DirectXHelper.h"
#include <cassert> // for "assert"
#include <iostream>

static Device* gs_pSingelton = nullptr;

// Constructor
Device::Device(HINSTANCE hInst, bool useWarp)
    : m_hInstance(hInst)
    , m_useWarp(useWarp)
{
}

void Device::Initialize() {
#if defined(_DEBUG)
    // Always enable the debug layer before doing anything DX12 related
    // so all possible errors generated while creating DX12 objects
    // are caught by the debug layer.
    Microsoft::WRL::ComPtr<ID3D12Debug1> debugInterface;
    DX::ThrowIfFailed(D3D12GetDebugInterface(IID_PPV_ARGS(&debugInterface)));
    debugInterface->EnableDebugLayer();
    // Enable these if you want full validation (will slow down rendering a lot).
    //debugInterface->SetEnableGPUBasedValidation(TRUE);
    //debugInterface->SetEnableSynchronizedCommandQueueValidation(TRUE);
#endif
    auto dxgiAdapter = GetAdapter(false);
    if (!dxgiAdapter) { // If no supporting DX12 adapters exist, fall back to WARP
        dxgiAdapter = GetAdapter(true);
    }
    if (dxgiAdapter) {
        m_device = CreateDevice(dxgiAdapter);
    }
    else {
        throw std::exception("DXGI adapter enumeration failed.");
    }
}

void Device::Create(HINSTANCE hInst) {
    if (!gs_pSingelton) {
        gs_pSingelton = new Device(hInst);
        gs_pSingelton->Initialize();
    }
}

Device& Device::Get() {
    assert(gs_pSingelton);
    return *gs_pSingelton;
}

void Device::Destroy() {
    if (gs_pSingelton) {
        delete gs_pSingelton;
        gs_pSingelton = nullptr;
    }
}

// Destructor
Device::~Device() {
}


Microsoft::WRL::ComPtr<ID3D12Device2> Device::CreateDevice(Microsoft::WRL::ComPtr<IDXGIAdapter4> adapter) {
    Microsoft::WRL::ComPtr<ID3D12Device2> d3d12Device2;
    DX::ThrowIfFailed(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&d3d12Device2)));

    // Enable debug messages in debug mode.
#if defined(_DEBUG)
    Microsoft::WRL::ComPtr<ID3D12InfoQueue> pInfoQueue;
    if (SUCCEEDED(d3d12Device2.As(&pInfoQueue))) {
        pInfoQueue->SetBreakOnSeverity(D3D12_MESSAGE_SEVERITY_CORRUPTION, TRUE);
        pInfoQueue->SetBreakOnSeverity(D3D12_MESSAGE_SEVERITY_ERROR, TRUE);
        pInfoQueue->SetBreakOnSeverity(D3D12_MESSAGE_SEVERITY_WARNING, TRUE);
        // Suppress whole categories of messages
        //D3D12_MESSAGE_CATEGORY Categories[] = {};

        // Suppress messages based on their severity level
        D3D12_MESSAGE_SEVERITY Severities[] = { D3D12_MESSAGE_SEVERITY_INFO };

        // Suppress individual messages by their ID
        D3D12_MESSAGE_ID DenyIds[] = {
            D3D12_MESSAGE_ID_CLEARRENDERTARGETVIEW_MISMATCHINGCLEARVALUE,   // I'm really not sure how to avoid this message.
            D3D12_MESSAGE_ID_MAP_INVALID_NULLRANGE,                         // This warning occurs when using capture frame while graphics debugging.
            D3D12_MESSAGE_ID_UNMAP_INVALID_NULLRANGE,                       // This warning occurs when using capture frame while graphics debugging.
        };

        D3D12_INFO_QUEUE_FILTER NewFilter = {};
        //NewFilter.DenyList.NumCategories = _countof(Categories);
        //NewFilter.DenyList.pCategoryList = Categories;
        NewFilter.DenyList.NumSeverities = _countof(Severities);
        NewFilter.DenyList.pSeverityList = Severities;
        NewFilter.DenyList.NumIDs = _countof(DenyIds);
        NewFilter.DenyList.pIDList = DenyIds;

        DX::ThrowIfFailed(pInfoQueue->PushStorageFilter(&NewFilter));
    }
#endif
    return d3d12Device2;
}

Microsoft::WRL::ComPtr<IDXGIAdapter4> Device::GetAdapter(bool useWarp) {
    UINT createFactoryFlags = 0;
#if defined(_DEBUG)
    createFactoryFlags = DXGI_CREATE_FACTORY_DEBUG;
#endif

    DX::ThrowIfFailed(CreateDXGIFactory2(createFactoryFlags, IID_PPV_ARGS(&m_factory)));

    Microsoft::WRL::ComPtr<IDXGIAdapter1> dxgiAdapter1;
    Microsoft::WRL::ComPtr<IDXGIAdapter4> dxgiAdapter4;

    if (useWarp) {
        DX::ThrowIfFailed(m_factory->EnumWarpAdapter(IID_PPV_ARGS(&dxgiAdapter1)));
        DX::ThrowIfFailed(dxgiAdapter1.As(&dxgiAdapter4));
    }
    else {
        SIZE_T maxDedicatedVideoMemory = 0;
        for (UINT i = 0; m_factory->EnumAdapters1(i, &dxgiAdapter1) != DXGI_ERROR_NOT_FOUND; ++i) {
            DXGI_ADAPTER_DESC1 dxgiAdapterDesc1;
            dxgiAdapter1->GetDesc1(&dxgiAdapterDesc1);

            // Check to see if the adapter can create a D3D12 device without actually 
            // creating it. The adapter with the largest dedicated video memory
            // is favored.
            if ((dxgiAdapterDesc1.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) == 0 &&
                SUCCEEDED(D3D12CreateDevice(dxgiAdapter1.Get(),
                    D3D_FEATURE_LEVEL_11_0, __uuidof(ID3D12Device), nullptr)) &&
                dxgiAdapterDesc1.DedicatedVideoMemory > maxDedicatedVideoMemory) {
                maxDedicatedVideoMemory = dxgiAdapterDesc1.DedicatedVideoMemory;
                DX::ThrowIfFailed(dxgiAdapter1.As(&dxgiAdapter4));
            }
        }
    }

    return dxgiAdapter4;
}

Device.h:

#pragma once

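#include <d3d12.h>      // ID3D12Device2
#include <dxgi1_6.h>    // IDXGIAdapter4, IDXGIFactory4
#include <wrl/client.h> // Microsoft::WRL::ComPtr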

// We require this file because we are unable to pass the device pointer to everywhere we need to.

class Device {
public:
    /**
    * Create the device singleton with the device instance handle.
    */
    static void Create(HINSTANCE hInst);

    /**
    * Destroy the device instance.
    */
    static void Destroy();

    /**
    * Get the device singleton.
    */
    static Device& Get();

    /**
     * Get the Direct3D 12 device
     */
    Microsoft::WRL::ComPtr<ID3D12Device2> GetDevice() const { return m_device; }
    Microsoft::WRL::ComPtr<IDXGIFactory4> GetFactory() const { return m_factory; }

protected:
    // Create a device instance
    Device(HINSTANCE hInst, bool useWarp = false);
    // Destroy the device instance.
    virtual ~Device();

    // Initialize the device instance.
    void Initialize();
    Microsoft::WRL::ComPtr<IDXGIAdapter4> GetAdapter(bool useWarp);
    Microsoft::WRL::ComPtr<ID3D12Device2> CreateDevice(Microsoft::WRL::ComPtr<IDXGIAdapter4> adapter);
private:
    Device(const Device& copy) = delete;
    Device& operator=(const Device& other) = delete;
    HINSTANCE m_hInstance;
    Microsoft::WRL::ComPtr<ID3D12Device2> m_device;
    Microsoft::WRL::ComPtr<IDXGIFactory4> m_factory;
    bool m_useWarp;
};

ComputeShader.hlsl:

RWStructuredBuffer<float> output : register(u0);    // UAV

[numthreads(1, 1, 1)]
void main( uint3 DTid : SV_DispatchThreadID ) {
    output[DTid.x] = 1 + 1;
}
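
For reference, here is a CPU-side sketch (plain C++, not D3D12) of what this shader would be expected to write for a given dispatch, assuming the buffer starts zeroed. The names `shaderMain` and `emulateDispatch` are made up for illustration and do not exist in any DirectX API. With `Dispatch(1, 1, 1)` and `[numthreads(1, 1, 1)]`, only one thread with `DTid.x == 0` runs, so only element 0 of the two-element buffer should change.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU stand-in for the shader body: output[DTid.x] = 1 + 1;
inline void shaderMain(std::vector<float>& output, std::size_t dispatchThreadIdX) {
    // Out-of-range UAV writes are not modeled here, so reject them outright.
    assert(dispatchThreadIdX < output.size());
    output[dispatchThreadIdX] = 1 + 1;
}

// Emulates Dispatch(groupsX, 1, 1) with [numthreads(threadsX, 1, 1)]
// over a buffer of elementCount floats that starts out zeroed.
inline std::vector<float> emulateDispatch(std::size_t elementCount,
                                          std::size_t groupsX,
                                          std::size_t threadsX) {
    std::vector<float> output(elementCount, 0.0f);
    for (std::size_t group = 0; group < groupsX; ++group) {
        for (std::size_t thread = 0; thread < threadsX; ++thread) {
            // SV_DispatchThreadID.x = group index * threads per group + thread index
            shaderMain(output, group * threadsX + thread);
        }
    }
    return output;
}
```

Calling `emulateDispatch(2, 1, 1)` gives a buffer holding 2 in element 0 and 0 in element 1, which is what a correct readback of the setup above would be expected to show.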

Please let me know if you are able to find what I do not understand. I can also try uploading my project to GitHub if it helps... SOS :(
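
One thing that stands out when tracing the data flow: the only copy into `m_dataBufferReadback` is recorded during setup, from the upload buffer, before the dispatch runs. `Calculate()` never copies `m_dataBuffer` into the readback buffer, so mapping it would presumably show the initial zeros. Below is a plain C++ model of that flow (no D3D12 calls; `Buffers`, `setUp`, `dispatch`, `runAsPosted`, and `runWithCopy` are hypothetical names). It is a sketch of a likely cause, not a verified fix.

```cpp
#include <cassert>
#include <vector>

// Plain C++ stand-ins for the three D3D12 buffers in the question.
struct Buffers {
    std::vector<float> upload;    // models m_dataBufferUpload
    std::vector<float> gpu;       // models m_dataBuffer (default heap, UAV)
    std::vector<float> readback;  // models m_dataBufferReadback
};

// Models the setup command list: upload -> gpu, then upload -> readback.
inline Buffers setUp(const std::vector<float>& data) {
    Buffers b;
    b.upload = data;
    b.gpu = b.upload;       // models UpdateSubresources
    b.readback = b.upload;  // models CopyResource(readback, upload)
    return b;
}

// Models Dispatch(1, 1, 1): one thread writes element 0 of the GPU buffer.
inline void dispatch(Buffers& b) {
    assert(!b.gpu.empty());
    b.gpu[0] = 1 + 1;
}

// The flow as posted: dispatch, then read the readback buffer directly.
inline std::vector<float> runAsPosted(const std::vector<float>& data) {
    Buffers b = setUp(data);
    dispatch(b);
    return b.readback;  // still holds the values copied during setup
}

// Same flow with a hypothetical extra gpu -> readback copy after the dispatch.
inline std::vector<float> runWithCopy(const std::vector<float>& data) {
    Buffers b = setUp(data);
    dispatch(b);
    b.readback = b.gpu;
    return b.readback;
}
```

In D3D12 terms, the extra step would presumably be a transition of `m_dataBuffer` to `D3D12_RESOURCE_STATE_COPY_SOURCE` followed by `CopyResource(m_dataBufferReadback.Get(), m_dataBuffer.Get())` on the compute command list before `Close()`. This has not been tested against the project above.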

boof
  • Copying code that you don't understand, making some "small" fixes, and expecting others to fix "your" code is not really why Stack Overflow exists. By the way, I'm pretty sure there is API documentation for DirectX12. I suggest you read it. Otherwise, contact Microsoft for support via a developer account. – paladin May 07 '21 at 06:04
  • You would probably have better luck starting from [this sample](https://github.com/microsoft/Xbox-ATG-Samples/tree/master/PCSamples/IntroGraphics/SimpleComputePC12). – Chuck Walbourn May 07 '21 at 07:54
  • @paladin I mean, I'm not a complete idiot. I checked the code on every edit and carefully read through tutorials (and yes, the API included) for weeks before even starting. If you had looked at my code at all, you would notice that I might as well have written it myself. I'm doing my best here, but the severe lack of resources on this topic makes things very difficult. – boof May 07 '21 at 10:23
  • Thanks, @ChuckWalbourn, I took a look at that sample. The reason why I started from the NBody code is because it seemed simpler to me than the one you posted. I really don't need to build a game world, draw Mandelbrot sets, have user inputs on keyboard and mouse, or anything else. I just want to do 1+1. I would try to do "Hello world" instead but I know that GPUs can't speak. – boof May 07 '21 at 10:30
  • There's a DirectX 11 sample called [BasicCompute](https://github.com/walbourn/directx-sdk-samples/tree/master/BasicCompute11) that's just a console app. – Chuck Walbourn May 07 '21 at 20:57
  • @ChuckWalbourn This looks quite helpful! Should I be concerned that it is written for DirectX11 rather than DirectX12, or will the concepts involved map over to DirectX12? – boof May 08 '21 at 00:57
  • DX11 & DX12 are basically the same hardware. Unless you specifically need Shader Model 6+ features, there's no difference if you don't care about any interactions with graphics operations. – Chuck Walbourn May 08 '21 at 02:03
