Understanding DirectX pipeline optimisations

Question

I'm trying to improve some simple DirectX rendering code I've implemented. My idea is to only update the rendering pipeline when absolutely necessary because my understanding is it is beneficial to minimise the number of pipeline modifications wherever possible. What I mean by this is demonstrated in the following pseudocode:

// context is an ID3D11DeviceContext* obtained elsewhere
ID3D11VertexShader *t_shader = getVertexShader();
context->VSSetShader(t_shader, nullptr, 0);
// Do some other processing/pipeline setup without modifying t_shader
context->VSSetShader(t_shader, nullptr, 0); // redundant: the same shader is already bound
context->Draw(10, 0);

This is inefficient because we're calling VSSetShader twice when the shader hasn't changed. This is an oversimplification, but hopefully you get where I'm coming from: my basic understanding is that these types of unnecessary binds/calls are inefficient?

If this is the case then is it possible to make the below optimisation between two separate ID3D11DeviceContext::Draw calls? (again pseudocode so please forgive the missing steps and assume all we need to do is set a vertex & pixel shader along with a topology before we draw):

// context is an ID3D11DeviceContext* obtained elsewhere
void Object1::Draw() {
    ID3D11VertexShader *t_vs = ShaderMgr::vertexShader1();
    context->VSSetShader(t_vs, nullptr, 0);

    ID3D11PixelShader *t_ps = ShaderMgr::pixelShader1();
    context->PSSetShader(t_ps, nullptr, 0);

    context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_LINELIST);
    context->Draw(m_vertexCount, 0);
}

void Object2::Draw() {
    ID3D11VertexShader *t_vs = ShaderMgr::vertexShader1();
    context->VSSetShader(t_vs, nullptr, 0);

    // Use a different pixel shader to Object1
    ID3D11PixelShader *t_ps = ShaderMgr::pixelShader2();
    context->PSSetShader(t_ps, nullptr, 0);

    context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_LINELIST);
    context->Draw(m_vertexCount, 0);
}

The only difference between the two draw calls is the use of a different pixel shader. So is the following a possible optimisation or does each draw call effectively reset the pipeline?

// context is an ID3D11DeviceContext* obtained elsewhere
void Object1::Draw() {
    // Removed common set code
    ID3D11PixelShader *t_ps = ShaderMgr::pixelShader1();
    context->PSSetShader(t_ps, nullptr, 0);
    context->Draw(m_vertexCount, 0);
}

void Object2::Draw() {
    // Removed common set code
    ID3D11PixelShader *t_ps = ShaderMgr::pixelShader2();
    context->PSSetShader(t_ps, nullptr, 0);
    context->Draw(m_vertexCount, 0);
}

void drawObjects() {
    // Common states amongst object1 and object2
    ID3D11VertexShader *t_vs = ShaderMgr::vertexShader1();
    context->VSSetShader(t_vs, nullptr, 0);
    context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_LINELIST);

    m_object1->Draw();

    // Don't bother setting the vs or topology here

    m_object2->Draw();
}

Any feedback/info would be much appreciated.

You can just encapsulate the device within a class that does state caching to catch redundancy. Usually, unless something really bad is being done, the perf gains from getting rid of these calls are minimal anyway, as drivers usually cache and test for unneeded state changes already. — galop1n, Mar 08 '17 at 19:14
This is what I'm trying to implement at the moment, designing a Pipeline class which only changes state when necessary, the issue I'm seeing at the minute though is after an initial draw call, the pipeline's "state" seems to have been reset and I have to set everything up again, e.g. InputTopology, vertex buffers etc. — TheRarebit, Mar 09 '17 at 09:41
You could take a look at how DirectX 12 encapsulates pipeline state (Pipeline State Objects). I'm not suggesting you actually use DirectX 12 unless you are already an expert user of DirectX 11, but the design reflects the preferences of modern GPU hardware. — Chuck Walbourn, Mar 09 '17 at 18:04
The trick is to balance between minimizing state changes versus spending countless hours debugging invalid states. I did write this answer a little while ago, might give some insights. http://stackoverflow.com/questions/20300778/are-there-directx-guidelines-for-binding-and-unbinding-resources-between-draw-ca/24106985#24106985 . — mrvux, Mar 11 '17 at 23:22
Cheers for the feedback guys, seems like it's a bit of a balancing act and that I may have a bug I need to look at, as it seems the pipeline is getting into an invalid state. Just to clarify though, with the original example: when the VS is set with the call to VSSetShader before object1->draw(), that VS is still bound to the pipeline after the draw call has finished, until I call VSSetShader again, right? ID3D11DeviceContext->Draw won't reset that? — TheRarebit, Mar 13 '17 at 10:09
@TheRarebit this is correct. Draw does not reset Shader pipeline — mrvux, Mar 13 '17 at 13:41

score 0 · Accepted Answer · answered Mar 13 '17 at 12:04

Just posting an answer to my own question, as I've spotted a bug in my test code which was clouding the issue. Hopefully this will help anyone else who runs into the same thing.

My confusion was to do with the fact that in my test code I only had a single object I was rendering, a simple plane (referred to below as the grid). The only resources it used were a vertex buffer, a vertex shader and a pixel shader. I tried adding the optimisations mentioned above to reduce the number of ID3D11DeviceContext calls as much as possible. For this simple object it seemed sensible to me that the ID3D11DeviceContext calls, e.g. VSSetShader, PSSetShader etc., should only need to be made once, because this was the only object being rendered. However, this was not the case: once the grid had been rendered once, it disappeared and never got rendered again.

With the help of RenderDoc I was able to capture a rendered frame and noticed that there were two draw calls being made when I was only expecting one. I had forgotten that I had SpriteFont and SpriteBatch objects, created through DirectXTK, being used to write out my camera position for debugging. Their draw call was modifying the pipeline state while bypassing my pipeline class (which was controlling these optimisations) without me realising. This meant that when the grid was rendered for a second time, the pipeline was in the incorrect state.

So it turns out these optimisations are possible and that the pipeline isn't cleared as the result of a draw call. If you have something like the above example, then it's enough to make the context calls once rather than before every draw, provided nothing else changes that state in between. I've also learnt that a debugging tool like RenderDoc or Visual Studio's built-in graphics debugger is vital for tracking these types of issues down.
