For the past couple of hours I've been trying to track down a bug in my program that only occurs when running it in release mode. I've already resolved all level-4 compiler warnings, and there are no uninitialized variables anywhere (which would usually be my first suspect in a case like this).

This is a tough one to explain, since I don't know exactly what is going on myself, so please bear with me.

After a lot of debugging, I've narrowed the cause of the bug down to somewhere in the following function:

void CModelSubMesh::Update()
{
    ModelSubMesh::Update();

    auto bHasAlphas = (GetAlphaCount() > 0) ? true : false;
    auto bAnimated = (!m_vertexWeights.empty() || !m_weightBoneIDs.empty()) ? true : false;
    if(bHasAlphas == false && bAnimated == false)
        m_glMeshData = std::make_unique<GLMeshData>(m_vertices,m_normals,m_uvs,m_triangles);
    else
    {
        m_glmesh = GLMesh();
        auto bufVertex = OpenGL::GenerateBuffer();
        auto bufUV = OpenGL::GenerateBuffer();
        auto bufNormal = OpenGL::GenerateBuffer();
        auto bufIndices = OpenGL::GenerateBuffer();
        auto bufAlphas = 0;
        if(bHasAlphas == true)
            bufAlphas = OpenGL::GenerateBuffer();

        auto vao = OpenGL::GenerateVertexArray();

        m_glmesh.SetVertexArrayObject(vao);
        m_glmesh.SetVertexBuffer(bufVertex);
        m_glmesh.SetUVBuffer(bufUV);
        m_glmesh.SetNormalBuffer(bufNormal);
        if(bHasAlphas == true)
            m_glmesh.SetAlphaBuffer(bufAlphas);
        m_glmesh.SetIndexBuffer(bufIndices);
        m_glmesh.SetVertexCount(CUInt32(m_vertices.size()));
        auto numTriangles = CUInt32(m_triangles.size()); // CUInt32 is equivalent to static_cast<unsigned int>
        m_glmesh.SetTriangleCount(numTriangles);
        // PLACEHOLDER LINE

        OpenGL::BindVertexArray(vao);

        OpenGL::BindBuffer(bufVertex,GL_ARRAY_BUFFER);
        OpenGL::BindBufferData(CInt32(m_vertices.size()) *sizeof(glm::vec3),&m_vertices[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);

        OpenGL::EnableVertexAttribArray(SHADER_VERTEX_BUFFER_LOCATION);
        OpenGL::SetVertexAttribData(
            SHADER_VERTEX_BUFFER_LOCATION,
            3,
            GL_FLOAT,
            GL_FALSE,
            (void*)0
        );

        OpenGL::BindBuffer(bufUV,GL_ARRAY_BUFFER);
        OpenGL::BindBufferData(CInt32(m_uvs.size()) *sizeof(glm::vec2),&m_uvs[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);

        OpenGL::EnableVertexAttribArray(SHADER_UV_BUFFER_LOCATION);
        OpenGL::SetVertexAttribData(
            SHADER_UV_BUFFER_LOCATION,
            2,
            GL_FLOAT,
            GL_FALSE,
            (void*)0
        );

        OpenGL::BindBuffer(bufNormal,GL_ARRAY_BUFFER);
        OpenGL::BindBufferData(CInt32(m_normals.size()) *sizeof(glm::vec3),&m_normals[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);

        OpenGL::EnableVertexAttribArray(SHADER_NORMAL_BUFFER_LOCATION);
        OpenGL::SetVertexAttribData(
            SHADER_NORMAL_BUFFER_LOCATION,
            3,
            GL_FLOAT,
            GL_FALSE,
            (void*)0
        );

        if(!m_vertexWeights.empty())
        {
            m_bufVertWeights.bufWeights = OpenGL::GenerateBuffer();
            OpenGL::BindBuffer(m_bufVertWeights.bufWeights,GL_ARRAY_BUFFER);
            OpenGL::BindBufferData(CInt32(m_vertexWeights.size()) *sizeof(float),&m_vertexWeights[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);

            OpenGL::EnableVertexAttribArray(SHADER_BONE_WEIGHT_LOCATION);
            OpenGL::BindBuffer(m_bufVertWeights.bufWeights,GL_ARRAY_BUFFER);
            OpenGL::SetVertexAttribData(
                SHADER_BONE_WEIGHT_LOCATION,
                4,
                GL_FLOAT,
                GL_FALSE,
                (void*)0
            );
        }
        if(!m_weightBoneIDs.empty())
        {
            m_bufVertWeights.bufBoneIDs = OpenGL::GenerateBuffer();
            OpenGL::BindBuffer(m_bufVertWeights.bufBoneIDs,GL_ARRAY_BUFFER);
            OpenGL::BindBufferData(CInt32(m_weightBoneIDs.size()) *sizeof(int),&m_weightBoneIDs[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);

            OpenGL::EnableVertexAttribArray(SHADER_BONE_WEIGHT_ID_LOCATION);
            OpenGL::BindBuffer(m_bufVertWeights.bufBoneIDs,GL_ARRAY_BUFFER);
            glVertexAttribIPointer(
                SHADER_BONE_WEIGHT_ID_LOCATION,
                4,
                GL_INT,
                0,
                (void*)0
            );
        }

        if(bHasAlphas == true)
        {
            OpenGL::BindBuffer(bufAlphas,GL_ARRAY_BUFFER);
            OpenGL::BindBufferData(CInt32(m_alphas.size()) *sizeof(glm::vec2),&m_alphas[0],GL_STATIC_DRAW,GL_ARRAY_BUFFER);

            OpenGL::EnableVertexAttribArray(SHADER_USER_BUFFER1_LOCATION);
            OpenGL::SetVertexAttribData(
                SHADER_USER_BUFFER1_LOCATION,
                2,
                GL_FLOAT,
                GL_FALSE,
                (void*)0
            );
        }
        OpenGL::BindBuffer(bufIndices,GL_ELEMENT_ARRAY_BUFFER);
        OpenGL::BindBufferData(numTriangles *sizeof(unsigned int),&m_triangles[0],GL_STATIC_DRAW,GL_ELEMENT_ARRAY_BUFFER);

        OpenGL::BindVertexArray(0);
        OpenGL::BindBuffer(0,GL_ARRAY_BUFFER);
        OpenGL::BindBuffer(0,GL_ELEMENT_ARRAY_BUFFER);

    }
    ComputeTangentBasis(m_vertices,m_uvs,m_normals,m_triangles);
}

My program is a graphics application, and this piece of code generates the object buffers which are required for rendering later on. The bug basically causes the vertices of a specific mesh to be rendered incorrectly when certain conditions are met. The bug is consistent and happens every time for the same mesh.

Sadly I can't narrow the code down any further, since that would make the bug disappear, and explaining what each line does would take quite a while and isn't too relevant here. I'm almost positive that this is a problem with compiler optimization, so the actual bug is more of a side-effect in this case anyway.

With the code above, the bug will occur, but only when in release mode. The interesting part is the line I marked as "PLACEHOLDER LINE".

If I change the code to one of the following 3 variants, the bug will disappear:

#1:

void CModelSubMesh::Update()
{
    [...]
    // PLACEHOLDER LINE
    std::cout<<numTriangles<<std::endl;
    [...]
}

#2:

#pragma optimize( "", off )
void CModelSubMesh::Update()
{
    [...] // No changes to the code
}
#pragma optimize( "", on ) 

#3:

static void test()
{
    auto *f = new float; // Do something to make sure the compiler doesn't optimize this function away; Doesn't matter what
    delete f;
}
void CModelSubMesh::Update()
{
    [...]
    // PLACEHOLDER LINE
    test();
    [...]
}

Especially variant #2 indicates that something is being optimized which shouldn't be.

I don't expect anyone to magically know what the root of the problem is, since that would require deeper knowledge of the code. However, maybe someone with a better understanding of the compiler optimization process can give me some hints as to what could be going on here?

Since almost any change to the code gets rid of the bug, I'm just not sure what I can do to actually find the cause of it.

Silverlan
  • I'm 99.9(9)% certain that the compiler is NOT at fault here. Errors that only occur in release are usually memory-related or multithreading-related. And the bug doesn't disappear when you make the code alterations that you've described, it just becomes hidden. I.e. you don't notice it, but change something else and it will become exposed again. – Violet Giraffe Jan 17 '16 at 10:59
  • If it's memory corruption, it can be anywhere in your application and not in the code fragment that you've posted here. – Violet Giraffe Jan 17 '16 at 11:07
  • If it's memory corruption, it doesn't explain why it works with the 3 variants I've supplied. Without one of the variants, the bug occurs 10 out of 10 times. With one of the variants in place, it occurs 0 out of 10 times. – Silverlan Jan 17 '16 at 11:10
  • You're wrong, it explains everything. I suppose you don't have much experience with memory corruption issues... Yet. When you alter the code or move it around, you change how the code and data are laid out in memory, so the corrupted piece of memory is also different, leading to different symptoms (or lack thereof). – Violet Giraffe Jan 17 '16 at 11:20
  • I would suggest that you try to simplify your code until it both shows the problem and is short enough to fit as a question here. I agree with Violet Giraffe that it's highly likely "something else". (Removing unused bits of the code, for example the `bHasAlpha`, seems like a good start.) – Mats Petersson Jan 17 '16 at 12:23
  • Problems that disappear when you add arbitrary unrelated things are a strong indicator of undefined behaviour. (He said, memories of passing a uni assignment by "mistakenly" not removing debug printouts rushing through his mind.) – molbdnilo Jan 17 '16 at 12:24
  • What he said: compiler optimizations CANNOT cause problems, though they can TRIGGER those which already exist. Your code looks like it has been created under time pressure, you're jumping between scopes like there is no tomorrow, you're doing no nullpointer-checks AT ALL, you're actually using `auto` and you're concatenating logical expressions as if there was no alternative ... I think you need to rewrite your entire code, the above example is full of beginners' mistakes ... sorry, but it's like that. – specializt Jan 18 '16 at 04:58
  • Some tools you may want to try: valgrind (not for memory leaks, but it also reports memory access violations), compile with sanitizers (address sanitizer finds memory access violations, threading sanitizer finds race conditions and UB sanitizer finds UB). I guess it's either a memory access violation or an uninitialized variable. – Jens Jan 18 '16 at 07:52

1 Answer

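Most often, when I've hit something that works in debug but not in release, it's an uninitialized variable. Debug builds often fill uninitialized memory with a fixed pattern (MSVC, for example, fills stack variables with 0xCC when runtime checks are enabled), so the behavior is at least consistent, but you lose that when optimizations are turned on.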

This could explain why modifying the program alters the behavior: by adjusting the memory layout of your application, you end up reading some different chunk of uninitialized memory that happens to mask the issue.
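As a minimal sketch of how to rule this out for handle-like members: in the question, `m_bufVertWeights.bufWeights` and `m_bufVertWeights.bufBoneIDs` are only assigned inside their respective `if` branches. The struct below is a hypothetical stand-in for that member's type (its real definition is not shown), and `HasBuffer` is a made-up helper. Default member initializers guarantee a known value in both debug and release builds.

```cpp
#include <cstdint>

// Hypothetical stand-in for the type of m_bufVertWeights; the real
// definition is not part of the question.
struct VertexWeightBuffers
{
    // Default member initializers: glGenBuffers never returns 0, so 0 can
    // safely mean "no buffer has been generated yet".
    std::uint32_t bufWeights = 0;
    std::uint32_t bufBoneIDs = 0;
};

// Hypothetical helper: true only if a buffer name was actually assigned.
inline bool HasBuffer(std::uint32_t buffer)
{
    return buffer != 0;
}
```

Code that later binds or deletes these buffers can then check `HasBuffer(...)` first instead of relying on whatever happened to be in memory.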

If you're keeping up good memory management hygiene, you might find the issue quickly using a tool like valgrind. Long term, you may want to look into leveraging a memory management framework that detects memory abuse automagically (see Ogre MemoryTracker, TCMalloc, Clang MemorySanitizer).
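Independently of external tools, a cheap check is to validate the array sizes before they are uploaded, since an out-of-bounds read in a buffer upload is exactly the kind of error that shows up only in some builds. The following is a rough sketch, not the asker's code: `MeshSizes` and `FindSizeMismatches` are hypothetical names, and the rule "4 weights and 4 bone IDs per vertex" is an assumption inferred from the 4-component attribute setup in the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of the container sizes used by Update().
struct MeshSizes
{
    std::size_t vertices = 0;
    std::size_t normals = 0;
    std::size_t uvs = 0;
    std::size_t alphas = 0;        // 0 if the mesh has no alphas
    std::size_t vertexWeights = 0; // 0 if the mesh is not animated
    std::size_t boneIDs = 0;       // 0 if the mesh is not animated
};

// Returns one message per inconsistency; an empty result means the sizes
// are consistent with each other under the assumptions stated above.
std::vector<std::string> FindSizeMismatches(const MeshSizes &s)
{
    std::vector<std::string> problems;
    if(s.vertices == 0)
        problems.push_back("no vertices: &m_vertices[0] would be undefined behavior");
    if(s.normals != s.vertices)
        problems.push_back("normal count differs from vertex count");
    if(s.uvs != s.vertices)
        problems.push_back("uv count differs from vertex count");
    if(s.alphas != 0 && s.alphas != s.vertices)
        problems.push_back("alpha count differs from vertex count");
    // Assumption: the attribute is declared with 4 components per vertex.
    if(s.vertexWeights != 0 && s.vertexWeights != 4 * s.vertices)
        problems.push_back("expected 4 weights per vertex");
    if(s.boneIDs != 0 && s.boneIDs != 4 * s.vertices)
        problems.push_back("expected 4 bone IDs per vertex");
    return problems;
}
```

Calling this at the top of `Update()` with the sizes of `m_vertices`, `m_normals`, and so on, and logging any returned messages, would show whether the affected mesh violates one of these assumptions.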

moof2k