1

The exact versions of g++ and valgrind:

g++-5 (Ubuntu 5.2.1-23ubuntu1~12.04) 5.2.1 20151031
valgrind-3.7.0

I didn't dive into which flag exactly does this (finline-small-functions/findirect-inlining/finline-functions/finline-functions-called-once/fearly-inlining) because I am testing this remotely on travis and I am already annoyed from waiting so I just used -fno-inline (I don't have a working linux on my machine).

Actually I had no idea that this was due to inlining and wanted valgrind to report the real function that caused the error so disabled inlining... and voila!

Note that this happens only with g++5 - I've tested g++ 4.4/4.5/4.7/4.8/4.9 (4.6 not tested) and also clang++ 3.4/3.5/3.6/3.7/3.8 (and also all these compilers under OSX too)

Here is the error:

==3063== 1 errors in context 1 of 1:
==3063== Invalid read of size 4
==3063==    at 0x40092E: regTest(char const*, char const*) (a.cpp:17)

This is my code:

// required includes
#include <cstdio>  // printf and friends
#include <cstdlib> // malloc, free, qsort
#include <cstring> // strlen, strcpy, strtok
#include <new>     // placement new

struct String
{
    char* m_str;

    void copy(const String& other) {
        if(m_str)
            free(m_str);
        m_str = 0;

        if(other.m_str) {
            m_str = static_cast<char*>(malloc(strlen(other.m_str) + 1));
            strcpy(m_str, other.m_str);
        }
    }

    String(const char* in = 0)
            : m_str(0) {
        if(in == 0)
            return;

        m_str = static_cast<char*>(malloc(strlen(in) + 1));
        strcpy(m_str, in);
    }

    String(const String& other)
            : m_str(0) {
        copy(other);
    }

    ~String() {
        if(m_str)
            free(m_str);
    }

    String& operator=(const String& other) {
        if(this != &other)
            copy(other);
        return *this;
    }
};

template <class T>
class Vector
{
    unsigned m_size;
    unsigned m_capacity;
    T*       m_buffer;

public:
    Vector()
            : m_size(0)
            , m_capacity(0)
            , m_buffer(0) {}

    Vector(const Vector& other)
            : m_size(other.m_size)
            , m_capacity(other.m_capacity)
            , m_buffer(static_cast<T*>(malloc(sizeof(T) * m_capacity))) {
        for(unsigned i = 0; i < m_size; ++i)
            new(m_buffer + i) T(other.m_buffer[i]);
    }

    ~Vector() {
        for(unsigned i = 0; i < m_size; ++i)
            (*(m_buffer + i)).~T();
        free(m_buffer);
    }

    Vector& operator=(const Vector& other) {
        if(this != &other) {
            for(size_t i = 0; i < m_size; ++i)
                (*(m_buffer + i)).~T();
            free(m_buffer);

            m_size     = other.m_size;
            m_capacity = other.m_capacity;

            m_buffer = static_cast<T*>(malloc(sizeof(T) * m_capacity));
            for(unsigned i = 0; i < m_size; ++i)
                new(m_buffer + i) T(other.m_buffer[i]);
        }
        return *this;
    }

    unsigned size() const { return m_size; }

    void push_back(const T& item) {
        if(m_size < m_capacity) {
            new(m_buffer + m_size++) T(item);
        } else {
            if(m_capacity == 0)
                m_capacity = 5; // initial capacity
            else
                m_capacity *= 2; // capacity growth factor
            T* temp = static_cast<T*>(malloc(sizeof(T) * m_capacity));
            for(unsigned i = 0; i < m_size; ++i) {
                new(temp + i) T(m_buffer[i]);
                (*(m_buffer + i)).~T();
            }
            new(temp + m_size++) T(item);
            free(m_buffer);
            m_buffer = temp;
        }
    }
};

struct FunctionData
{
    String m_suite;
    String m_name;

    const char* m_file;

    FunctionData(const char* suite, const char* name, const char* file)
            : m_suite(suite)
            , m_name(name)
            , m_file(file) {}

    FunctionData(const FunctionData& other)
            : m_suite(other.m_suite)
            , m_name(other.m_name)
            , m_file(other.m_file) {}
};

const char*& getCurrentTestSuite() {
    static const char* data = 0;
    return data;
}

int setTestSuiteName(const char* name) {
    getCurrentTestSuite() = name;
    return 0;
}

int regTest(const char* file, const char* name) {
    Vector<FunctionData> temp;

    temp.push_back(FunctionData(getCurrentTestSuite(), name, file));

    // main() is empty and we dont want this optimized away
    printf("hello! %d\n", temp.size());

    return 0;
}

__attribute__((unused)) static int a1 = setTestSuiteName("current testsuite");
__attribute__((unused)) static int a2 = regTest("a.cpp", "zzz");

int main(int, char**) { return 0; }

This is how I run it:

g++-5 a.cpp -Wall -Wextra -pedantic -std=c++98 -g -O3 -fno-inline
valgrind --leak-check=full --track-origins=yes -v ./a.out
g++-5 a.cpp -Wall -Wextra -pedantic -std=c++98 -g -O3
valgrind --leak-check=full --track-origins=yes -v ./a.out

The second run leads to the valgrind error.

Removing any of the members of FunctionData stops reproducing the problem. Cutting Vector out of the picture also leads to no errors.

here is the repository and here is the travis log.

I've wasted more than a few hours on minimizing this so I'm done with minifying the reproduction code.

So who is wrong - g++5 or valgrind? or me? what should I do next? why might this be happening?

EDIT:

lol! just noticed that (a.cpp:17) on the error so the problematic line is m_str = static_cast<char*>(malloc(strlen(other.m_str) + 1)); - but WHY?!?!?! Even if everything gets inlined in regTest() - I don't think there is a real error in this simple code

EDIT 2:

Just tried with Ubuntu 14.04 locally with g++ (Ubuntu 5.3.0-3ubuntu1~14.04) 5.3.0 20151204 and valgrind-3.10.1 and the case is the same - when compiling with inlining the bug occurs.

Also tried locally with g++-4.8 (Ubuntu 4.8.5-2ubuntu1~14.04.1) 4.8.5 and OMG! also buggy like g++-5! maybe a patch went into g++ 4.8.5 that wasn't in 4.8.x and 4.9.x which were used in travis

EDIT 3:

adding __attribute__((noinline)) to the constructor (any - normal and copy - works with both) of the String class resolved the issue. So is this a bug? what to do next?

EDIT 4:

I played a bit more and changed the code to this (removed the Vector class) and managed to trigger an error from valgrind when compiliing with

g++ a.cpp -O3 -fno-elide-constructors 

and no error when compiling with just

g++ a.cpp -O3

(both cases with inlining ON)

Something is wrong here with these optimizations. Sorry for the many edits and long post - I will shut up now.

EDIT 5:

a friend told me to add -ggdb when compiling and now the error from valgrind for the original code is this:

==2150== Invalid read of size 4
==2150==    at 0x40095E: copy (a.cpp:17)
==2150==    by 0x40095E: String (a.cpp:33)
==2150==    by 0x40095E: FunctionData (a.cpp:128)
==2150==    by 0x40095E: push_back (a.cpp:106)
==2150==    by 0x40095E: regTest(char const*, char const*) (a.cpp:144)
==2150==    by 0x400B2C: __libc_csu_init (in /home/onqtam/a.out)
==2150==    by 0x537CE54: (below main) (libc-start.c:246)
==2150==  Address 0x5a37c90 is 16 bytes inside a block of size 18 alloc'd
==2150==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2150==    by 0x4008DF: String (a.cpp:27)
==2150==    by 0x4008DF: FunctionData (a.cpp:123)
==2150==    by 0x4008DF: regTest(char const*, char const*) (a.cpp:144)
==2150==    by 0x400B2C: __libc_csu_init (in /home/onqtam/a.out)
==2150==    by 0x537CE54: (below main) (libc-start.c:246)
Community
  • 1
  • 1
onqtam
  • 4,356
  • 2
  • 28
  • 50
  • The output indicates an error on line 17, but line 17 of what you have posted has nothing to do with `regTest`. Please ensure that the code you are running is the exact same code you have posted, character for character, and the valgrind output is exactly as it occurred too – M.M Feb 27 '16 at 15:12
  • @M.M but everything is inlined in regTest! thats why it reports the error there. regTest() invokes some String copy constructors... Just look at the repository and the corresponding builds in travis. And when I disable inlining to see where exactly the error happens - it disappears! I have >>> quatruple <<< checked this - maybe its time to get a linux just because of this bug... (or will offer a bounty when 2 days are up). – onqtam Feb 27 '16 at 19:49
  • Valgrind 3.7.0 is quite old now. Newer releases have many improvements (a.o. they understand function inlining). So, you could try with valgrind 3.11.0. – phd Feb 27 '16 at 21:00
  • @phd Just tried with Ubuntu 14.04 locally with ```g++ (Ubuntu 5.3.0-3ubuntu1~14.04) 5.3.0 20151204``` and ```valgrind-3.10.1``` and the situation is the same! – onqtam Feb 27 '16 at 21:27
  • @onqtam compiler error messages always reflect the line of source they were on, regardless of inlining or whatever – M.M Feb 27 '16 at 22:45
  • @M.M but this is a valgrind error message - are you 100% certain how it's error reporting is done (and what is present as debugging information in the binary - and in what form when function symbols totally erased from the binary after optimization but the lines of code in them are still executed) and do you have any other explanation of the valgrind error? This is reproducible behavior and this was my best guess as to why it reported line 17 for the ```regTest``` symbol. – onqtam Feb 27 '16 at 23:03
  • @M.M see **EDIT 5** - what happens when I add ```-ggdb``` – onqtam Feb 28 '16 at 00:15
  • Does the error go away if you change `"current testsuite"` to `"current testsuite x"` ? – M.M Feb 28 '16 at 00:20
  • @M.M yes! it did go away (the inlining case - havent tried the ```-fno-elide-constructors```). so this is 20 bytes in memory with ```'/0'```? – onqtam Feb 28 '16 at 00:23
  • I would say that your compiler is optimizing `strcpy` to read in 4-byte chunks, and valgrind is noticing that this reads past the end of an allocation sometimes. You should be able to confirm this by checking the generated assembly code corresponding to the `strcpy` call. It's probably not a bug, if the target architecture guarantees that this read won't trap – M.M Feb 28 '16 at 00:25
  • @M.M well my assembly-fu is not that good. anyway I still consider this a gcc problem because valgrind runs should be clean for such code. – onqtam Feb 28 '16 at 00:27

1 Answers1

1

This is because gcc optimises strcpy to operate on 4-byte blocks, which is always safe because you can't allocate a block of memory that isn't a multiple of 4 bytes (on x86 and x64 at least). So from gcc's point of view the read is definitely safe, but from valgrind's point of view you're reading past the end of what you said you'll allocate. Usually valgrind can detect that you're doing memcpy/memmove/strcpy/etc. and knows to suppress the error, but when the call is inlined its detection fails and you get the erroneous error message.

You might want to wrap strcpy in a call to alert valgrind to the fact that the following memory access is safe, i.e. see http://valgrind.org/docs/manual/mc-manual.html#mc-manual.clientreqs if you really want to debug with aggressive inlining enabled.

Moshev
  • 556
  • 6
  • 14