3

I am trying to sort a list (std::vector) of 3D integer vectors (IntVec). Somehow, std::sort causes a Segmentation Fault in the operator< of IntVec. Here is my code:

#include <iostream>
#include <algorithm>
#include <vector>
#include <fstream>

struct IntVec
{
public:
    long x;
    long y;
    long z; // Using ints does not cause the Segmentation Fault ?!

    friend bool operator<(const IntVec &lhs, const IntVec &rhs)
    {
        return (lhs.z < rhs.z) ||  // Segmentation Fault happens here
                    ((lhs.z == rhs.z) && (lhs.y < rhs.y))
                        || ((lhs.y == rhs.y) && (lhs.x < rhs.x));
    }
};

int main(void)
{
    std::vector<IntVec> vec;

    const int N = 2178;
    std::ifstream s("res.txt");
    for (int i = 0; i < N; i++)
    {
        IntVec t;
        s >> t.x;
        s >> t.y;
        s >> t.z;
        vec.push_back(t);
    }

    // Using vec.begin() and vec.end() does not change anything
    std::sort(vec.data(), vec.data() + vec.size());
}

I can provide you the dataset; however, I wanted to see at first if I have some big conceptual mistake in my code or some error I do not see. I found that the problem is specific to that dataset. If I leave out one entry, the Segfault does not occur. I think that is quite weird, since such a mistake should be more obvious and related to memory management. Also notice that using integers for x,y and z does not cause any problems.

Here is a godbolt version of the code.

Here is a related SO question. However, I think that my code does not have the same flaw that caused this error. I think my ordering relation is "strictly <".

HerpDerpington
  • 3,751
  • 4
  • 27
  • 43
  • 1
    Have you tried running it with ASAN and UBSAN? – erenon May 26 '20 at 12:50
  • 1
    You really should be initializing your `IntVec` member variables by default. – PaulMcKenzie May 26 '20 at 12:51
  • 1
    You test breaks "the strict weak ordering" requirement, thus std::sort has undefined Behaviour – Richard Critten May 26 '20 at 12:51
  • 2
    [Pro Tip]: When comparing multiple meber values, use `std::tie` to do so for you. That lets you have `return std::tie(lhs.z, lhs.y, lhs.x) < std::tie(rhs.z, rhs.y, rhs.x);` which is cleaner and easier to understand. – NathanOliver May 26 '20 at 12:53
  • 2
    *found that the problem is specific to that dataset. If I leave out one entry, the Segfault does not occur.* -- I am not sure what compiler you're using, but Visual C++ debug runtime checks for the strict-weak-order error and issues an `assert()` if strict-weak-order is violated, and not just simply crash. Maybe there is such a compiler switch that allows this check for your compiler toolset. – PaulMcKenzie May 26 '20 at 13:01
  • @PaulMcKenzie I was using `g++-9` – HerpDerpington May 26 '20 at 13:02
  • Does this answer your question? [sort function C++ segmentation fault](https://stackoverflow.com/questions/1541817/sort-function-c-segmentation-fault) – Audrius Meškauskas Sep 14 '21 at 08:03

1 Answers1

13

Your operator's logic is broken (does not satisfy strict weak ordering requirement). The final clause needs a lhs.z == rhs.z too. Otherwise, lhs.z can be > rhs.z but you still get a positive result, leading to an inconsistent ordering.

Standard library algorithms put the onus on you to get this right, and breaking the resulting assumptions can easily lead to chaos (read: undefined behaviour) such as segmentation faults.

Imagine a comment inside the implementation, saying something like "at this point, we know that a is less than b, so we do not need to perform any range/bounds check on b". When a is unexpectedly larger than b, that lack of bounds checking may cause bad memory accesses. The outcome can, however, be far more subtle and cause weird bugs down the line, so it's important to get right.

You might wish to consider using a shorter, less error-prone method of implementing this ordering:

return std::tie(lhs.z, lhs.y, lhs.x) < std::tie(rhs.z, rhs.y, rhs.x);

Using operator< on a tuple (which is what std::tie gives you) automatically (and correctly!) performs the lexicographic breakdown for you.

There's actually a nice example of this on the cppreference page for std::tie, showing that it's a common thing to do.

Employed Russian
  • 199,314
  • 34
  • 295
  • 362
Asteroids With Wings
  • 17,071
  • 2
  • 21
  • 35