1

OK, I typically default to assuming that the error is in my code, but what I am seeing is making literally no sense to me.


I have a std::vector<DocumentWidget *>, which I want to be sorted by some relatively simple algorithm. Here's what I am seeing.

The code looks like this:

std::vector<DocumentWidget *> documents = DocumentWidget::allDocuments();

// NOTE(eteran): on my system, I have observed a **consistent** crash
// when documents.size() == 17 while using std::sort.
std::sort(documents.begin(), documents.end(), [](const DocumentWidget *a, const DocumentWidget *b) {

    int rc = (a->filenameSet_ == b->filenameSet_) ? 0 : a->filenameSet_ && !b->filenameSet_ ? 1 : -1;
    if (rc != 0) {
        return rc < 0;
    }
    if(a->filename_ < b->filename_) {
        return true;
    }
    return a->path_ < b->path_;
});

Seems simple enough, but it crashes specifically when I have a 17th item in the list! The sorting predicate clearly is not modifying the vector in any way, so I can't see that being an issue. Address sanitizer and valgrind show no errors up until this point.

qSort does not crash, seems to work fine. No other threads that are running are touching this data, and it happens reliably no matter how slowly I step... so it's not a race condition.

When I look in the debugger, the a parameter seems to be where the "one past the end" iterator is pointing. But that shouldn't be happening if std::sort is behaving Crash.

Note There are 17 items in the std::vector, I forced the debugger to display an 18th item to illustrate where a seems to be coming from.

I can't imagine that std::sort is bugged, but I am really struggling to find another explanation. Am I missing some obvious bug here?!

Evan Teran
  • 87,561
  • 32
  • 179
  • 238
  • 1
    `if(a->filename_ < b->filename_) { return true; }` should likely come with the converse. You can end up with both `a < b` and `b < a`. – Igor Tandetnik Jan 26 '18 at 03:10
  • 1
    Try `return std::tie(a->filenameSet_, a->filename_, a->path_) < std::tie(b->filenameSet_, b->filename_, b->path_);` – Travis Gockel Jan 26 '18 at 03:19

2 Answers2

16
if(a->filename_ < b->filename_) {
    return true;
}
return a->path_ < b->path_;

This is known as "not a valid strict weak ordering":

{ "a", "d" } < { "b", "c"}; because "a" < "b"
{ "b", "c" } < { "a", "d"}; because "c" < "d"

The fix is simple: don't reinvent the wheel:

return std::tie(a->filename_, a->path_) < std::tie(b->filename_, b->path_);
T.C.
  • 133,968
  • 17
  • 288
  • 421
  • The logic of `filenameSet_` can also be tied (assuming it is a `bool` and not a pointer of sorts). – Travis Gockel Jan 26 '18 at 03:21
  • @TravisGockel Yes. I sort of dislike using `<` on `bool`s, but then the convoluted comparison code in the OP isn't that much better. – T.C. Jan 26 '18 at 03:22
  • Definitely a lose-lose scenario. – Travis Gockel Jan 26 '18 at 03:24
  • @T.C. I agreed that the code is a bit convoluted. This is part of a port of a very old codebase, and I wanted to start with using the same exact behavior the original code had. `std::tie` seems to do the job perfectly. Thanks! – Evan Teran Jan 26 '18 at 03:24
1

@T.C.'s answer is a good one since the question is tagged C++11, but in case future readers are after a C++03 solution, the 'canonical' way to write such a comparison operator (for members m1, m2, .... mn) is

if (a->m1 != b-m1)
    return a->m1 < b->m1;
else if (a->m2 != b->m2)
    return a0>m2 < b->m2;
else ...
else if (a->mn != b->mn)
    return a->mn < b->mn;
else
    return false;

(see here for a discussion, with the question having the above style but avoiding operator!=).

One further thing is that you are using pointers and your code is not bomb-proof. If a or b is NULL you'll get a crash. If you can't avoid pointers and you want to be paranoid then you need to add checks like

if (!a && !b)
    return false;
else if (!a)
    return true;
else if (!b)
    return false;
Paul Floyd
  • 5,530
  • 5
  • 29
  • 43