
For applicable data types, a good radix sort can beat the pants off comparison sorts, but std::sort is usually implemented as introsort. Is there a reason not to use radix sort to implement std::sort? Radix sort alone can't implement std::sort, because std::sort requires only that types be comparable. But for types where comparison and radix-based sorting produce the same answer (e.g. int), this seems like low-hanging fruit that's been left unplucked.

Would it be legal to implement std::sort with overloads that use radix sort when appropriate? Is there something about the requirements of std::sort that fundamentally prevents this?

Edit: I should have been a tad more clear. I'm asking if it would be legal for an implementation of the standard library to do this. I'm not asking about a user of a standard library implementation placing anything in the std namespace. I know that doing so is illegal except in specific cases.

Praxeolitic
  • Although you can overload std::sort, do you really want to mess with the standard namespace? Better to implement a sort function that calls std::sort when necessary instead. – David Haim Oct 06 '15 at 09:53
  • @DavidHaim Huh? If I was implementing the C++ standard library I would be *required* to mess with the standard namespace. – Praxeolitic Oct 06 '15 at 09:54
  • If you talk standard library implementations, then anything goes. The 'as-if' rule applies. – zch Oct 06 '15 at 09:55
  • @zch Hence the question! Can radix sort satisfy the 'as-if' rule for `std::sort`? – Praxeolitic Oct 06 '15 at 09:56
  • I think you can do it. `std::stable_sort` would be a different matter though, since it introduces a concept of equality. – Bathsheba Oct 06 '15 at 09:56
  • The only thing I am not sure about is whether you can allocate extra memory, but you surely can do this if memory is available. – zch Oct 06 '15 at 10:12
  • I’m not sure about the rationale for overloading `std::sort` here (and I don’t think it’s legal, either). Why not just have your own function `radix_sort`, or even `sort`, as long as it’s in a namespace different from `std`? – Konrad Rudolph Oct 06 '15 at 10:32
  • @KonradRudolph I'm wondering if the reason standard library implementations don't use radix sort is because it would be illegal to do so. – Praxeolitic Oct 06 '15 at 10:37
  • @Praxeolitic No. The reason that it's not used is that it's not a good general-purpose sorting algorithm. It may be good if you have specific information about the input. However, that information goes beyond merely the *data type* (at least as encoded in the C++ type system). I don't think even `int`s are amenable to more efficient sorting using radix sort than using standard quicksort. – Konrad Rudolph Oct 06 '15 at 11:38
  • Boost.Sort claims that algorithms similar to radix sort can have very good performance... – Marc Glisse Oct 06 '15 at 12:06

1 Answer


The comments quote an "as-if" rule. That's actually not necessary. std::sort isn't specified "as if introsort is used". The specification for std::sort is brief and only requires an effect (sorted) and complexity (O(N log N)) for the number of comparisons. Radix sort meets both.

25.4.1.1 sort

template<class RandomAccessIterator> void sort(RandomAccessIterator first, RandomAccessIterator last);

template<class RandomAccessIterator, class Compare> void sort(RandomAccessIterator first, RandomAccessIterator last, Compare comp);

1 Effects: Sorts the elements in the range [first,last).

2 Requires: RandomAccessIterator shall satisfy the requirements of ValueSwappable (17.6.3.2). The type of *first shall satisfy the requirements of MoveConstructible (Table 20) and of MoveAssignable (Table 22).

3 Complexity: O(N log(N )) (where N == last - first) comparisons.
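To make that concrete, here is a minimal sketch of how an implementation might dispatch to a radix sort while still meeting the wording above. Everything in it is an assumption made for illustration: the `sketch` namespace, `radix_sort_unsigned` and `sort_impl` are hypothetical names, not taken from any real standard library, and the sketch assumes 8-bit bytes. It only takes the radix path for built-in unsigned integer types sorted with the default `operator<`, where both orderings are guaranteed to agree.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

namespace sketch {

// Hypothetical helper: LSD radix sort, one byte per pass (assumes 8-bit bytes).
// Performs zero element comparisons, so the comparison bound is trivially met.
template <class It>
void radix_sort_unsigned(It first, It last) {
    using T = typename std::iterator_traits<It>::value_type;
    std::vector<T> src(first, last), dst(src.size());
    for (std::size_t shift = 0; shift < sizeof(T) * 8; shift += 8) {
        std::size_t count[257] = {};
        // Histogram of the current byte, offset by one slot.
        for (T v : src) ++count[static_cast<std::size_t>((v >> shift) & 0xFF) + 1];
        // Prefix sums: count[d] becomes the first output index for digit d.
        for (int d = 0; d < 256; ++d) count[d + 1] += count[d];
        // Stable scatter into the other buffer.
        for (T v : src) dst[count[static_cast<std::size_t>((v >> shift) & 0xFF)]++] = v;
        src.swap(dst);
    }
    std::copy(src.begin(), src.end(), first);
}

// Tag dispatch: radix path for unsigned integers, std::sort for everything else.
template <class It>
void sort_impl(It first, It last, std::true_type) { radix_sort_unsigned(first, last); }

template <class It>
void sort_impl(It first, It last, std::false_type) { std::sort(first, last); }

template <class It>
void sort(It first, It last) {
    using T = typename std::iterator_traits<It>::value_type;
    using use_radix = std::integral_constant<
        bool, std::is_unsigned<T>::value && !std::is_same<T, bool>::value>;
    sort_impl(first, last, use_radix{});
}

}  // namespace sketch
```

Only the overload without a comparator can take this shortcut, because a user-supplied `comp` can define any ordering. Signed integers and floating-point types would need an extra key transformation, which this sketch leaves out by falling back to `std::sort`.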

In practice, comparing two register-width values with a<b is much faster than extracting digits and comparing a sequence of those digits, even if we use bits or hexadecimal digits. Sure, it's only a constant-factor difference, but extracting and comparing 32 individual bits is going to be about 100x slower than a direct comparison. That beats most theoretical concerns, especially since log N can't really be 100 on today's computers.
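For reference, the digit extraction being discussed can be sketched as below. The names `radix_passes` and `digit_at` are hypothetical helpers written for this illustration, and the key is assumed to be a 32-bit unsigned value. The sketch only shows how the digit width determines the number of passes over the data; it makes no claim about actual timings.

```cpp
#include <cstdint>

namespace sketch {

// Number of passes needed to cover key_bits using digits of digit_bits each.
constexpr unsigned radix_passes(unsigned key_bits, unsigned digit_bits) {
    return (key_bits + digit_bits - 1) / digit_bits;
}

// Extract the digit examined on the given pass (pass 0 = least significant).
// Assumes digit_bits < 32 and pass * digit_bits < 32.
constexpr std::uint32_t digit_at(std::uint32_t key, unsigned pass, unsigned digit_bits) {
    return (key >> (pass * digit_bits)) & ((std::uint32_t{1} << digit_bits) - 1u);
}

}  // namespace sketch
```

With these definitions a 32-bit key takes 32 passes at 1 bit per digit, 8 passes with hexadecimal digits, and 4 passes with byte-sized digits. Each pass costs a shift and a mask per element, plus the histogram and scatter work of the sort itself.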

MSalters
  • zch brought up a good point in the comments -- the radix sort may need to dynamically allocate memory. Does that matter? – Praxeolitic Oct 06 '15 at 12:06
  • @Praxeolitic: No, that's not mentioned by the Standard. Practically speaking, most sorting algorithms do need memory, but often that's taken from the stack. – MSalters Oct 06 '15 at 12:52
  • (Tangential nitpick - performance oriented radix sorts compare more than 1 bit at a time. Naturally, 1 byte is a pretty good choice.) – Praxeolitic Oct 06 '15 at 13:18
  • @Praxeolitic: I had assumed 4 bits ("hexadecimal digits") but 8 could also work. – MSalters Oct 06 '15 at 13:25
  • In some environments allocating big arrays on the stack can be impossible, but an 8-bit radix should work with most. Bigger radixes could require heap allocation, and then the case of failed allocation would need to be handled. Some standard library algorithms explicitly allow performance degradation in case of low memory. – zch Oct 06 '15 at 15:05
  • Looks like the wording of the standard is surprisingly terse on the requirements of `std::sort`. Would you mind if I threw in the quote as an edit? – Praxeolitic Oct 07 '15 at 03:40
  • @Praxeolitic: Not at all. – MSalters Oct 07 '15 at 08:00
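
The allocation concern raised in the comments could be handled roughly as follows. This is a sketch under stated assumptions, not how any particular standard library does it: `sort_u32_with_fallback` is a hypothetical function, it handles only `std::uint32_t` keys, and it assumes 8-bit bytes. It requests the scratch buffer with a non-throwing `new` and degrades to `std::sort` if the request fails.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical function. Returns true if the radix path ran,
// false if the scratch allocation failed and std::sort was used instead.
inline bool sort_u32_with_fallback(std::vector<std::uint32_t>& data) {
    const std::size_t n = data.size();
    std::unique_ptr<std::uint32_t[]> scratch(new (std::nothrow) std::uint32_t[n]);
    if (!scratch) {
        std::sort(data.begin(), data.end());  // low-memory fallback
        return false;
    }
    std::uint32_t* src = data.data();
    std::uint32_t* dst = scratch.get();
    for (unsigned shift = 0; shift < 32; shift += 8) {
        // 257 counters (about 2 KiB) live on the stack for an 8-bit radix.
        std::size_t count[257] = {};
        for (std::size_t i = 0; i < n; ++i) ++count[((src[i] >> shift) & 0xFFu) + 1];
        for (std::size_t d = 0; d < 256; ++d) count[d + 1] += count[d];
        for (std::size_t i = 0; i < n; ++i) dst[count[(src[i] >> shift) & 0xFFu]++] = src[i];
        std::swap(src, dst);
    }
    // Four passes means an even number of swaps, so the result is back in data.
    return true;
}

}  // namespace sketch
```

The counter array is small enough for the stack with an 8-bit radix, which matches the comment above; the element-sized scratch buffer is the part that needs the heap and therefore the fallback.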