
I am currently studying algorithms at college and I am curious about what a seasoned developer uses in their code when they need to sort something.

C++ uses IntroSort which has an average of Θ(n log(n)) and worst of Θ(n^2).
C# uses QuickSort which has an average of Θ(n log(n)) and worst of Θ(n^2).
Java uses MergeSort which has an average of Θ(n log(n)) and worst of Θ(n log(n)).
JavaScript seems to be doing Θ(n log(n)), and the algorithm depends on the browser.
And from a quick read, the majority of languages have a sorting method that has a time complexity of Θ(n log(n)).

Do programmers use the default sorting methods, or do they implement their own?
When do they use the default one, and when do they implement their own?
Is Θ(n log(n)) the best time a sorting algorithm can get?

There are a ton of sorting algorithms, as I am currently finding out at uni.

Jorje12

5 Answers


I am currently studying algorithms at college and I am curious about what a seasoned developer uses in their code when they need to sort something.

Different sorting algorithms have different applications. You choose the best algorithm for the problem you're facing. For example, if you have a list of items in memory then you can sort them in place with QuickSort; if you want to sort items that are streamed in (i.e. an online sort), then QuickSort wouldn't be appropriate.
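
For example (a minimal C++ sketch, not part of the original answer): a batch that is already in memory can simply be handed to the library sort, while streamed-in items can be kept ordered incrementally with a balanced-tree container such as std::multiset instead of re-sorting on every arrival.

    #include <algorithm>
    #include <set>
    #include <vector>

    void batch_vs_online() {
        // Batch case: everything is in memory, so sort it once, in place.
        std::vector<int> batch{4, 1, 3, 2};
        std::sort(batch.begin(), batch.end());

        // Online case: items arrive one at a time; a balanced tree keeps the
        // collection ordered after every insertion -- no full re-sort needed.
        std::multiset<int> online;
        for (int item : {4, 1, 3, 2}) online.insert(item);
    }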

C++ uses IntroSort which has an average of Θ(n log(n)) and worst of Θ(n^2).

I think you mean that C++'s standard library sort (std::sort) defaults to Introsort in most implementations (including the original SGI STL and GNU libstdc++), but I don't believe the C++ specification requires std::sort to use Introsort - it only imposes complexity requirements (O(n log n) comparisons since C++11), and it isn't even required to be stable (that's std::stable_sort). C++ is just a language and has no sorting algorithm built into the language itself; sorting is a library feature, not a language feature.
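
For illustration only (a hedged sketch, not from the original answer), the distinction in code: the standard specifies complexity guarantees for these calls, not the exact algorithm behind them.

    #include <algorithm>
    #include <vector>

    int main() {
        std::vector<int> v{5, 2, 8, 2, 1};

        // Usually Introsort under the hood, but the standard only guarantees
        // O(n log n) comparisons; the exact algorithm is an implementation detail.
        std::sort(v.begin(), v.end());

        // If equal elements must keep their relative order, ask for it explicitly.
        std::stable_sort(v.begin(), v.end());
    }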

C# uses QuickSort which has an average of Θ(n log(n)) and worst of Θ(n^2).

Again, C# (the language) does not have any built-in sorting functionality. It's the .NET BCL (Base Class Library) that exposes the methods which perform the sorting (such as Array.Sort, List<T>.Sort, Enumerable.OrderBy<T>, and so on). Unlike the C++ specification, the official .NET documentation does describe the algorithm used by List<T>.Sort (a Quicksort-based introspective sort), but other methods like Enumerable.OrderBy<T> leave the actual sorting algorithm to the backend provider (e.g. in LINQ-to-SQL and LINQ-to-Entities the sorting is performed by the remote database engine).

Do programmers use the default sorting methods, or do they implement their own?

Generally speaking, we use the defaults because they're good enough for 95%+ of all workloads and scenarios - or because the specification allows the toolchain and library we're using to pick the best algorithm for the runtime platform (e.g. C++'s sort could hypothetically make use of hardware sorting, which allows constrained values of n to be sorted in O(1) to O(n) worst-case time instead of QuickSort's O(n^2) worst case - which matters when processing unsanitized user input).

But also, generally speaking, programmers should never reimplement their own sorting algorithms. Modern languages with support for templates and generics mean that an algorithm can be written once in a general form, so we only need to provide the data to be sorted and either a comparator function or a sorting-key selector, which avoids the common human errors made when reimplementing a stock algorithm.
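
A minimal C++ sketch of that idea (the Person struct is hypothetical, purely for illustration): the library supplies the generic algorithm and we supply only the comparison.

    #include <algorithm>
    #include <string>
    #include <vector>

    struct Person {          // hypothetical record type, for illustration only
        std::string name;
        int age;
    };

    void sort_people(std::vector<Person>& people) {
        // Provide a comparator; the stock algorithm does the rest.
        std::sort(people.begin(), people.end(),
                  [](const Person& a, const Person& b) { return a.age < b.age; });

        // C++20 ranges equivalent using a "key selector" (projection):
        // std::ranges::sort(people, {}, &Person::age);
    }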

As for the possibility of programmers inventing their own novel sorting algorithms... with few exceptions that really doesn't happen. As with cryptography, if you find yourself "inventing" a new sorting algorithm I guarantee that not only are you not inventing a new algorithm, but that your algorithm will be flawed in some way or another. In short: don't - at least not until you've run your idea past your nearest computer science academic.

When do they use the default one, and when do they implement their own?

See above. You're also not considering the third option: using a non-default algorithm from a library rather than writing your own. As the other answers have said, the choice is based on the application, i.e. the problem you're trying to solve.

Is Θ(n log(n)) the best time a sorting algorithm can get?

You need to understand the difference between best-case, average-case, and worst-case time complexities. The big comparison table in the Wikipedia article shows the different runtime complexities: https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_sorts - for example, insertion sort has a best-case time complexity of O(n), which is much better than O(n log n) and directly contradicts your supposition. (For comparison-based sorts the worst case can't beat Ω(n log n), but non-comparison sorts such as counting sort or radix sort can do better when their assumptions hold.)
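
For instance, a minimal insertion-sort sketch (a plain textbook version, not tied to any particular library): on already-sorted input the inner loop never executes, so it runs in O(n) even though its worst case is O(n^2).

    #include <vector>

    // Textbook insertion sort: O(n^2) worst case, but O(n) when the input is
    // already sorted, because the while-loop body never runs in that case.
    void insertion_sort(std::vector<int>& a) {
        for (std::size_t i = 1; i < a.size(); ++i) {
            int key = a[i];
            std::size_t j = i;
            while (j > 0 && a[j - 1] > key) {
                a[j] = a[j - 1];
                --j;
            }
            a[j] = key;
        }
    }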

There are a ton of sorting algorithms, as I am currently finding out at uni.

I think you would be better served by bringing your questions to your class TA/prof/reader as they know the course material you're using and know the context in which you're asking.

Dai

In practice, a sort is chosen based on what is being sorted and where it is being sorted:

  • whether the data needs to be sorted at all (can all or a subset of the data be inserted in order?)
  • how sorted the data is already (does it come in sorted chunks?)
  • whether the data needs to be sorted now or how unsorted it can be before it should be sorted (when? cache compaction during off-peak hours?)
  • time complexity
  • space requirements for the sort

Distributed environments are also extremely relevant in modern software, and they can cause states where not all of the nodes are available.

This greatly changes how, or even whether, things are fully sorted (for example, data may be sliced up across different nodes, partially sorted, and then referenced by some sort of Cuckoo Hash).

ti7

The standard list sort in Haskell uses merge sort.

  1. Divide the list into "runs"; sections where the input is already in ascending order, or in descending order. The minimum run length is 2, and for random data the average is 3 (I think). Runs in descending order are just reversed, so now you have a list of sorted lists.

  2. Merge each pair of lists. Repeat until you have only one list left.

This is O(n log n) in the worst case and O(n) if the input is already sorted. It is also stable.
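
A rough sketch of the same "split into runs, then merge pairs" idea, written in C++ for illustration rather than Haskell (it only detects ascending runs and is not GHC's actual Data.List.sort):

    #include <algorithm>
    #include <iterator>
    #include <vector>

    // Natural merge sort: collect already-sorted runs, then merge runs pairwise
    // until a single sorted run remains.
    std::vector<int> natural_merge_sort(std::vector<int> xs) {
        if (xs.size() < 2) return xs;

        // 1. Split the input into maximal ascending runs.
        std::vector<std::vector<int>> runs;
        std::vector<int> run{xs[0]};
        for (std::size_t i = 1; i < xs.size(); ++i) {
            if (xs[i] >= run.back()) {
                run.push_back(xs[i]);
            } else {
                runs.push_back(std::move(run));
                run = {xs[i]};
            }
        }
        runs.push_back(std::move(run));

        // 2. Merge pairs of runs; repeat until only one run is left.
        while (runs.size() > 1) {
            std::vector<std::vector<int>> merged;
            for (std::size_t i = 0; i + 1 < runs.size(); i += 2) {
                std::vector<int> out;
                std::merge(runs[i].begin(), runs[i].end(),
                           runs[i + 1].begin(), runs[i + 1].end(),
                           std::back_inserter(out));
                merged.push_back(std::move(out));
            }
            if (runs.size() % 2 == 1) merged.push_back(std::move(runs.back()));
            runs = std::move(merged);
        }
        return runs.front();
    }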

Paul Johnson

I think I qualify as a seasoned developer. If I just want to sort whatever, I will almost always call the library function. A lot of effort has been put into optimizing it.

Some of the situations in which I will write my own sort include:

  • When I need to do it incrementally. Insertion-sort each item as it comes in to maintain a sorted sequence, maybe, or use a heap as a priority queue.

  • When a counting sort or bucket sort will do. In this case it's easy to implement and has lower complexity (see the sketch after this list).

  • When the keys are integers and speed is very important, I sometimes implement a radix sort.

  • When the stuff I need to sort doesn't fit in memory (external sorting)

  • When I need to build a suffix array or otherwise take advantage of special relationships between the keys.

  • When comparisons are extremely expensive, I will sometimes implement a merge sort to put a good upper bound on how many I have to do.

  • In a real-time context that is memory constrained, I will sometimes write a heap sort to get in-place sorting with a good upper bound on worst-case execution time.

  • If I can produce the required ordering as a side-effect of something else that is going on (and it makes design sense), then I might take advantage of that instead of doing a separate sort.
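
As a concrete example of the counting-sort case above, a minimal sketch (assuming small non-negative integer keys bounded by max_value):

    #include <vector>

    // Counting sort: O(n + k) where k = max_value, and no comparisons at all.
    std::vector<int> counting_sort(const std::vector<int>& input, int max_value) {
        std::vector<int> counts(max_value + 1, 0);
        for (int v : input) ++counts[v];            // tally each value: O(n)

        std::vector<int> output;
        output.reserve(input.size());
        for (int v = 0; v <= max_value; ++v)        // emit values in order: O(k)
            output.insert(output.end(), counts[v], v);
        return output;
    }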

Matt Timmermans

In the overwhelming majority of cases, only the default sorts of a language are used.

When that is not the case, it is mostly because the data has some special properties that can be used to reduce the sort time, and even then it is usually only the ordering lambda (the comparator) that is changed.

In some cases, where you know there are only a few distinct values, simple O(N) sorting algorithms can be used.

In decreasing order of generality and increasing order of simplicity (a radix-sort sketch follows the list):

  • radix sort
  • bucket sorts
  • counting sort
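
As promised above, a minimal least-significant-digit radix sort sketch (assuming fixed-width 32-bit unsigned keys; illustrative only):

    #include <array>
    #include <cstdint>
    #include <vector>

    // LSD radix sort: four stable counting passes, one byte (base 256) per pass.
    // Total work is O(n) for fixed-width keys.
    void radix_sort(std::vector<std::uint32_t>& a) {
        std::vector<std::uint32_t> buffer(a.size());
        for (int shift = 0; shift < 32; shift += 8) {
            std::array<std::size_t, 257> counts{};  // counts[b + 1] = occurrences of byte b
            for (std::uint32_t v : a) ++counts[((v >> shift) & 0xFF) + 1];
            for (int b = 0; b < 256; ++b)           // prefix sums -> start offsets
                counts[b + 1] += counts[b];
            for (std::uint32_t v : a)               // stable scatter into the buffer
                buffer[counts[(v >> shift) & 0xFF]++] = v;
            a.swap(buffer);
        }
    }
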
Surt