3

I have a std::vector as one of the inputs to an API I am exposing. The caller may pass a huge vector, but I know it was formed by concatenating several sorted vectors, so it consists of a number of sorted runs.

I need to sort this vector and would like to know which sorting algorithm is best suited. I would prefer an in-place algorithm, such as merge sort or quicksort, because I don't want to use more memory (the vector is already huge).

Alternatively, would it be better to change the API to accept N sorted vectors and do the N-way merge myself? I don't want to do this unless the saving is really large. If I do an N-way merge, I would also want it to be in-place if possible.

Ideally, I would prefer to use a ready-made sort algorithm on the big vector, as I feel that would be simpler.

Daniel Daranas
AMM

3 Answers

2

Take a look at std::inplace_merge. You can use the merge sort idea: merge each pair of adjacent sorted runs, then merge the resulting pairs, and so on until only one run remains.
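A minimal sketch of that pairwise scheme, assuming the run boundaries are already known. The helper name `merge_sorted_runs` and the `bounds` layout (start of every run, followed by `v.end()`) are hypothetical choices for illustration, not part of any library:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Iter = std::vector<int>::iterator;

// Hypothetical helper. `bounds` holds the start of each sorted run in `v`,
// followed by v.end() as a final sentinel (k runs -> k + 1 entries).
void merge_sorted_runs(std::vector<int>& v, std::vector<Iter> bounds) {
    // Precondition of this sketch: the last boundary is the end of the vector.
    assert(bounds.empty() || bounds.back() == v.end());

    // Each pass merges adjacent pairs of runs, roughly halving the run count.
    while (bounds.size() > 2) {
        std::vector<Iter> next;
        std::size_t i = 0;
        for (; i + 2 < bounds.size(); i += 2) {
            std::inplace_merge(bounds[i], bounds[i + 1], bounds[i + 2]);
            next.push_back(bounds[i]);
        }
        // An odd run left over at the end is carried into the next pass.
        if (i + 1 < bounds.size()) {
            next.push_back(bounds[i]);
        }
        next.push_back(bounds.back());  // keep the sentinel
        bounds = next;
    }
}
```

Note that std::inplace_merge is allowed to allocate a temporary buffer and only falls back to a slower, truly in-place merge when that allocation fails, so it may not meet a strict no-extra-memory requirement.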

Artem Sobolev
1

You can scan the vector to find the concatenation points of the smaller vectors. Then, using iterators to those points, you can merge the runs one by one.

To find the concatenation points, look from the beginning for the first element that violates the sort order. Then repeat from that position to find the next one, and so on.
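The scan-and-merge idea can be sketched with std::is_sorted_until, which returns an iterator to the first element that breaks the ordering. The function name `sort_concatenated` is hypothetical, and the element type is assumed to be `int` for brevity:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: sorts a vector that is a concatenation of sorted runs
// by repeatedly merging the next run into the already-sorted prefix.
void sort_concatenated(std::vector<int>& v) {
    // End of the first sorted run, i.e. the first concatenation point.
    auto sorted_end = std::is_sorted_until(v.begin(), v.end());

    while (sorted_end != v.end()) {
        // The next run ends at the next element that violates the order.
        auto next_end = std::is_sorted_until(sorted_end, v.end());
        // Merge [begin, sorted_end) with [sorted_end, next_end).
        std::inplace_merge(v.begin(), sorted_end, next_end);
        sorted_end = next_end;
    }

    // Debug-only sanity check for this sketch.
    assert(std::is_sorted(v.begin(), v.end()));
}
```

Merging one run at a time re-touches the growing prefix on every step, so with many short runs a pairwise merge order is likely to be faster. If two neighbouring source vectors happen to join in order, the scan treats them as a single run, which is harmless.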

0

Timsort looks to be just what you need: it is an adaptive sort that looks for presorted runs in the data and merges them as it goes. It has worst-case O(n log n) performance, and I expect it will do much better than that if the runs (presorted subarrays) are long.

j_random_hacker