24

When implementing Insertion Sort, a binary search could be used to locate the position within the first i - 1 elements of the array into which element i should be inserted.

How would this affect the number of comparisons required? How would using such a binary search affect the asymptotic running time for Insertion Sort?

I'm pretty sure this would decrease the number of comparisons, but I'm not exactly sure why.

John Kugelman
  • 349,597
  • 67
  • 533
  • 578
Derrek Whistle
  • 701
  • 2
  • 7
  • 16

5 Answers5

32

Straight from Wikipedia:

If the cost of comparisons exceeds the cost of swaps, as is the case for example with string keys stored by reference or with human interaction (such as choosing one of a pair displayed side-by-side), then using binary insertion sort may yield better performance. Binary insertion sort employs a binary search to determine the correct location to insert new elements, and therefore performs ⌈log2(n)⌉ comparisons in the worst case, which is O(n log n). The algorithm as a whole still has a running time of O(n2) on average because of the series of swaps required for each insertion.

Source:

http://en.wikipedia.org/wiki/Insertion_sort#Variants

Here is an example:

http://jeffreystedfast.blogspot.com/2007/02/binary-insertion-sort.html

I'm pretty sure this would decrease the number of comparisons, but I'm not exactly sure why.

Well, if you know insertion sort and binary search already, then its pretty straight forward. When you insert a piece in insertion sort, you must compare to all previous pieces. Say you want to move this [2] to the correct place, you would have to compare to 7 pieces before you find the right place.

[1][3][3][3][4][4][5] ->[2]<- [11][0][50][47]

However, if you start the comparison at the half way point (like a binary search), then you'll only compare to 4 pieces! You can do this because you know the left pieces are already in order (you can only do binary search if pieces are in order!).

Now imagine if you had thousands of pieces (or even millions), this would save you a lot of time. I hope this helps. |=^)

  • 9
    It still doesn't explain why it's actually O(n^2), and Wikipedia doesn't cite a source for that sentence. – seldon May 21 '18 at 11:48
  • 1
    @mattecapu Insertion Sort is a heavily study algorithm and has a known worse case of O(n^2). Using Binary Search to support Insertion Sort improves it's clock times, but it still takes same number comparisons/swaps in worse case. Intuitively, think of using Binary Search as a micro-optimization with Insertion Sort. – But I'm Not A Wrapper Class May 21 '18 at 12:25
  • Right, I didn't realize you really need a lot of swaps to move the element. So the sentences seemed all vague. Sorry for the rudeness. – seldon May 21 '18 at 14:39
  • 2
    What if insertion sort is applied on linked lists then worse case time complexity would be (nlogn) and O(n) best case, this would be fairly efficient. – Deepak Yadav Jun 11 '18 at 21:12
  • but as wiki said we cannot random access to perform binary search on linked list – Gintama Aug 08 '18 at 22:09
  • I think the goal is to use an ArrayList or a dynamic array. This way you can apply both techniques (insertion and binary search) together. – But I'm Not A Wrapper Class Aug 09 '18 at 00:42
  • The reason for n^2 runtime comes because you need to insert it at an arbitrary location. You may use a tree data structure to make the insertion time to log n. Balanced Binary search tree actually optimizes this insertion sort and makes the runtime complexity to n log n. https://en.wikipedia.org/wiki/Binary_search_tree – sayem siam Mar 25 '21 at 15:15
  • @sayemsiam Correct. I believe the context of this answer is that you want to sort the array or list without creating additional memory or changing the data structure. – But I'm Not A Wrapper Class Mar 25 '21 at 22:27
  • "When you insert a piece in insertion sort, you must compare to all previous pieces." I think this is wrong. If you want to insert "11" in the array, you just have to compare it with "5". But in the worst case, you have to compare with all previous pieces. @ButI'mNotAWrapperClass – Burakhan Aksoy Jul 03 '21 at 10:30
10

If you have a good data structure for efficient binary searching, it is unlikely to have O(log n) insertion time. Conversely, a good data structure for fast insert at an arbitrary position is unlikely to support binary search.

To achieve the O(n log n) performance of the best comparison searches with insertion sort would require both O(log n) binary search and O(log n) arbitrary insert.

Patricia Shanahan
  • 25,849
  • 4
  • 38
  • 75
  • 6
    if you use a balanced binary tree as data structure, both operations are O(log n). – Karoly Horvath Aug 02 '13 at 21:29
  • 11
    But then, you've just implemented heap sort. – Oscar Smith Aug 01 '16 at 23:12
  • 1
    @OscarSmith but Heaps don't provide O(log n) binary search. At least neither Binary nor Binomial Heaps do that. The heaps only hold the invariant, that the parent is greater than the children, but you don't know to which subtree to go in order to find the element `X` in the heap, when it's smaller that the parent/root. I would say, that HeapSort is an improvement of SelectionSort rather than the InsertionSort, since in Selection sort we are looking for the next biggest/smallest number in the subarray from [i:n-1], but heaps allow us to find the next max/min and remove it in O(log n) time – denis631 Jul 28 '17 at 08:47
  • @OscarSmith, If you use a tree as a data structure, you would have implemented a binary search tree not a heap sort. – sayem siam Mar 25 '21 at 15:10
4

Binary Insertion Sort - Take this array => {4, 5 , 3 , 2, 1}

Now inside the main loop , imagine we are at the 3rd element. Now using Binary Search we will know where to insert 3 i.e. before 4.

Binary Search uses O(Logn) comparison which is an improvement but we still need to insert 3 in the right place. For that we need to swap 3 with 5 and then with 4.

Due to insertion taking the same amount of time as it would without binary search the worst case Complexity Still remains O(n^2). I hope this helps.

Armaan
  • 101
  • 10
2

Assuming the array is sorted (for binary search to perform), it will not reduce any comparisons since inner loop ends immediately after 1 compare (as previous element is smaller). In general the number of compares in insertion sort is at max the number of inversions plus the array size - 1.

Since number of inversions in sorted array is 0, maximum number of compares in already sorted array is N - 1.

WastedPandoo
  • 153
  • 1
  • 10
  • 2
    insertion sort keeps the processed elements sorted. that doesn't mean that in the beginning the *whole* array is already sorted. if it were so, you wouldn't need sorting :/ – Karoly Horvath Aug 02 '13 at 23:27
  • 2
    That's a funny answer, sort a sorted array. – dud3 Jan 12 '17 at 19:50
  • the worst case is if you are already sorted for many sorting algorithms and it isn't funny at all, sometimes you are asked to sort user input which happens to already be sorted. +1 – Coder Dec 25 '20 at 03:20
0

For comparisons we have log n time, and swaps will be order of n. For n elements in worst case : n*(log n + n) is order of n^2.