What is the complexity of binary insertion sort, and how many swaps and comparisons does it make?
I think it makes O(n log n) comparisons, but I am not sure. In the worst case it makes on the order of n^2 swaps. What about the best case?
You can write binary insertion sort easily by leveraging built-in functions such as bisect_left, list.pop(..) and list.insert(..):
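from bisect import bisect_left

def bininssort(L):
    n = len(L)
    for i in range(1, n):
        # Take out the next unsorted element; L[0..i-1] is already sorted.
        x = L.pop(i)
        # Binary search for the insertion point inside the sorted prefix.
        i1 = bisect_left(L, x, 0, i)
        # Put x back; list.insert shifts everything after i1 to the right.
        L.insert(i1, x)
    return L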
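A quick usage check may help here. The snippet below is only a sketch: it repeats the function so it runs on its own, and the sample input is an arbitrary choice of mine. It also shows that the function reorders the list it is given and returns that same list object.

```python
from bisect import bisect_left

def bininssort(L):
    # Same algorithm as in the answer, repeated so this snippet runs on its own.
    for i in range(1, len(L)):
        x = L.pop(i)
        L.insert(bisect_left(L, x, 0, i), x)
    return L

data = [5, 2, 4, 6, 1, 3]   # arbitrary sample input
result = bininssort(data)
print(result)               # [1, 2, 3, 4, 5, 6]
print(result is data)       # True: the input list itself was reordered
```

Pass a copy (for example `bininssort(list(data))`) if the original order must be kept.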
About the worst case: at the i-th iteration of the loop (with 1<=i<n) we perform a binary search inside the sorted sub-array L[0..i-1], which should take about log(i) operations. After that we know the location i1 where the element has to go, and we insert it there. The insertion means we have to push all the elements that follow it one position to the right, and that's roughly n-i operations at least (it can be more than n-i operations depending on the insertion location). If we sum up just these two we get
\sum_{i=1}^{n-1} (log(i) + (n-i)) = log((n-1)!) + (n*(n-1))/2 ~ n*log(n) + (n*(n-1))/2
(in the above, Stirling's approximation of log(n!) is being used).
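To get a feel for which of the two contributions matters, here is a rough numerical sketch. It sums log2(i) and (n-i) over the loop indices i = 1..n-1; the function name `worst_case_estimate` and the chosen sizes are my own, and the numbers are estimates from the cost model above, not measurements.

```python
import math

def worst_case_estimate(n):
    """Return (search_cost, shift_cost) summed over i = 1..n-1."""
    search = sum(math.log2(i) for i in range(1, n))  # equals log2((n-1)!)
    shift = sum(n - i for i in range(1, n))          # equals n*(n-1)/2
    return search, shift

for n in (10, 100, 1000, 10000):
    search, shift = worst_case_estimate(n)
    # Last column: fraction of the total estimate due to the shifting term.
    print(n, round(search), shift, round(shift / (search + shift), 3))
```

The shifting term takes up a growing share of the total as n increases, which is why it ends up determining the order of growth.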
Now the wiki page says:
"As a rule-of-thumb, one can assume that the highest-order term in any given function dominates its rate of growth and thus defines its run-time order."
So the quadratic term dominates, and I think the conclusion is that in the worst case binary insertion sort has O(n^2) complexity, even though it only makes O(n*log(n)) comparisons.
Then I tried to check how it performs on reversed (n, n-1, n-2, ..., 1) and alternating (0, n-1, 1, n-2, 2, n-3, ...) lists, and I fitted the results (using the matchgrowth module) to different growth rates. This part is just an approximation. The reversed order was fitted to polynomial time, and the alternating order was fitted to quasilinear time.
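For anyone who wants to try something similar, the two input shapes can be generated and timed along these lines. This is only a sketch using the standard timeit module, not the matchgrowth-based fitting used for the results above; the helper names `reversed_list`, `alternating_list` and `time_sort`, and the sizes, are my own assumptions.

```python
import timeit
from bisect import bisect_left

def bininssort(L):
    # Same algorithm as in the answer, repeated so this snippet runs on its own.
    for i in range(1, len(L)):
        x = L.pop(i)
        L.insert(bisect_left(L, x, 0, i), x)
    return L

def reversed_list(n):
    # n, n-1, ..., 1
    return list(range(n, 0, -1))

def alternating_list(n):
    # 0, n-1, 1, n-2, 2, n-3, ...
    out = []
    lo, hi = 0, n - 1
    while lo <= hi:
        out.append(lo)
        if lo != hi:
            out.append(hi)
        lo += 1
        hi -= 1
    return out

def time_sort(make_input, n, repeat=3):
    # Best of `repeat` runs; each run also builds the input, which is cheap
    # compared to the sort itself.
    runs = timeit.repeat(lambda: bininssort(make_input(n)), number=1, repeat=repeat)
    return min(runs)

for n in (1000, 2000, 4000):
    print(n, time_sort(reversed_list, n), time_sort(alternating_list, n))
```

Timings from a handful of sizes like this are noisy, so they should be read as a rough indication of the trend rather than as a fit.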
The best case is explained here. If the list is already sorted, then even though we don't do any swaps, all the binary searches are still performed, which leads to O(n*log(n)).
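The best case can also be checked by counting operations directly. The sketch below wraps each value in a hypothetical `Counted` class (my own helper, not part of any library) that counts `<` calls made by bisect_left, and it counts shifts in the classic array sense, i.e. how many elements of the sorted prefix sit to the right of the insertion point (i - i1). That is not the same as the memory moves list.insert performs internally.

```python
from bisect import bisect_left

class Counted:
    """Hypothetical wrapper that counts every `<` comparison."""
    count = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.count += 1
        return self.value < other.value

def count_ops(values):
    """Return (comparisons, prefix_shifts) for binary insertion sort."""
    Counted.count = 0
    L = [Counted(v) for v in values]
    shifts = 0
    for i in range(1, len(L)):
        x = L.pop(i)
        i1 = bisect_left(L, x, 0, i)  # calls Counted.__lt__ internally
        shifts += i - i1              # prefix elements that move right
        L.insert(i1, x)
    return Counted.count, shifts

n = 256
print("sorted:  ", count_ops(range(n)))             # shifts stay at 0
print("reversed:", count_ops(range(n - 1, -1, -1)))  # shifts reach n*(n-1)/2
```

On sorted input the shift count stays at zero while the comparison count still grows like n*log(n); on reversed input the comparisons stay in the same range but the shifts become quadratic.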
The code used here is available in this repository.