11

I have implemented the classic Hoare's partitioning algorithm for Quicksort. It works with any list of unique numbers [3, 5, 231, 43]. The only problem is when I have a list with duplicates [1, 57, 1, 34]. If I get duplicate values I enter an infinite loop.

private void quicksort(int[]a, int lo, int hi) {
    if (lo < hi) {
        int q = hoare_partition(a, lo, hi);
        quicksort(a, lo, q - 1);
        quicksort(a, q + 1, hi);
    }
}

private int hoare_partition(int[] a, int lo, int hi) {

    int pivot = a[hi];
    int i = lo;
    int j = hi;

    while (true) {

        while (a[i] < pivot) i++;
        while (a[j] > pivot) j--;

        if (i < j) swap(a, i, j);
        else return j;
    }
}

Is it possible that Hoare's implementation is incorrect and is unable to cope with duplicates?

I know this has been asked many times (and I tried them all) but I am having difficulties figuring the solution for this implementation.

valdi.k
  • 331
  • 1
  • 3
  • 7

6 Answers6

12
algorithm partition(A, lo, hi) is
    pivot := A[lo]
    i := lo – 1
    j := hi + 1
    loop forever
        do
            i := i + 1
        while A[i] < pivot

        do
            j := j – 1
        while A[j] > pivot

        if i >= j then
            return j

        swap A[i] with A[j]

The pseudocode above is taken from Wikipedia. Let's compare it with your code.

The problem is that you have to move the indices after the swap. The pseudocode uses do-while loop instead of while loop to move the indices after the swap. Also pay attention to the initial values of i and j.

algorithm quicksort(A, lo, hi) is
    if lo < hi then
        p := partition(A, lo, hi)
        quicksort(A, lo, p)
        quicksort(A, p + 1, hi)

For the recursive step, you might need to take care of the edge cases (i.e., the indices). It should work if you change quicksort(a, lo, q-1) to quicksort(a, lo, q).

A complete working version I just wrote:

import java.util.Arrays;

public class Test {

    public static void main(String[] args) throws Exception {
        int[] input = {1, 57, 1, 34};
        quicksort(input, 0, input.length - 1);
        System.out.println(Arrays.toString(input));
    }

    private static void quicksort(int[]a, int lo, int hi) {
        if (lo < hi) {
            int q = hoare_partition(a, lo, hi);
            quicksort(a, lo, q);
            quicksort(a, q + 1, hi);
        }
    }

    private static int hoare_partition(int[] a, int lo, int hi) {

        int pivot = a[lo];
        int i = lo - 1;
        int j = hi + 1;

        while (true) {
            do {
                i++;
            }
            while (a[i] < pivot);

            do {
                j--;
            }
            while (a[j] > pivot);

            if (i >= j) {
                return j;
            }
            swap(a, i, j);

        }
    }

    private static void swap(int[] a, int i, int j) {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

}

If you prefer the while loop instead of do-while:

private static int hoare_partition(int[] a, int lo, int hi) {

            int pivot = a[lo];
            int i = lo;
            int j = hi;

            while (true) {

                while (a[i] < pivot) i++;

                while (a[j] > pivot) j--;

                if (i >= j) {
                    return j;
                }
                swap(a, i, j);
                i++;
                j--;

            }
        }
Terry Li
  • 16,870
  • 30
  • 89
  • 134
  • 1
    Thank you very much for your answer it was extremely helpful. I was missing the part where you move the indices after a swap which `do...while` solves. Would you suggest having the normal `while` and do `i++; j--;` right after you use `swap(a,i,j);` instead of `do...while`? – valdi.k Nov 02 '16 at 11:18
  • @ValentinK. Absolutely. It's just a matter of style. I updated the answer in case you'd like to stick to the `while` loop :) – Terry Li Nov 02 '16 at 14:12
3

An address to @Terry Li's answer above as my reputation is not sufficient to comment. First of all, thank you very much for the detailed post. However, I believe there are some issues with the alternative function with while loop you provided. It doesn't return the index of the original pivot and it is inaccurate per se. (Please try with hoare_partition([6,7,8,9,1,2,3,4,5], 0, 8)) The problem is caused by incrementing i and decrementing j at the same time, causing you to lose track of the pivot. Hence in the proposed fix below, I have inserted a small condition to ensure whichever index storing the pivot is not changed. Please correct me if I'm wrong.

private static int hoare_partition(int[] a, int lo, int hi) {

            int pivot = a[lo];
            int i = lo;
            int j = hi;

            while (true) {

                while (a[i] < pivot) i++;

                while (a[j] > pivot) j--;

                if (i >= j) {
                    return i;
                }
                swap(a, i, j);
                if (a[j] == pivot)
                    i++;
                elif (a[i] == pivot)
                    j--;

            }
        }
chau-phan
  • 801
  • 1
  • 6
  • 12
  • I ran his code on your input and it gives the correct output. Can you please explain what was the problem here? Also, in the Hoare partition, the index that is returned is not always the pivot. – Sameer Dec 06 '21 at 16:57
2

Here is a C++ example that implements Hoare scheme plus a median of 3 check, a duplicate of pivot check to exclude middle values equal to pivot (note that values equal to pivot can end up anywhere, not just the middle, so this doesn't help much), only using recursion on the smaller part, looping back for the larger part (this prevents stack overflow). Worst case time complexity is still O(n^2), but it takes fairly specific data patterns to produce this (median of 3 would have to consistently keep picking near smallest or near largest values).

void QuickSort(uint32_t a[], size_t lo, size_t hi) {
    while(lo < hi){
        size_t i = lo, j = (lo+hi)/2, k = hi;
        uint32_t p;
        if (a[k] < a[i])            // median of 3
            std::swap(a[k], a[i]);
        if (a[j] < a[i])
            std::swap(a[j], a[i]);
        if (a[k] < a[j])
            std::swap(a[k], a[j]);
        p = a[j];
        i--;                        // Hoare partition
        k++;
        while (1) {
            while (a[++i] < p);
            while (a[--k] > p);
            if (i >= k)
                break;
            std::swap(a[i], a[k]);
        }
        i = k++;
        while(i > lo && a[i] == p)  // exclude middle values == pivot
            i--;
        while(k < hi && a[k] == p)
            k++;
        // recurse on smaller part, loop on larger part
        if((i - lo) <= (hi - k)){
            QuickSort(a, lo, i);
            lo = k;
        } else {
            QuickSort(a, k, hi);
            hi = i;
        }
    }
}
rcgldr
  • 27,407
  • 3
  • 36
  • 61
1

I know this post is old, but I believed I've figured out a solution. The problem arises when hoare_partition immediately finds that a[i] at i = lo needs to be swapped out of its position, but it never finds a suitable a[j] to swap with. We know this has occured if q == lo. This issue arises when the pivot is the smallest value in the array and is a duplicate. To address this, we check if q == lo in quicksort, and if so, we manually swap a[lo] with a[pivot_index] to make sure that a[lo] is moved out of its position and into a valid position. We only need to make one recursive call, since the left partition will be of size 1.

I also made slight modifications to the the conditions in the inner while loops. You need to check if i < j in the inner loops, in addition to the outer loop. The second inner while now checks if a[j] is >= pivot, rather than strictly >. The last recursive call uses q instead of q+1 as the new lo. I took pivot_index as a parameter. Lastly, I returned j instead of i.

private void quicksort(int[] a, int lo, int hi) {
  if (lo < hi) {
    // int pivot_index = (rand() % (hi - lo + 1)) + lo; // randomized pivot_index is better. should be in range [lo, hi] inclusive 
    int pivot_index = hi; 
    int q = hoare_partition(a, lo, hi, pivot_index);

    if (q == lo) {
      swap(a, lo, pivot_index);
      quicksort(a, lo + 1, hi);

    } else {
      quicksort(a, lo, q - 1);
      quicksort(a, q, hi);

     }
  }
}

private int hoare_partition(int[] a, int lo, int hi, int pivot_index) {
  int pivot = a[pivot_index];
  int i = lo;
  int j = hi;

  while (true) {
    
    while (i < j && a[i] < pivot) i++;
    while (i < j && a[j] >= pivot) j--;
    
    if (i < j) swap(a, i, j);
    else return j;
  }
}
jake
  • 63
  • 7
0

Here is one more example with while loop

private static int partition(int[] a, int start, int end){
   int i = start-1, j = end+1;

   while(true){
      while(a[++i] < a[start]);
      while(a[start] < a[--j]);

      if (i >= j) return j;
      swap(a, i, j);
   }
}
jbakirov
  • 936
  • 2
  • 10
  • 15
  • How does this answer `Is it possible that Hoare's implementation is incorrect and is unable to cope with duplicates?` – greybeard Dec 29 '21 at 05:47
0

Is it possible that Hoare's implementation is incorrect and is unable to cope with duplicates?

That's correct, Hoare's algorithm breaks when the pivot appears multiple times in the array.

One fix is to use a three-way partitioning algorithm. Instead of returning a single index, it returns two indices: Where the pivot section begins, and where it ends.

serg06
  • 2,027
  • 1
  • 18
  • 26
  • As worded, this answer is misleading instead of useful. Algorithms are valid solutions by definition. Implementations may be broken. As originally stated, *quicksort with Hoare partitioning* may use a huge number of nested calls. Take an input of, say 999 identical values to see what happens with such an input. – greybeard Dec 29 '21 at 05:46