I'm currently writing a quick sort algorithm in Java to sort random arrays of integers and then timing them using System.nanoTime(). The size of these arrays are powers of ten, starting from 10^3 and ending at 10^7. Furthermore, the random lists have different properties. I am sorting purely random lists, lists with some identical values (fewUnique), reversed sorted lists, sorted lists and nearly sorted lists.
The sort works. It performs a quick sort recursively on the array until it needs to sort 30 elements or less of the array, in which case it performs an insertion sort.
All was fine for 10^3 and 10^4 but once I got to 10^5 values it would only sort the random, few unique and random lists but would incur a Stack Overflow error when sorting nearly sorted and sorted lists.
I don't believe the issue lies in the way the lists are generated because the Stack Overflow occurs within the sorting algorithm (the line which is being referred to by the compiler is the first within the findPivot() method). Also, it will often sort between 1 and 6 lists before crashing. It must therefore be some way that the algorithm itself interacts with nearly sorted and sorted lists. Also, the generation for reversed lists involves invoking the code for creating a sorted list (and then reversing it).
Also, I find it unlikely that the issue is just to the algorithm having to, for some reason, invoke the partitioning off of sections of the array by recursion in a nearly sorted and sorted list significantly more times rather than the other list types, as it can sort random lists with 10^7 values, which would require far more partitioning than would a nearly sorted list with 10^5 values.
I realise it must have something to do with how these list types interact with the recursion of my quick sort, but if someone could shed light on it that would be great. I've put the code to both the quick sort algorithm in full and the random list generators below.
QUICK SORT ALGORITHM
/**
* Performs a quick sort given the indexes of the bounds of an integer array
* @param arr The array to be sorted
* @param highE The index of the upper element
* @param lowE The index of the lower element
*/
public static void quickSort(int[] arr, int highE, int lowE)
{
//Only perform an action if arr.length > 30, otherwise insertion sort [recursive base case])
if (lowE + 29 < highE)
{
//Get the element and then value of the pivot
int pivotE = findPivot(arr, highE, lowE);
int pivotVal = arr[pivotE], storeE = lowE;
//Swap the pivot and the last value.
swapElements(arr, pivotE, highE);
//For each element in the selection that is not the pivot, check to see if it is lower than the pivot and if so, move it to the leftmost untouched element.
for (int i = lowE; i < highE; i++)
{
if (arr[i] < pivotVal)
{
swapElements(arr, storeE, i);
//Increment storeE so that the element that is being switched moves to the right location
storeE++;
}
}
//Finally swap the pivot into its proper position and recrusively call quickSort on the lesser and greater portions of the array
swapElements(arr, storeE, highE);
//Lesser
quickSort(arr, storeE - 1, lowE);
//Greater
quickSort(arr, highE, storeE + 1);
}
else
{
insertSort(arr, highE, lowE);
}
}
/**
* Finds the pivot element
* @param arr The array to be sorted
* @param highE The index of the top element
* @param lowE The index of the bottom element
* @return The index of the pivot.
*/
public static int findPivot(int[] arr, int highE, int lowE)
{
//Finds the middle element
int mid = (int) Math.floor(lowE + (highE - lowE) / 2);
//Returns the value of the median of the first, middle and last elements in the array.
if ((arr[lowE] >= arr[mid]) && (arr[lowE] >= arr[highE]))
{
if (arr[mid] > arr[highE]) {return mid;}
else {return highE;}
}
else if ((arr[mid] >= arr[lowE]) && (arr[mid] >= arr[highE]))
{
if (arr[lowE] > arr[highE]) {return lowE;}
else {return highE;}
}
else
{
if (arr[lowE] > arr[mid]) {return lowE;}
}
return mid;
}
/**
*Performs an insertion sort on part of an array
* @param arr The array to be sorted.
* @param highE The index of the top element.
* @param lowE The index of the low element.
*/
public static void insertSort(int[] arr, int highE, int lowE)
{
//Sorts elements lowE to i in array, with i being gradually incremented.
for (int i = lowE + 1; i <= highE; i++)
{
for (int j = i; j > lowE; j--)
{
if (arr[j] < arr[j - 1])
{
swapElements(arr, j, j-1);
}
else {break;}
}
}
}
RANDOM LIST GENERATORS
/**
* Creates a random list
* @param arr The array to place the list inside of
*/
public static void randomList(int[] arr)
{
//Places a random number at each element of the array
for (int i = 0; i < arr.length; i++)
{
arr[i] = (int) Math.floor(Math.random() * RAND_MAX);
}
}
/**
* Creates a nearly sorted list of random numbers
* @param arr the array to place the list inside of
*/
public static void nearSortList(int[] arr)
{
//Creates a sorted list in arr
sortList(arr);
int swaps = (int) (Math.ceil(Math.pow((Math.log(arr.length)), 2.0)));
//The two values to be switched each time
int a, b;
//Performs a number of swaps equal to swaps [log(N) ^ 2] rounded up, with numbers switched no more than ln(N) places away
for (int i = 0; i < swaps; i++)
{
a = (int) Math.floor(Math.random() * arr.length);
b = (int) (a + Math.random() * 2 * Math.log(arr.length) - Math.log(arr.length));
//Accounts for cases in which b is either greater or smaller than the array bounds
if (b < 0)
{
b = -b;
}
else if (b >= arr.length)
{
b = -1 * (arr.length - b);
}
swapElements(arr, a, b);
}
}
/**
* Creates a random list with many unique values in
* @param arr the array to place the list inside of
*/
public static void fewUniqueList(int[] arr)
{
int[] smallArr = new int[(int) Math.floor(Math.pow(arr.length, 9.0 / 10.0))];
//Creates a smaller array of random values
randomList(smallArr);
//Fills the larger list up with values from the smaller list, ensuring aproximately N / N ^ (9/10) instances of each number in the array and ensuring, at most, there are N ^ (9/10) (rounded down) unique values in the large array
for (int i = 0; i < arr.length; i++)
{
arr[i] = smallArr[(int) Math.floor(Math.random() * smallArr.length)];
}
}
/**
* Creates a reversed list of random numbers
* @param arr the array to place the list inside of
*/
public static void reversedList(int[] arr)
{
//Creates a sorted list in arr
sortList(arr);
//Switches each ith elements with its mirror on the other end of the array until the value of i reaches the middle of the array
for (int i = 0; i < (int) (arr.length / 2.0); i++)
{
swapElements(arr, i, arr.length - 1 - i);
}
}
/**
* Creates a sorted list of random numbers using a merge sort
* @param arr the array to place the list inside of
*/
public static void sortList(int[] arr)
{
//Creates a random list in arr
randomList(arr);
Arrays.sort(arr);
}
EDIT: [Defunct]
EDIT 2:
I've replaced the basic recursive calls with the following code in order to only call the smallest of the two partitions at EJPs recommendation and it still is not fixing the issue.
if (storeE - 1 - lowE < highE - storeE + 1)
{
//Lesser
quickSort(arr, storeE - 1, lowE);
//Greater
quickSort(arr, highE, storeE + 1);
}
else
{
//Greater
quickSort(arr, highE, storeE + 1);
//Lesser
quickSort(arr, storeE - 1, lowE);
}
EDIT 3:
Ok, it is now evident that the recursion depth is far greater than I gave it credit for for nearly sorted and sorted lists. But now I need to work out why this is the case, and why random lists only had a depth of 28 for 10^7 values but nearly sorted and sorted lists have depths of over 3000