LSD radix sort for negative integers without queue

Question

First of all, i know there is a similar question over here: Radix Sort for Negative Integers

however it is not duplicate to this one.

I am studying radix sorts and have a question regarding the implementation of LSD radix sort by Prof. Sedgewick and Prof. Wayne.

public static void sort(int[] a) {
    final int BITS = 32;                 // each int is 32 bits 
    final int R = 1 << BITS_PER_BYTE;    // each bytes is between 0 and 255
    final int MASK = R - 1;              // 0xFF
    final int w = BITS / BITS_PER_BYTE;  // each int is 4 bytes

    int n = a.length;
    int[] aux = new int[n];

    for (int d = 0; d < w; d++) {         

        // compute frequency counts
        int[] count = new int[R+1];
        for (int i = 0; i < n; i++) {           
            int c = (a[i] >> BITS_PER_BYTE*d) & MASK;
            count[c + 1]++;
        }

        // compute cumulates
        for (int r = 0; r < R; r++)
            count[r+1] += count[r];

        // for most significant byte, 0x80-0xFF comes before 0x00-0x7F
        if (d == w-1) {
            int shift1 = count[R] - count[R/2];
            int shift2 = count[R/2];
            for (int r = 0; r < R/2; r++)
                count[r] += shift1;
            for (int r = R/2; r < R; r++)
                count[r] -= shift2;
        }

        // move data
        for (int i = 0; i < n; i++) {
            int c = (a[i] >> BITS_PER_BYTE*d) & MASK;
            aux[count[c]++] = a[i];
        }

        // copy back
        for (int i = 0; i < n; i++)
            a[i] = aux[i];
}

What is going on with the most significant byte? It is far more elegant than anything i came up with.

I am not confident in my ability to explain that block of code, it is obvious that it deals with negative numbers but i am not exactly sure how.

Could somebody explain that block of code in greater detail ?

UPDATE

I think i got additionally confused naming of variables shift1 and shift2. If we rename things a bit, and add a comment or two:

 if (d == w-1) {
            int totalNegatives= count[R] - count[R/2];
            int totalPositives= count[R/2];
            for (int r = 0; r < R/2; r++)
                // all positive number must come after any negative number
                count[r] += totalNegatives;
            for (int r = R/2; r < R; r++)
                // all negative numbers must come before any positive number
                count[r] -= totalPositives;
        }

this becomes easier to follow.

The idea is that first positive number can only be in position after last negative number, and all positive numbers must be after negative ones in sorted order. Therefore we simply need to add count of total negative numbers to all positives in order to ensure that positive numbers will indeed come after negatives. Same analogy for negatives numbers.

Comment - suggested improvements, if count was a matrix: `count[w][R+1]`, then only a single read pass would be needed to generate all the counts, and then the indexes (the cumulative sums). Also a[] and aux[] are references, so a[] and aux[] could be swapped (using something like int[] tmp) instead of doing a copy back. As long as the number of radix sort passes is even, then the sorted result will end up back in a[]. These suggestions would also work for C / C++ (assuming pointers used to reference arrays). — rcgldr, Aug 07 '18 at 23:22
If java doesn't optimize matrix usage well, the sort could be hard coded for a specific instance of `w`, using `w` instances of counts, for `w` == 4, then: count0[], count1[], count2[], count3[]. — rcgldr, Aug 07 '18 at 23:25

Vincent van der Weele · Accepted Answer · 2018-08-07T11:07:34.320

Basic algorithm

Let's start by ignoring the block for the most significant bit and try to understand the rest of the code.

The algorithms handles the integers byte by byte. Every byte can have 256 different values, which are counted separately. This is what happens in the first block. After

int[] count = new int[R+1];
for (int i = 0; i < n; i++) {           
    int c = (a[i] >> BITS_PER_BYTE*d) & MASK;
    count[c + 1]++;
}

every count[i] is the number of elements in a that have value i-1 in their dth byte (note that they use count[c + 1]++, so count[0] == 0)

The algorithm then continues to compute the cumulative counts with

for (int r = 0; r < R; r++)
    count[r+1] += count[r];

After this, every count[i] is the index where the first element of that bucket should end up in the (intermediate) output. (Note that count has length 257 (R + 1), so the last element of the cumulative array can be ignored. I'll put it in brackets in the examples below.) Let's look at an example with 4 values (instead of 256, to keep it concise):

Consider an array with byte values [0, 3, 3, 2, 1, 2]. This gives counts [0, 1, 1, 2, 2] and cumulative counts [0, 1, 2, 4, (6)]. These are exactly the indices of the first 0, 1, 2, and 3 in the sorted array (which would be [0, 1, 2, 2, 3, 3]).

Now the algorithm uses those cumulative counts as indices in the (intermediate) output. It increments the bucket index whenever it copies an element from that bucket, so elements from the same bucket are copied to consecutive spots.

for (int i = 0; i < n; i++) {
    int c = (a[i] >> BITS_PER_BYTE*d) & MASK;
    aux[count[c]++] = a[i];
}

for (int i = 0; i < n; i++)
    a[i] = aux[i];

Handling the sign bit

The most significant bit is a bit special because in two's complement it is the sign, which is 1 for negative numbers and 0 for positive numbers. So the cumulative array count is incorrect for the final step. The counts for values whose most significant bit are 0 (the positive numbers) are in the first half of the array and the counts for the values whose most significant bit are 1 (the negative numbers) are in the second half of the array. Therefore, the first half and the second half of the array must be "flipped".

This is achieved by adding the total number of elements in the second half of the counts array to each element in the first half of the counts array. And by subtracting the total number of elements in the first half of the counts array from each element in the second half of the counts array. As noted earlier, the counts array has length 257, so the first 128 elements (257 / 2) are the first half and the remaining 129 elements are the second half.

Let's look at a new example, now with two-bits signed values, i.e., -2, -1, 0, and 1. The binary representation for these is 10, 11, 00, 01, so mapped to unsigned numbers that is 2, 3, 0, 1, respectively.

Consider and array a as [0, -1, -1, -2, 1, -2]. Convert to unsigned: [0, 3, 3, 2, 1, 2]. Apply the algorithm to get the cumulative counts: [0, 1, 2, 4, (6)]. If we would not do the flipping, we would end up with the sorted unsigned array [0, 1, 2, 2, 3, 3], which is equivalent to the signed array [0, 1, -2, -2, -1, -1]. That's not properly sorted.

Now, let's apply the extra step for the signed bytes. We split the cumulative counts array in two halves: [0, 1] and [2, 4, (6)]. There are 2 (2 - 0) elements in the first half and 4 (6 - 2) elements in the second half. So we add 4 to each element in the first half: [4, 5] and subtract 2 from each element in the second half: [0, 2, (4)]. Combining the halves gives [4, 5, 0, 2, (4)].

If we now use these counts as indices in the final unsigned array, we get [2, 2, 3, 3, 0, 1] (the first 0 is at index 4, the first 1 at index 5, and so on). Converting this back to signed values gives [-2, -2, -1, -1, 0, 1], which is indeed correct.

Possible confusion: one of the confusing parts in this algorithm is that the counts array is used for two different purposes. First it's used to count separate occurrences and later it's used to count cumulative occurrences. When counting separately, the first element of the array is not used. When counting cumulatively, the last element of the array is not used.

I think the algorithm would have been simpler if two separate arrays were used instead.

Thank you very much for detailed answer. Could you please expand on how flipping solves the problem , what is the rationale behind this? — John, Aug 07 '18 at 11:04
@John I tried to make the example a bit more explicit. Hope that helps. If you haven't yet, check up on two's complement because that's the reason that the flipping is necessary and also the reason why this algorithm works. — Vincent van der Weele, Aug 07 '18 at 11:08
Thank you very much for this. Yeah i agree with you for the confusion part and also noticed this last part/first part. — John, Aug 07 '18 at 11:36

LSD radix sort for negative integers without queue

1 Answers1