0

I'm implementing a sweep-line algorithm in Java 8, in which I traverse a collection of objects sorted by a numeric key (e.g., the "xlo" coordinate of a rectangle). Currently, I'm implementing this using a PriorityQueue<item>( key-extractor ). I realize, though, that I'm not really using the full functionality of the PriorityQueue (ability to intersperse add's, delete's and peek's/poll's), but only adding the items at the start of execution, then proceeding to peek/poll them later (no further insertions or deletions). As such, I'm thinking of replacing the PriorityQueue with a simple Queue<item>, fill it up, sort it, and then peek/poll the items from the front.

I'm going to try modifying my code along these lines, but before I do, I'm wondering how much I might expect to save by the substitution. Formally, it seems like the complexity of the two approaches are equivalent, but I'm suspecting there's a significant different in the constant. Suggestions?

Mark Lavin
  • 1,002
  • 1
  • 15
  • 30

1 Answers1

1

Thanks, @pjs, for the suggestion. I used a simple class Number to force the sorting to use an accessor method as my application does. Here are the results for 1000 trials of adding/sorting/retrieving various numbers of Numbers using a PriorityQueue<Number>, an ArrayList<Number> (with explicit sort) and an array of Numbers (with explicit sort); numbers are in seconds:

For 1000 trials    PriorityQueue   ArrayList    Array
n=100              0.021           0.024        0.033
n=1000             0.220           0.203        0.205
n=10000            3.157           2.772        2.751

So, the answer is "it depends". For a small number of Numbers in the collection, PriorityQueue is the winner. For larger numbers, array is the winner.

Here is the code I used; note that I repeated all the trials twice, to allow the JIT to work its magic, and reported times from the second repetition.

public class Benchmark {

    static private final int n = 10000;  // Number of Numbers
    static private final int r = 1000; // Number of repetitions

    static private Random rand = new Random();

    static private class Number {
        private int i;
        public Number( int i ) { this.i = i; }
        public int getNumber() { return i; }
    }

    static private PriorityQueue< Number > 
    sortedNumbersPQ = new PriorityQueue< Number >( Comparator.comparing( Number::getNumber ) );
    static private List< Number > sortedNumbersAL = new ArrayList<>();
    static private Number[] sortedNumbersAR = new Number[ n ];

    static private int usePriorityQueue( int[] numbers ) {
        int sum = 0;
        for ( int i = 0; i < numbers.length; i++ )
            sortedNumbersPQ.add( new Number( numbers[ i ] ) );
        while ( ! sortedNumbersPQ.isEmpty() )
            sum += sortedNumbersPQ.poll().getNumber();
        return sum;
    }
    static private int useArrayList( int[] numbers ) {
        int sum = 0;
        sortedNumbersAL.clear();
        for ( int i = 0; i < numbers.length; i++ )
            sortedNumbersAL.add( new Number( numbers[ i ] ) );
        Collections.sort( sortedNumbersAL, Comparator.comparing( Number::getNumber ) );
        for ( int i = 0; i < numbers.length; i++ )
            sum += sortedNumbersAL.get( i ).getNumber();
        return sum;
    }
    static private int useArray( int[] numbers ) {
        int sum = 0;
        for ( int i = 0; i < numbers.length; i++ )
            sortedNumbersAR[ i ] = new Number( numbers[ i ] );
        Arrays.sort( sortedNumbersAR, 0, numbers.length, Comparator.comparing( Number::getNumber ) );
        for ( int i = 0; i < numbers.length; i++ )
            sum += sortedNumbersAL.get( i ).getNumber();
        return sum;
    }
    static public void main( String args[] ) {
        int[] numbers = new int[ n ];
        for ( int i = 0; i < n; i++ )
            numbers[ i ] = rand.nextInt( 1000000 );
        long start = System.currentTimeMillis();
        for ( int i = 0; i < r; i++ )
            usePriorityQueue( numbers );
        System.err.println( "Using PriorityQueue for " + r + " repeats of " + n + " items: " + 
                0.001 * ( System.currentTimeMillis() - start ) );
        start = System.currentTimeMillis();
        for ( int i = 0; i < r; i++ )
            useArrayList( numbers );
        System.err.println( "Using ArrayList for " + r + " repeats of " + n + " items: " + 
                0.001 * ( System.currentTimeMillis() - start ) );
        start = System.currentTimeMillis();
        for ( int i = 0; i < r; i++ )
            useArray( numbers );
        System.err.println( "Using Array for " + r + " repeats of " + n + " items: " + 
                0.001 * ( System.currentTimeMillis() - start ) );

        start = System.currentTimeMillis();
        for ( int i = 0; i < r; i++ )
            usePriorityQueue( numbers );
        System.err.println( "Using PriorityQueue for " + r + " repeats of " + n + " items: " + 
                0.001 * ( System.currentTimeMillis() - start ) );
        start = System.currentTimeMillis();
        for ( int i = 0; i < r; i++ )
            useArrayList( numbers );
        System.err.println( "Using ArrayList for " + r + " repeats of " + n + " items: " + 
                0.001 * ( System.currentTimeMillis() - start ) );
        start = System.currentTimeMillis();
        for ( int i = 0; i < r; i++ )
            useArray( numbers );
        System.err.println( "Using Array for " + r + " repeats of " + n + " items: " + 
                0.001 * ( System.currentTimeMillis() - start ) );
    }
}
Mark Lavin
  • 1,002
  • 1
  • 15
  • 30
  • 1
    That’s far away from a reasonable benchmark. See [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/q/504103/2711488). Further note that you are measuring the costs of `clear()` for the `ArrayList`, which reasonable code wouldn’t use. You’d do either, overwrite using `set` or just create a new `ArrayList` instead. Besides that, the tendency is expected. Both, `ArrayList` and `Arrays.sort` use the same code (TimSort), which runs somewhere between linear time (best case) and the worst case edge of O(n log n), whereas the priority queue will always end up at O(n log n) – Holger Oct 28 '19 at 13:13
  • 1
    In general, you would expect both `Array` and `ArrayList` to outperform the `PriorityQueue`. I suspect that the `PriorityQueue` "win" at 100 items is an artifact of your testing method. – Jim Mischel Oct 29 '19 at 18:25