In the scenario below, is the Java async-profiler the right tool for seeing where time is spent when comparing the performance of ArrayBlockingQueue and LinkedBlockingQueue?

On my machine, ABQ always finishes 25% faster than LBQ in total execution time when passing 50M entries between a producer and a consumer. The flame graphs of both are pretty much the same, except that the LBQ one shows a handful of samples from JVM object allocation code, but this wouldn't justify a 25% increase. As expected, TLAB allocation with LBQ is much higher.

I was wondering how I can see which activity (be it code or hardware) is taking the time.

Runner:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class Runner {

    public static void main(String[] args) throws InterruptedException {


        int size = 50_000_000;
        BlockingQueue<Long> queue = new LinkedBlockingQueue<>(size);

        Producer producer = new Producer(queue, size);
        Thread t = new Thread(producer);
        t.setName("ProducerItIs");

        Consumer consumer = new Consumer(queue, size);
        Thread t2 = new Thread(consumer);
        t2.setName("ConsumerItIs");


        t.start();
        t2.start();

        Thread.sleep(8000);
        System.out.println("done");
        queue.forEach(System.out::println);
        System.out.println(queue.size());
    }
}

Producer:

import java.util.Random;
import java.util.concurrent.BlockingQueue;

public class Producer implements Runnable {

    public Producer(BlockingQueue<Long> blockingQueue, int size) {
        this.queue = blockingQueue;
        this.size = size;

    }

    private final BlockingQueue<Long> queue;
    private final int size;

    public void run() {

        System.out.println("Started to produce...");
        long nanos = System.nanoTime();
        Long ii = (long) new Random().nextInt();

        for (int j = 0; j < size; j++) {
                queue.add(ii);
        }
        System.out.println("producer Time taken :" + ((System.nanoTime() - nanos) / 1e6));
    }
}

Consumer:

import java.util.concurrent.BlockingQueue;

public class Consumer implements Runnable {

    private final BlockingQueue<Long> blockingQueue;
    private final int size;


    private Long value;

    public Consumer(BlockingQueue<Long> blockingQueue, int size) {
        this.blockingQueue = blockingQueue;
        this.size = size;
    }

    public void run() {
        long nanos = System.nanoTime();

        System.out.println("Starting to consume...");
        // Note: i starts at 1, so the loop below takes size - 1 elements and leaves one in the queue.
        int i = 1;
        try {
            while (true) {
                value = blockingQueue.take();
                i++;

                if (i >= size) {
                    break;
                }

            }
            System.out.println("Consumer Time taken :" + ((System.nanoTime() - nanos)/1e6));
        } catch (Exception exp) {
            System.out.println(exp);
        }
    }
    public long getValue() {
        return value;
    }
}

With ArrayBlockingQueue: [flame graph image]

With LinkedBlockingQueue (black arrow shows samples captured for allocations): [flame graph image]

Abidi
  • Do you have an example? It will be easier to answer if you post the code for a particular problem. Otherwise the question is too abstract. – apangin Feb 23 '22 at 13:44
  • @apangin Please have a look now. – Abidi Feb 23 '22 at 17:29
  • My results differ from yours. Can you post the complete compilable code, including WaitUtil.busySleep, and also tell what version of JDK you are running on? – apangin Feb 25 '22 at 22:28
  • Also, take a look at the number of samples on each graph. Even if the shape looks similar, the graphs may differ in absolute numbers. Besides that, since your benchmark is not only CPU-bound, but also involves synchronization/waiting - it makes sense to profile in wall clock mode (add `wall` option in profiler arguments). – apangin Feb 25 '22 at 22:31
  • @apangin, I've updated the question accordingly. I'm running jdk1.8.0_211.jdk on a MacBook. I understand that the LBQ version has more samples (2K more than ABQ), but wouldn't the flame graph show what was happening differently that caused it? – Abidi Feb 28 '22 at 20:46
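
As a rough follow-up to the CPU-bound versus waiting point raised in the comments, one possible cross-check (not something posted in this thread) is to compare each thread's CPU time with the elapsed wall-clock time for both queue types. The sketch below is only an illustration under assumptions: the class name `CpuVsWall` and the helper `measure` are hypothetical, it uses `put`/`take` instead of the `add`/`take` pair from the question, and it runs far fewer entries than 50M. If wall-clock time grows much more than CPU time, the extra time is probably spent waiting rather than executing code, which a CPU-mode flame graph would not show.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class CpuVsWall {

    // Hypothetical helper: runs one producer and one consumer over the queue and
    // returns {wallNanos, producerCpuNanos, consumerCpuNanos}.
    // CPU values are -1 if the JVM does not support per-thread CPU time.
    static long[] measure(BlockingQueue<Long> queue, int count) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = mx.isCurrentThreadCpuTimeSupported();
        long[] cpu = {-1L, -1L};
        Long item = 42L; // one shared boxed value, as in the question's Producer

        Thread producer = new Thread(() -> {
            long start = cpuSupported ? mx.getCurrentThreadCpuTime() : 0L;
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            if (cpuSupported) {
                cpu[0] = mx.getCurrentThreadCpuTime() - start;
            }
        });

        Thread consumer = new Thread(() -> {
            long start = cpuSupported ? mx.getCurrentThreadCpuTime() : 0L;
            try {
                for (int i = 0; i < count; i++) {
                    queue.take();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            if (cpuSupported) {
                cpu[1] = mx.getCurrentThreadCpuTime() - start;
            }
        });

        long wallStart = System.nanoTime();
        producer.start();
        consumer.start();
        producer.join(); // join() makes the writes to cpu[] visible here
        consumer.join();
        long wall = System.nanoTime() - wallStart;
        return new long[] {wall, cpu[0], cpu[1]};
    }

    static void report(String label, long[] r) {
        System.out.printf("%s: wall=%.1f ms, producer cpu=%.1f ms, consumer cpu=%.1f ms%n",
                label, r[0] / 1e6, r[1] / 1e6, r[2] / 1e6);
    }

    public static void main(String[] args) throws InterruptedException {
        int count = 1_000_000; // assumption: much smaller than the 50M in the question
        report("ABQ", measure(new ArrayBlockingQueue<>(count), count));
        report("LBQ", measure(new LinkedBlockingQueue<>(count), count));
    }
}
```

A single run like this includes JIT warm-up and GC noise, so the numbers are only a hint about where to look next, not a benchmark result.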

0 Answers