
First I'll summarize what I've found so far.

  • This answer suggests that changing the concurrencyLevel parameter of ConcurrentHashMap's constructor might help. I've tried that and my code still hung.
  • Answers here suggest that it could be a runtime bug.

What I'm trying to do:

  • I have 10 worker threads running alongside a main thread. The worker threads have to process many arrays to find the index of the max element in each array (if there are multiple max values, the first occurrence is used). Some of these arrays can be duplicates, so I'm trying to avoid repeated full array scans to speed up the program.
  • The controller class contains a ConcurrentHashMap that maps the hash values of arrays to the corresponding max-element indices.
  • The worker threads will ask the controller class for the mapped index first before trying to calculate the index by doing full array scans. In the latter case, the newly calculated index will be put into the map.
  • The main thread does not access the hash map.

What happened:

  • My code will hang after 70,000 ~ 130,000 calls to getMaxIndex(). This count is obtained by putting a log string into getMaxIndex() so it might not be exactly accurate.
  • My CPU usage gradually goes up for ~6 seconds, peaks at ~100%, and then drops to ~10%. I have plenty of unused memory left. (Does this look like deadlock?)
  • If the code does not use map it works just fine (see getMaxIndex() version 2 below).
  • I've tried adding synchronized to getMaxIndex()'s signature and using a regular HashMap instead; that also did not work.
  • I've tried to use different initialCapacity values too (e.g. 50,000 & 100,000). Did not work.

Here's my code:

// in the controller class
int getMaxIndex(@NotNull double[] arr) {
    int hash = Arrays.hashCode(arr);

    if(maxIndices.containsKey(hash)) {
        return maxIndices.get(hash);
    } else {
        int maxIndex =
            IntStream.range(0, arr.length)
                .reduce((a, b) -> arr[a] < arr[b] ? b : a)
                .orElse(-1); // -1 to let program crash

        maxIndices.put(hash, maxIndex);
        return maxIndex;
    }
}

The worker thread calls getMaxIndex() like this: return remaining[controller.getMaxIndex(arr)]; where remaining is just another int array.

getMaxIndex() v2:

int getMaxIndex(@NotNull double[] arr) {
    return IntStream.range(0, arr.length)
        .reduce((a, b) -> arr[a] < arr[b] ? b : a)
        .orElse(-1); // -1 to let program crash
}

JVM info in case it matters:

java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)

EDIT: stack dump; I used Phaser to synchronize the worker threads, so some of them appear to be waiting on the phaser, but pool-1-thread-2, pool-1-thread-10, pool-1-thread-11, and pool-1-thread-12 do not appear to be waiting on the phaser.

Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.151-b12 mixed mode):

"Attach Listener" #23 daemon prio=9 os_prio=0 tid=0x00007f0c54001000 nid=0x4da2 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"pool-1-thread-13" #22 prio=5 os_prio=0 tid=0x00007f0c8c2cb800 nid=0x4d5e waiting on condition [0x00007f0c4eddd000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076e792f40> (a java.util.concurrent.Phaser$QNode)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
    at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
    at Ant.call(Ant.java:77)
    at Ant.call(Ant.java:10)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-12" #21 prio=5 os_prio=0 tid=0x00007f0c8c2ca000 nid=0x4d5d waiting on condition [0x00007f0c4eede000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000775518738> (a java.util.concurrent.SynchronousQueue$TransferStack)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
    at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
    at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-11" #20 prio=5 os_prio=0 tid=0x00007f0c8c2c8000 nid=0x4d5c waiting on condition [0x00007f0c4efdf000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000775518738> (a java.util.concurrent.SynchronousQueue$TransferStack)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
    at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
    at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-10" #19 prio=5 os_prio=0 tid=0x00007f0c8c2c6000 nid=0x4d5b waiting on condition [0x00007f0c4f0e0000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000775518738> (a java.util.concurrent.SynchronousQueue$TransferStack)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
    at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
    at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-9" #18 prio=5 os_prio=0 tid=0x00007f0c8c2c4800 nid=0x4d5a waiting on condition [0x00007f0c4f1e1000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076e7c74f8> (a java.util.concurrent.Phaser$QNode)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
    at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
    at Ant.call(Ant.java:77)
    at Ant.call(Ant.java:10)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-8" #17 prio=5 os_prio=0 tid=0x00007f0c8c2c2800 nid=0x4d59 waiting on condition [0x00007f0c4f2e2000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076e64fb78> (a java.util.concurrent.Phaser$QNode)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
    at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
    at Ant.call(Ant.java:77)
    at Ant.call(Ant.java:10)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-7" #16 prio=5 os_prio=0 tid=0x00007f0c8c2c1000 nid=0x4d58 waiting on condition [0x00007f0c4f3e3000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076e8b44c8> (a java.util.concurrent.Phaser$QNode)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
    at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
    at Ant.call(Ant.java:77)
    at Ant.call(Ant.java:10)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-6" #15 prio=5 os_prio=0 tid=0x00007f0c8c2bf800 nid=0x4d57 waiting on condition [0x00007f0c4f4e4000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076e5b4500> (a java.util.concurrent.Phaser$QNode)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
    at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
    at Ant.call(Ant.java:77)
    at Ant.call(Ant.java:10)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-5" #14 prio=5 os_prio=0 tid=0x00007f0c8c2bd800 nid=0x4d56 waiting on condition [0x00007f0c4f5e5000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076e836958> (a java.util.concurrent.Phaser$QNode)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
    at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
    at Ant.call(Ant.java:77)
    at Ant.call(Ant.java:10)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-4" #13 prio=5 os_prio=0 tid=0x00007f0c8c2bc000 nid=0x4d55 waiting on condition [0x00007f0c4f6e6000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076e4f4cf0> (a java.util.concurrent.Phaser$QNode)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
    at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
    at Ant.call(Ant.java:77)
    at Ant.call(Ant.java:10)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-3" #12 prio=5 os_prio=0 tid=0x00007f0c8c2ba000 nid=0x4d54 waiting on condition [0x00007f0c4f7e7000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076e40abb8> (a java.util.concurrent.Phaser$QNode)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
    at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
    at Ant.call(Ant.java:77)
    at Ant.call(Ant.java:10)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-2" #11 prio=5 os_prio=0 tid=0x00007f0c8c2b8800 nid=0x4d53 waiting on condition [0x00007f0c4f8e8000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000775518738> (a java.util.concurrent.SynchronousQueue$TransferStack)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
    at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
    at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"pool-1-thread-1" #10 prio=5 os_prio=0 tid=0x00007f0c8c2b5800 nid=0x4d52 waiting on condition [0x00007f0c4f9e9000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076e486ab0> (a java.util.concurrent.Phaser$QNode)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
    at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
    at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
    at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
    at Ant.call(Ant.java:77)
    at Ant.call(Ant.java:10)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

"Service Thread" #9 daemon prio=9 os_prio=0 tid=0x00007f0c8c200800 nid=0x4d50 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread2" #8 daemon prio=9 os_prio=0 tid=0x00007f0c8c1fd800 nid=0x4d4f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" #7 daemon prio=9 os_prio=0 tid=0x00007f0c8c1f8800 nid=0x4d4e waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #6 daemon prio=9 os_prio=0 tid=0x00007f0c8c1f7800 nid=0x4d4d waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Monitor Ctrl-Break" #5 daemon prio=5 os_prio=0 tid=0x00007f0c8c1fb000 nid=0x4d4c runnable [0x00007f0c781b4000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    - locked <0x000000077550ecb0> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    - locked <0x000000077550ecb0> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:64)

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f0c8c181000 nid=0x4d49 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f0c8c14d800 nid=0x4d42 in Object.wait() [0x00007f0c78564000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x0000000775500d08> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
    - locked <0x0000000775500d08> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f0c8c149000 nid=0x4d41 in Object.wait() [0x00007f0c78665000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x0000000775500d48> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
    - locked <0x0000000775500d48> (a java.lang.ref.Reference$Lock)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"main" #1 prio=5 os_prio=0 tid=0x00007f0c8c00c800 nid=0x4d35 waiting on condition [0x00007f0c91f77000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000076dd5e268> (a java.util.concurrent.FutureTask)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
    at java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at java.util.concurrent.AbstractExecutorService.invokeAll(AbstractExecutorService.java:244)
    at ConcurrentACS.loop(ConcurrentACS.java:138)
    at ConcurrentACS.compute(ConcurrentACS.java:165)
    at ConcurrentACS.main(ConcurrentACS.java:192)

"VM Thread" os_prio=0 tid=0x00007f0c8c141800 nid=0x4d3f runnable 

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f0c8c022000 nid=0x4d37 runnable 

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f0c8c024000 nid=0x4d38 runnable 

"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f0c8c025800 nid=0x4d39 runnable 

"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f0c8c027800 nid=0x4d3a runnable 

"VM Periodic Task Thread" os_prio=0 tid=0x00007f0c8c205800 nid=0x4d51 waiting on condition 

JNI global references: 272
  • The number "70000-100000" when you hit the problem is interesting. At about 70000 elements, there is a 50% chance of getting a hash code collision (c.f. birthday problem). Maps handle collisions OK, but it seems more than a coincidence. Do you have an expensive hash code calculation (if you have one at all)? – Bohemian Nov 17 '17 at 16:35
  • @Bohemian I'm not sure what you meant by "expensive hash code calculation." I'm simply calling `Arrays.hashCode()` on my `double` arrays. Is that what you referred to? –  Nov 17 '17 at 18:52
  • The Runnable thread blocked indefinitely when I used jdk 1.6.x version. After moving to JDK 1.7.x, I did not get this issue. – Ravindra babu Nov 18 '17 at 06:11
  • @Ravindrababu Sorry, are you saying that running my code blocked, or that you have experienced something similar in the past? –  Nov 18 '17 at 07:22
  • I shared my experience. Take 10 thread dumps a few minutes apart and check whether any Runnable thread is accessing the CHM in all 10 snapshots. – Ravindra babu Nov 18 '17 at 07:29
  • @Gray Sorry, neither your answer nor Nikita's answer solved the issue, but they are equally helpful, so I upvoted both a while back. I'm not sure if it's appropriate to accept an answer in this case. What do you think? –  Dec 10 '17 at 03:39
  • Certainly up to you. Neither solved the issue but did either answer the question? Here's a good meta question/answer for you to consider: https://meta.stackexchange.com/a/5235/152851 – Gray Dec 10 '17 at 13:59
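
For reference, the birthday-problem figure mentioned in the comments can be checked with the usual approximation p = 1 - exp(-n(n-1) / (2 * 2^32)). This is only a rough sketch: it assumes `Arrays.hashCode` values behave like uniformly random 32-bit integers, which is an idealisation, and the class and method names are made up for illustration.

```java
class BirthdayEstimate {
    /**
     * Approximate probability that n uniformly random 32-bit hash values
     * contain at least one collision (standard birthday approximation).
     */
    static double collisionProbability(long n) {
        double buckets = Math.pow(2, 32); // number of distinct int hash values
        return 1.0 - Math.exp(-(double) n * (n - 1) / (2.0 * buckets));
    }

    public static void main(String[] args) {
        // The call counts reported in the question, plus the ~50% point.
        for (long n : new long[] {70_000, 77_000, 130_000}) {
            System.out.printf("n=%d -> p=%.3f%n", n, collisionProbability(n));
        }
    }
}
```

Under that assumption the probabilities come out at roughly 0.43, 0.50, and 0.86. Because the map in the question is keyed on the bare `int` hash, a collision does not slow the map down; it makes `getMaxIndex()` return the index cached for a different array.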

2 Answers

1

is it possible for ConcurrentHashMap to hang?

The short answer is no if by "hang" you mean some sort of program loop or deadlock. If you are implying that you have discovered a race condition (bug) in that code that would cause it to hang during normal JVM and system execution then I seriously doubt it.

I suspect that something else is going on. The fact that the hanging version uses a CHM doesn't imply that the class has a bug. I would use stack dumps or a profiler to show that the code is stuck on a CHM line before casting any blame that way.
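
As a complement to reading a `jstack` dump by hand, the same checks can be made from inside the JVM. The following is a minimal sketch with made-up class and method names. Note that `ThreadMXBean.findDeadlockedThreads()` only reports cycles over object monitors and ownable synchronizers (such as `ReentrantLock`), so threads that are merely parked on a `Phaser` would not be listed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;

class DeadlockCheck {
    /** Number of threads the JVM reports as deadlocked (0 if none). */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no cycle is found
        return ids == null ? 0 : ids.length;
    }

    /** Counts live threads that have at least one stack frame in the given class. */
    static int threadsInside(String classNamePrefix) {
        int count = 0;
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().startsWith(classNamePrefix)) {
                    count++;
                    break; // count each thread once
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
        System.out.println("threads inside CHM: "
                + threadsInside("java.util.concurrent.ConcurrentHashMap"));
    }
}
```

If both numbers stay at zero while the program is stuck, the map is an unlikely culprit.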

Is it possible to be calling CHM at some large number of times per second so that the performance of your program suffers because of it? Sure. But it wouldn't hang in that it is stuck or deadlocked.

My CPU usage will gradually go up for ~6 seconds, and then it will go down to ~10% after peaked at ~100%. I have plenty of unused memory left. (Does this look like deadlock?)

The stack trace you have now posted shows that no threads are blocked in CHM code, so it doesn't look to be the problem. The performance curve you are describing seems to happen because the fork/join thread-pool that you are using initially starts X threads, but then some of them finish their tasks and exit. This is to be expected. It has nothing to do with the CHM.

if(maxIndices.containsKey(hash)) {
   return maxIndices.get(hash);

Just a quick comment. This code makes 2 calls to the CHM instead of something like:

Integer maxIndex = maxIndices.get(hash);
if (maxIndex != null) {
   return maxIndex;
}
...

But that's just inefficient and wouldn't cause a bug. Also, it is important to recognize that the race condition in your code means that multiple threads might get a null for the same key and each calculate the index value. But this, too, is not a bug that would cause a "hang".

Gray
  • I just added the thread dump (obtained using `jstack`). I couldn't figure out anything from it, so please take a look if you would, thx! Also I only spawned 10 threads, and I don't know why there appear to be 13 threads. Maybe that will give a hint at my mistake? –  Nov 17 '17 at 18:51
  • I see nothing in the stack trace that indicates that CHM is your problem @user8680580. – Gray Nov 18 '17 at 17:28
  • True. I think I'll stop wasting more time on this. This optimization might not help that much with performance anyway. Thanks again for your help : ) –  Nov 18 '17 at 18:08
  • Yeah I think the performance curve you are talking about is more about the fork/join outer thread-pool than anything else. CHM is not the problem @user8680580. – Gray Nov 18 '17 at 19:00
  • Your optimization is definitely valid and worth considering. By "this optimization" I actually was referring to my idea of using hash map to skip full array scans, which is the context of my problem. Whether that can really enhance performance is one thing, but at least hashing arrays is not that easy--collisions are common. So what I was trying to say is that instead of going down that path I'll just do full array scans for now. This weird deadlock bug of mine is hard to fix anyway. –  Nov 18 '17 at 23:49
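
On the remark that hashing arrays is collision-prone: one way to keep a cache without ever returning another array's index is to key the map on the array contents instead of the bare `int` hash, so that `equals` separates colliding arrays. This is a minimal sketch rather than the asker's code. `ArrayKey`, `MaxIndexCache`, and `scan` are hypothetical names, and it assumes a defensive copy per lookup is affordable.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Immutable map key that compares double arrays by content. */
final class ArrayKey {
    private final double[] values;
    private final int hash;

    ArrayKey(double[] source) {
        this.values = source.clone(); // copy so later writes cannot corrupt the key
        this.hash = Arrays.hashCode(values);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ArrayKey && Arrays.equals(values, ((ArrayKey) other).values);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}

class MaxIndexCache {
    private final Map<ArrayKey, Integer> maxIndices = new ConcurrentHashMap<>();

    int getMaxIndex(double[] arr) {
        // Atomic check-then-compute; colliding hashes are resolved by equals().
        return maxIndices.computeIfAbsent(new ArrayKey(arr), key -> scan(arr));
    }

    /** Index of the first maximum, or -1 for an empty array. */
    static int scan(double[] arr) {
        int best = -1;
        for (int i = 0; i < arr.length; i++) {
            if (best < 0 || arr[i] > arr[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

For example, `{0.0, 0.5}` and `{Double.longBitsToDouble(0x3FF000003FF00000L), 0.5}` have the same `Arrays.hashCode` but different max indices (1 and 0), and this cache keeps them apart. The trade-off is that building and comparing keys is itself linear in the array length, so the saving over a plain scan may be small.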
0

The first version isn't thread-safe because your check-then-act sequence isn't atomic. Try this implementation:

  private final Map<Integer, Integer> maxIndices = new ConcurrentHashMap<>();

  int getMaxIndex(final double[] arr) {
    // make sure the content of the arr can't be modified concurrently
    // otherwise create a copy of the array in this method
    int hash = Arrays.hashCode(arr);
    return maxIndices.computeIfAbsent(hash,
        key -> IntStream.range(0, arr.length).reduce((a, b) -> arr[a] < arr[b] ? b : a).orElse(-1));
  }
Mikita Harbacheuski
  • I've tried your code but it also did not work...Thanks for the idea on atomicity though. Going off from this, why do we still need atomicity when using CHM? If the actions are atomic wouldn't a regular `HashMap` work too? –  Nov 17 '17 at 18:45
  • Sorry I was wrong on the last point. Atomicity alone is not enough, we still need some sort of synchronization. –  Nov 17 '17 at 19:23
  • You don't need additional synchronization when CHM is used; however, a sequence of several calls is not atomic as a whole, because another thread can change the state of the CHM in between. So CHM provides several methods that perform compound actions atomically, like `computeIfAbsent` and `putIfAbsent`. According to the thread dump, none of your cached thread pool threads is running at all. Some of them are waiting for work, the others on the phaser. None of them is blocked on CHM. At what moment did you take the thread dump? – Mikita Harbacheuski Nov 18 '17 at 12:03
  • I captured the dump after the CPU usage went down, so presumably the program has already deadlocked (or whatever is happening). I don't have a good explanation on why none of the threads is blocking on CHM operations, so I also am leaning toward saying that it is my own mistake causing the issue, but what puzzles me is that if I don't use a map at all (CHM or HM + `synchronized`), my code runs just fine. Also, as I said in my reply to Gray, I don't know why there are 13 threads running, as I only spawned 10 threads. –  Nov 18 '17 at 16:47
  • It seems that you use a cached thread pool, is that correct? If so, additional threads can be created under load. According to the thread dump, I would say that Phaser might be the issue if you rely on the number 10 somewhere. – Mikita Harbacheuski Nov 19 '17 at 10:52
  • That is very likely, as `Phaser` shows up a lot in the thread dump. Why the use of the map would cause my issue is still a mystery, though... –  Nov 19 '17 at 16:52
  • It's hard to say without a full code sample. It probably makes sense to post it here or as a separate topic. – Mikita Harbacheuski Nov 19 '17 at 19:47
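
One hypothesis that would fit the dump (nine pool threads parked in `Phaser.arriveAndAwaitAdvance`, four idle, and `main` waiting in `invokeAll`) is that a worker task threw an exception before reaching the barrier, for instance an `ArrayIndexOutOfBoundsException` from a wrong cached index. Nothing in this thread confirms that, and the asker's full code is not shown. The sketch below only illustrates the mechanism with made-up names, using a timeout to stand in for waiting forever.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Phaser;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class PhaserStallDemo {
    /** Returns true if one task failed silently and the other timed out at the barrier. */
    static boolean survivorStalls() {
        Phaser phaser = new Phaser(2); // two registered parties, like two workers
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // Worker 1: a bad index throws before the barrier is reached.
            Future<Integer> failing = pool.submit(() -> {
                int[] remaining = new int[3];
                int badIndex = -1;               // hypothetical wrong index
                int value = remaining[badIndex]; // ArrayIndexOutOfBoundsException
                phaser.arriveAndAwaitAdvance();  // never executed
                return value;
            });
            // Worker 2: arrives and waits for the party that will never come.
            Future<Boolean> survivor = pool.submit(() -> {
                int phase = phaser.arrive();
                try {
                    phaser.awaitAdvanceInterruptibly(phase, 500, TimeUnit.MILLISECONDS);
                    return false; // barrier tripped: no stall
                } catch (TimeoutException e) {
                    return true;  // still waiting after the timeout
                }
            });
            boolean failedSilently;
            try {
                failing.get();
                failedSilently = false;
            } catch (ExecutionException e) {
                // The exception only surfaces when the Future is inspected.
                failedSilently = e.getCause() instanceof ArrayIndexOutOfBoundsException;
            }
            return failedSilently && survivor.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("survivor stalled: " + survivorStalls());
    }
}
```

If something like this is happening, wrapping the worker body in try/finally with `phaser.arriveAndDeregister()` in the failure path, and calling `get()` on every `Future`, would at least make the failure visible instead of leaving the other workers parked.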