
I cannot build guava v21.0 because the test com.google.common.util.concurrent.ListenerCallQueueTest hangs forever:

$ git clone https://github.com/google/guava
$ cd guava
$ git tag
$ git checkout v21.0
$ mvn package

[...]
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.177 sec
Running com.google.common.util.concurrent.ListenerCallQueueTest

It seems I have three sleeping processes:

$ ps aux | grep guava | grep -v grep

john     23619 16.6 12.5 4531216 1016192 pts/1 Sl+  07:47   4:43 /usr/lib/jvm/java-8-openjdk-amd64/bin/java -classpath /usr/share/maven/boot/plexus-classworlds-2.x.jar -Dclassworlds.conf=/usr/share/maven/bin/m2.conf -Dmaven.home=/usr/share/maven -Dmaven.multiModuleProjectDirectory=/home/john/Libs/guava org.codehaus.plexus.classworlds.launcher.Launcher clean install
john     26401  0.0  0.0   4292   756 pts/1    S+   07:55   0:00 /bin/sh -c cd /home/john/Libs/guava/guava-tests && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx1536M -Duser.language=hi -Duser.country=IN -jar /home/john/Libs/guava/guava-tests/target/surefire/surefirebooter6901955962891879666.jar /home/john/Libs/guava/guava-tests/target/surefire/surefire4499145904440222523tmp /home/john/Libs/guava/guava-tests/target/surefire/surefire2603198880854108081tmp
john     26403 68.8 14.4 4137984 1167932 pts/1 Sl+  07:55  14:07 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx1536M -Duser.language=hi -Duser.country=IN -jar /home/john/Libs/guava/guava-tests/target/surefire/surefirebooter6901955962891879666.jar /home/john/Libs/guava/guava-tests/target/surefire/surefire4499145904440222523tmp /home/john/Libs/guava/guava-tests/target/surefire/surefire2603198880854108081tmp

This situation occurs on my debian stretch machine:

$ uname -a

Linux front 4.8.0-1-amd64 #1 SMP Debian 4.8.5-1 (2016-10-28) x86_64 GNU/Linux

$ mvn -version

Apache Maven 3.3.9
Maven home: /usr/share/maven
Java version: 1.8.0_121, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-8-openjdk-amd64/jre
Default locale: en_GB, platform encoding: UTF-8
OS name: "linux", version: "4.8.0-1-amd64", arch: "amd64", family: "unix"

The same procedure leads to a successful build on my debian jessie machine:

Apache Maven 3.0.5
Maven home: /usr/share/maven
Java version: 1.8.0_111, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-8-openjdk-amd64/jre
Default locale: en_GB, platform encoding: UTF-8
OS name: "linux", version: "3.16.0-4-amd64", arch: "amd64", family: "unix"

Following suggestions found in this post, I have achieved a successful build with:

$ mvn clean install -Dmaven.test.skip

However, running mvn clean install instead of mvn clean package leads to the same hang.

Any suggestion is greatly appreciated.

Sven Williamson
  • Hmm, thanks for reporting this. Can you try a couple of things? 1) Run `mvn clean install` with `-Dtest.include="**/ListenerCallQueueTest.java"` and see if it still hangs when running just that test. 2) When you do have a hang, run `jstack <pid>` with the `<pid>` of the process that is directly running `java ...surefirebooter` (26403 in your example above), and share the resulting stack dump? – Chris Povirk Mar 27 '17 at 13:55
  • @ChrisPovirk 1) Running `mvn clean install` with `-Dtest.include="**/ListenerCallQueueTest.java"` leads to a hanging situation with 2 processes. I have run `jstack` on both. 2) Running `mvn clean install` leads to an (already reported) hanging situation. Unfortunately `jstack` fails on the process you care about (so I ran `jstack -F`). Just in case, I ran `jstack` on the other two processes. All results can be found [here](https://github.com/possientis/Prog/tree/master/guava). Please let me know if you need anything else. – Sven Williamson Mar 28 '17 at 06:47
  • Thanks, that's great. Are you building at tag `v21.0`? – Chris Povirk Mar 28 '17 at 13:14
  • @ChrisPovirk My `git status` says "HEAD detached at v21.0, nothing to commit, working tree clean." (I previously ran `git checkout v21.0`). I am not totally sure about this, but I assume this means I am. – Sven Williamson Mar 28 '17 at 16:15
  • I think so, thanks. It looks like your `-Dtest.include` run may have successfully run `ListenerCallQueueTest` and then moved on to the GWT tests (which take something like 8 hours to run, so they look like *another* hang). I wonder if maybe Maven is running out of memory or something when it runs a bunch of tests before `ListenerCallQueueTest`. (I haven't been able to reproduce the problem, so an environmental cause seems plausible.) If so, we might see evidence in `guava-tests/target/surefire-reports/com.google.common.util.concurrent.ListenerCallQueueTest-output.txt`. Can you share yours? – Chris Povirk Mar 29 '17 at 17:32
  • I should also say explicitly: It's unlikely that you're doing anything wrong. Feel free to just comment out the problematic test locally for your purposes :) (Also, pass `-Dgwt.test.include=**/DoesNotMatchAnyTests.java` so that you don't have to wait 8 hours for GWT tests. I'll disable them by default soon, but that won't affect `v21.0`.) I'm hoping that there's no problem with `ListenerCallQueue` or its tests, just with a machine that happens to allocate insufficient memory or something, but I'm interested in case there is a real problem there. It's up to you if you want to help me dig. – Chris Povirk Mar 29 '17 at 17:39
  • @ChrisPovirk 1. When I run `mvn clean install` and eventually stop the hanging build, no surefire-reports directory seems to be produced (if you think I am wrong, let me know and I'll try again). 2. When running `mvn clean install -Dtest.include="**/ListenerCallQueueTest.java" -Dgwt.test.include="**/DoesNotMatchAnyTests.java"` then, as you expected, the build is successful and reports are [generated](https://github.com/possientis/Prog/tree/master/guava). Let me know if you need me to try anything else. – Sven Williamson Mar 29 '17 at 18:51
  • Thanks. When I edit the test to intentionally hang on my machine, I see a partial `surefire-reports` file that cuts off mid-line. (Even calling `flush()` on the logger's handlers doesn't help, nor does switching to `System.err.println`.) That suggests that we're not guaranteed to see any output until the test completes (maybe even later). Of course, it's also possible that the test is hanging before it logs anything... hard to know :( The next thing to try is probably to remove every `queue.add(THROWING_CALLBACK);` line from the test and see if it still hangs then. But I'm guessing wildly now. – Chris Povirk Mar 29 '17 at 19:46
  • @ChrisPovirk Having commented out the line `queue.add(THROWING_CALLBACK)` (8 lines) in `ListenerCallQueueTest.java` and running a fresh `mvn clean install`, the test no longer hangs, but `ServiceManagerTest` is now hanging. While it is hanging the two corresponding surefire reports are empty. After I kill the hanging build, the two corresponding surefire reports are still empty. – Sven Williamson Mar 30 '17 at 07:30
  • @ChrisPovirk Similarly, when running `mvn clean install` on the original code (no line is commented out), while hanging the two surefire reports relating to `ListenerCallQueueTest` are empty. A few seconds after I kill the hanging build, they are still empty. – Sven Williamson Mar 30 '17 at 07:35
  • Thanks. `ServiceManagerTest` is using `ListenerCallQueue`, so it's still possible that there's a bug in `ListenerCallQueue`. We recently edited `ListenerCallQueue`, so I'm wondering if that helps. What happens if you build at `master` instead of `v21.0`? Just so that we're talking about the same commit, let's say `git fetch && git checkout 99d61226b6da2f50d3b5b2c80434b6f93a82899e`. After that, you can `git checkout v21.0` to get back where you were. – Chris Povirk Mar 31 '17 at 16:03
  • @ChrisPovirk On that commit, the `ListenerCallQueueTest` tests are successful, but the build `mvn clean install` eventually hangs with `Running com.google.gwt.junit.tools.GWTTestSuite@548a9f61`, `Starting Jetty on port 0`, `[WARN] ServletContainerInitializers: detected. Class hierarchy: empty`. I have saved the `ps` and `jstack` reports [here](https://github.com/possientis/Prog/tree/master/guava). – Sven Williamson Mar 31 '17 at 21:30
  • Thanks again. I am out of ideas for now :( I am checking with a couple colleagues (one of whom is out this week) for other suggestions. – Chris Povirk Apr 03 '17 at 17:42
  • @ChrisPovirk I am very sorry: for some reason I have had to install the same OS on two other identical machines, and I was then able to build `v21.0` without any problem. So it looks like your hypothesis of an 'environmental cause' is very likely. I feel I have been wasting your time :( – Sven Williamson Apr 04 '17 at 06:03
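
The comments above ask for a stack dump of the process that is directly running `java ...surefirebooter`, as opposed to the `/bin/sh -c` wrapper or the Maven launcher. A minimal sketch of how that PID could be picked out of `ps aux` output follows. The helper name `surefire_pid` is hypothetical, and the sketch assumes the standard `ps aux` column layout, where the command starts in field 11.

```shell
# Hypothetical helper: read `ps aux`-style lines on stdin and print the PID
# of every process whose command is a java binary launched with a
# surefirebooter jar. The /bin/sh wrapper is skipped because its command
# (field 11) is /bin/sh, and the Maven launcher is skipped because its
# command line does not mention surefirebooter.
surefire_pid() {
  awk '/surefirebooter/ && $11 ~ /java$/ { print $2 }'
}
```

With the listing from the question, `ps aux | surefire_pid` would select the third process, and the dump could then be taken with `jstack "$(ps aux | surefire_pid)"` (or `jstack -F` if the plain call fails, as it did here).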
