1

We are using WildFly 10.1.0.Final.

We encountered an OutOfMemoryError caused by a steadily growing number of threads.

After examining the thread dump, we found thousands of Remoting "endpoint" task-N threads.

What are Remoting "endpoint" task-N threads for?

Are they created by jboss-remoting?

Right after a server restart, there were only 16 of these threads:

Remoting "endpoint" task-1 ~ Remoting "endpoint" task-16.

After the server has run for several days or months, there may be hundreds or thousands of them.

A snippet of the thread dump is listed below. In it, several threads are named "Remoting "endpoint" task-11" but have different thread numbers (#55415, #55417, #55414). The same is true of the other names, task-1 to task-16.

All of these threads were doing nothing but waiting.

"Remoting "endpoint" task-11" #55415 daemon prio=5 os_prio=0 tid=0x00007f2b8c0a8000 nid=0x276e waiting on condition [0x00007f280a36c000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabee2b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-11" #55417 daemon prio=5 os_prio=0 tid=0x00007f2ba003f800 nid=0x276d waiting on condition [0x00007f2794818000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabecf40> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-11" #55414 daemon prio=5 os_prio=0 tid=0x00007f2b98023800 nid=0x276b waiting on condition [0x00007f2792bfc000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabeda50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-10" #55411 daemon prio=5 os_prio=0 tid=0x00007f2ba003e000 nid=0x276a waiting on condition [0x00007f27926f7000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabecf40> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-10" #55413 daemon prio=5 os_prio=0 tid=0x00007f2b8c0a7000 nid=0x2769 waiting on condition [0x00007f27927f8000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabee2b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-10" #55412 daemon prio=5 os_prio=0 tid=0x00007f2b98022800 nid=0x2768 waiting on condition [0x00007f27c4815000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabeda50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-9" #55372 daemon prio=5 os_prio=0 tid=0x00007f2c7408f000 nid=0x41df waiting on condition [0x00007f27907d8000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-8" #55369 daemon prio=5 os_prio=0 tid=0x00007f2c7408d000 nid=0x41dd waiting on condition [0x00007f27909da000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-7" #55368 daemon prio=5 os_prio=0 tid=0x00007f2c7408b000 nid=0x41dc waiting on condition [0x00007f2790adb000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-6" #55367 daemon prio=5 os_prio=0 tid=0x00007f2c74089000 nid=0x41db waiting on condition [0x00007f2790bdc000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-5" #55366 daemon prio=5 os_prio=0 tid=0x00007f2c74087000 nid=0x41da waiting on condition [0x00007f2790cdd000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-4" #55365 daemon prio=5 os_prio=0 tid=0x00007f2c74085000 nid=0x41d9 waiting on condition [0x00007f2790dde000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-9" #55364 daemon prio=5 os_prio=0 tid=0x00007f2bd813c000 nid=0x41d8 waiting on condition [0x00007f2790edf000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabed500> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)


"Remoting "endpoint" task-9" #55363 daemon prio=5 os_prio=0 tid=0x00007f2bf4044000 nid=0x41d7 waiting on condition [0x00007f2790fe0000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x00000006eabee3c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

....
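
To put numbers on the duplicated names, the threads can also be counted from inside the JVM. The following is only a minimal sketch, not something from our application: the class name ThreadCensus and the method countByName are made up for illustration, and it relies solely on the standard Thread.getAllStackTraces() call. Several live threads sharing one name, as in the dump above, would be consistent with multiple pools that each number their threads from task-1.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: counts live threads per thread name. */
public final class ThreadCensus {

    /** Returns name -> number of live threads whose name starts with the given prefix. */
    public static Map<String, Integer> countByName(String prefix) {
        Map<String, Integer> counts = new TreeMap<>();
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            String name = thread.getName();
            if (name.startsWith(prefix)) {
                counts.merge(name, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prefix taken from the thread names shown in the dump.
        countByName("Remoting \"endpoint\" task-")
                .forEach((name, count) -> System.out.println(count + "\t" + name));
    }

    private ThreadCensus() {
    }
}
```

A count greater than 1 for any single name means more than one thread with that name is alive at the same time.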

Update (2018-09-03):

I found that the Remoting "endpoint" task threads are created by XNIO, and there is an XNIO issue that looks very similar to our scenario:

https://issues.jboss.org/browse/XNIO-285

The issue says it was fixed in XNIO 3.6.0.Beta1. Unfortunately, WildFly 10.1.0 uses XNIO 3.4.0. When I tried to upgrade to XNIO 3.6.5, I got a java.lang.NoClassDefFoundError for org/wildfly/common/context/Contextual. After upgrading to wildfly-common-1.4.0.Final.jar, which contains the class org/wildfly/common/context/Contextual, the NoClassDefFoundError was still there.

Is there any other way to prevent Remoting "endpoint" task threads from growing?

2 Answers

0

You might be using a scoped EJB client context for remote method invocation.

Every scoped EJB client context creates a new thread, and simply calling context.close() on the JNDI context does not close it, which would explain the OutOfMemoryError.

How to close scoped EJB client contexts?

The answer is the same: use the close() method on the EJB client context. The real question is how to get hold of the scoped EJB client context that is associated with a JNDI context. Before we get to that, it's important to understand how the ejb: JNDI namespace used for EJB lookups and the JNDI context (typically the InitialContext that you see in client code) are related. The JNDI API provided by Java allows a "URL context factory" to be registered with the JNDI framework (for details, see http://docs.oracle.com/javase/jndi/tutorial/provider/url/factory.html). As that documentation states, a URL context factory can be used to resolve URL strings during a JNDI lookup. That is what the ejb: prefix is when you do a remote EJB lookup: the ejb: URL string is backed by a URL context factory.

Internally, when a lookup happens for an ejb: URL string, a javax.naming.Context is created for that ejb: lookup. Let's look at some code for a better understanding:

// JNDI context "A"
Context jndiCtx = new InitialContext(props);
// Now let's look up an EJB (lookup() returns Object, so a cast is needed)
MyBean bean = (MyBean) jndiCtx.lookup("ejb:app/module/distinct/bean!interface");

So we first create a JNDI context and then use it to look up an EJB. The bean lookup using the ejb: JNDI name is a single statement, but it involves a few more things under the hood. When you look up that string, a separate javax.naming.Context is created for the ejb: URL string. This new javax.naming.Context is then used to look up the rest of the string in that JNDI name.

Let's break up that one line into multiple statements to understand better:

// Remember, ejb: is backed by a URL context factory, which returns a Context for the ejb: URL (that's why it's called a context factory)
final Context ejbNamingContext = (Context) jndiCtx.lookup("ejb:");
// Use the returned EJB naming context to look up the rest of the JNDI string for the EJB
final MyBean bean = (MyBean) ejbNamingContext.lookup("app/module/distinct/bean!interface");

Above, we split the single statement into two to show the details. When the ejb: URL string is parsed in a JNDI name, the lookup gets hold of a javax.naming.Context instance. This instance is different from the one that was used to do the lookup (jndiCtx in this example). This is an important detail, for reasons explained later. The returned instance is then used to look up the rest of the JNDI string ("app/module/distinct/bean!interface"), which returns the EJB proxy. Whether the lookup is done in a single statement or in multiple parts, the code works the same way: an instance of javax.naming.Context is created for the ejb: URL string.

So why am I explaining all this when the section is titled "How to close scoped EJB client contexts"? Because client applications that use scoped EJB client contexts associated with a JNDI context would expect the following code to close the associated EJB client context, and will be surprised that it doesn't:

final Properties props = new Properties();
// mark it for scoped EJB client context
props.put("org.jboss.ejb.client.scoped.context","true");
// add other properties
props.put(....);
...
Context jndiCtx = new InitialContext(props);
try {
    final MyBean bean = (MyBean) jndiCtx.lookup("ejb:app/module/distinct/bean!interface");
    bean.doSomething();
} finally {
    jndiCtx.close();
}

Applications expect that the call to jndiCtx.close() will close the EJB client context associated with the JNDI context. That doesn't happen because, as explained previously, the javax.naming.Context backing the ejb: URL string is a different instance from the one the code is closing. The JNDI implementation in Java closes only the context on which close() was called. As a result, the javax.naming.Context that backs the ejb: URL string is never closed, so the scoped EJB client context is not closed either, and ultimately the connections to the server(s) in that EJB client context stay open.

So now let's see how this can be done properly. We know that the ejb: URL string lookup returns us a javax.naming.Context. All we have to do is keep a reference to this instance and close it when we are done with the EJB invocations. So here's how it's going to look:

final Properties props = new Properties();
// mark it for scoped EJB client context
props.put("org.jboss.ejb.client.scoped.context","true");
// add other properties
props.put(....);
...
Context jndiCtx = new InitialContext(props);
Context ejbRootNamingContext = (Context) jndiCtx.lookup("ejb:");
try {
    final MyBean bean = (MyBean) ejbRootNamingContext.lookup("app/module/distinct/bean!interface"); // the rest of the EJB JNDI string
    bean.doSomething();
} finally {
    try {
        // close the EJB naming JNDI context
        ejbRootNamingContext.close();
    } catch (Throwable t) {
        // log and ignore
    }
    try {
        // also close our other JNDI context since we are done with it too
        jndiCtx.close();
    } catch (Throwable t) {
        // log and ignore
    }

}

As you see, we changed the code to first do a lookup on just the "ejb:" string to get hold of the EJB naming context and then used that ejbRootNamingContext instance to lookup the rest of the EJB JNDI name to get hold of the EJB proxy. Then when it was time to close the context, we closed the ejbRootNamingContext (as well as the other JNDI context). Closing the ejbRootNamingContext ensures that the scoped EJB client context associated with that JNDI context is closed too. Effectively, this closes the connection(s) to the server(s) within that EJB client context.
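
If this pattern is needed in several places, the lookup-then-close-both sequence can be wrapped in a small helper. This is only a sketch under assumptions: ScopedEjbLookup, EjbAction and withEjbRoot are hypothetical names, not part of the EJB client API, and the code uses nothing beyond javax.naming. It looks up "ejb:" on the context it is given, hands the resulting naming context to the caller's action, and then closes the EJB naming context first and the outer JNDI context second.

```java
import javax.naming.Context;
import javax.naming.NamingException;

/** Hypothetical helper that always closes both JNDI contexts. */
public final class ScopedEjbLookup {

    /** Work to perform with the context returned by the "ejb:" lookup. */
    @FunctionalInterface
    public interface EjbAction<T> {
        T apply(Context ejbRootNamingContext) throws Exception;
    }

    /**
     * Looks up "ejb:" on jndiCtx, runs the action with the result,
     * then closes the EJB naming context and jndiCtx, in that order.
     */
    public static <T> T withEjbRoot(Context jndiCtx, EjbAction<T> action) throws Exception {
        Context ejbRootNamingContext = null;
        try {
            ejbRootNamingContext = (Context) jndiCtx.lookup("ejb:");
            return action.apply(ejbRootNamingContext);
        } finally {
            closeQuietly(ejbRootNamingContext);
            closeQuietly(jndiCtx);
        }
    }

    /** Closes a context, logging instead of propagating any failure. */
    private static void closeQuietly(Context ctx) {
        if (ctx == null) {
            return;
        }
        try {
            ctx.close();
        } catch (NamingException | RuntimeException e) {
            System.err.println("Failed to close JNDI context: " + e);
        }
    }

    private ScopedEjbLookup() {
    }
}
```

A caller would pass the InitialContext built from its properties and do the lookup of the rest of the JNDI name plus the bean invocation inside the action, so that no proxy is used after the contexts are closed.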

For more details, see Scoped EJB client contexts.

Snehal Patel
  • Thank you, Snehal! I've checked our code, and we don't set props.put("org.jboss.ejb.client.scoped.context","true"), so I don't think we are using a scoped EJB client context. But I will add ejbRootNamingContext.close() to see if it avoids the problem. – Chihpeng Lin Aug 30 '18 at 09:53
  • @ChihpengLin Let me know if the problem persists. We faced a similar problem a while ago and solved it this way. – Snehal Patel Aug 30 '18 at 10:44
  • Thanks for your help. Did you notice the number of Remoting "endpoint" task threads when you encountered the OutOfMemory problem? When we hit it, there were thousands of Remoting "endpoint" task threads, but all of them were waiting and doing nothing. – Chihpeng Lin Sep 03 '18 at 07:15
  • We saw similar Remoting "endpoint" task threads in our thread dump. Can you please share your EJB lookup and remote execution code? – Snehal Patel Sep 04 '18 at 06:11
  • Here is how we create InitialContext and lookup ejb: prop.put(Context.INITIAL_CONTEXT_FACTORY, "org.jboss.naming.remote.client.InitialContextFactory"); prop.put(Context.PROVIDER_URL, "http-remoting://" + ip + ":" + port); prop.put("jboss.naming.client.ejb.context", true); Context ctx = new InitialContext(prop); MyBean bean = (MyBean) ctx.lookup(jndiName); – Chihpeng Lin Sep 04 '18 at 10:11
0

I found that the Remoting "endpoint" task threads were created by javax.management.remote.JMXConnector. We had opened several JMXConnector instances to access MBeans on other servers but never closed them. After we closed those JMXConnector instances, the threads went away.

In our setup, javax.management.remote.JMXConnector uses XNIO to communicate with the remote MBeans. Opening a connector creates an XnioWorker, and the XnioWorker creates the Remoting "endpoint" task threads. So the problem was not caused by EJB.
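
Since JMXConnector extends java.io.Closeable, one way to make sure a connector never stays open is try-with-resources. The following is a minimal sketch rather than our actual code: JmxQuery, Query and withConnector are hypothetical names, and only standard javax.management classes are used.

```java
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Hypothetical helper that scopes a JMXConnector to a single query. */
public final class JmxQuery {

    /** Work to perform against the remote MBean server. */
    @FunctionalInterface
    public interface Query<T> {
        T run(MBeanServerConnection connection) throws Exception;
    }

    /** Connects to the given URL, runs the query, and always closes the connector. */
    public static <T> T withConnector(JMXServiceURL url, Query<T> query) throws Exception {
        return withConnector(JMXConnectorFactory.connect(url), query);
    }

    /** Runs the query on an already connected connector, then closes it. */
    public static <T> T withConnector(JMXConnector connector, Query<T> query) throws Exception {
        // try-with-resources calls connector.close() even if the query throws.
        try (JMXConnector c = connector) {
            return query.run(c.getMBeanServerConnection());
        }
    }

    private JmxQuery() {
    }
}
```

For example, JmxQuery.withConnector(url, c -> c.getMBeanCount()) would return the MBean count and close the connector before returning. If connectors are opened frequently, a single long-lived connector per remote server that is closed on shutdown may be the better design; the important part is that every opened connector is eventually closed.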