
I have a network of Java threads (Flow-Based Programming) communicating via fixed-capacity channels, running under Windows XP. Based on our experience with "green" (non-preemptive) threads, we expected that threads would switch context less often (thus reducing CPU time) if the channels were made bigger. However, we found that increasing channel size makes no difference to the run time. What seems to be happening is that Java decides to switch threads even though the channels aren't full or empty (i.e. even though a thread doesn't have to suspend), which costs CPU time for no apparent advantage. Changing thread priorities doesn't make any observable difference either.

My question is whether there is some way of persuading Java not to make unnecessary context switches, but to hold off until a switch is really necessary - is there some way of changing Java's dispatching logic? Or is it reacting to something I didn't pay attention to?! Or are there other asynchronism mechanisms I should try, e.g. thread factories, Runnables, maybe even daemons (!)? The answer appears to be non-obvious, as so far none of my correspondents has come up with one (including, most recently, two CS profs). Or maybe I'm missing something that's so obvious that people can't imagine my not knowing it...

I've added the send and receive code below - not very elegant, but it seems to work... ;-) In case you are wondering, I thought the goLock logic in send() might be causing the problem, but removing it temporarily didn't make any difference.

public synchronized Packet receive() {
    if (isDrained()) {
        return null;
    }

    while (isEmpty()) {
        try {
            wait();
        } catch (InterruptedException e) {
            close();
            return null;
        }
        if (isDrained()) {
            return null;
        }
    }

    if (isDrained()) {
        return null;
    }
    if (isFull()) {
        notifyAll(); // notify other components waiting to send
    }

    Packet packet = array[receivePtr];
    array[receivePtr] = null;
    receivePtr = (receivePtr + 1) % array.length;
    // notifyAll(); // only needed if it was full
    usedSlots--;

    packet.setOwner(receiver);

    if (null == packet.getContent()) {
        traceFuncs("Received null packet");
    } else {
        traceFuncs("Received: " + packet.toString());
    }

    return packet;
}

synchronized boolean send(final Packet packet, final OutputPort op) {
    sender = op.sender;

    if (isClosed()) {
        return false;
    }

    while (isFull()) {
        try {
            wait();
        } catch (InterruptedException e) {
            indicateOneSenderClosed();
            return false;
        }
        sender = op.sender;
    }

    if (isClosed()) {
        return false;
    }

    try {
        receiver.goLock.lockInterruptibly();
    } catch (InterruptedException ex) {
        return false;
    }

    try {
        packet.clearOwner();
        array[sendPtr] = packet;
        sendPtr = (sendPtr + 1) % array.length;
        usedSlots++;
        if (receiver.getStatus() == StatusValues.DORMANT
                || receiver.getStatus() == StatusValues.NOT_STARTED) {
            receiver.activate(); // start or wake up the receiver if necessary
        } else {
            notifyAll(); // notify the receiver; other components waiting to
                         // send to this connection may also get notified,
                         // but that is handled by the while loop above
        }

        sender = null;
        Component.network.active = true;
    } finally {
        receiver.goLock.unlock();
    }
    return true;
}
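For comparison (my own sketch, not part of JavaFBP): since Java 5, the standard library's ArrayBlockingQueue implements exactly this bounded-channel discipline. put() parks the sender only when the channel is full, and take() parks the receiver only when it is empty, so it makes a useful baseline for checking whether the hand-rolled wait/notify code above is triggering extra switches. The Channel class name is hypothetical:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of a fixed-capacity FBP channel built on ArrayBlockingQueue.
// The JVM parks the calling thread only at the points where FBP actually
// requires suspension: a full channel for senders, an empty one for receivers.
public class Channel<T> {
    private final BlockingQueue<T> queue;

    public Channel(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    public void send(T packet) throws InterruptedException {
        queue.put(packet); // blocks while the channel is full
    }

    public T receive() throws InterruptedException {
        return queue.take(); // blocks while the channel is empty
    }
}
```

Note this cannot prevent the OS from preempting a runnable thread at the end of its time slice; it only avoids any voluntary suspension beyond the full/empty cases.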


Thanks for asking! I have been discussing the same question on the Sun forum; here is my latest post there:

Our best guess right now is that this effect results from Windows' scheduling logic.

Microsoft seems to acknowledge that this area needs improvement, as it is introducing User-Mode Scheduling (UMS) - I quote: "UMS is recommended for applications with high performance requirements that need to efficiently run many threads concurrently on multiprocessor or multicore systems. ... UMS is available starting with 64-bit versions of Windows 7 and Windows Server 2008 R2. This feature is not available on 32-bit versions of Windows." Hopefully, Java will take advantage of UMS in some later release.

Thanks for your help!

Paul Morrison
  • What JDK are you using? Try to run the same under Linux server... if it runs better, it is XP. – ReneS Nov 19 '09 at 19:04
  • Any news on that? Did you find a solution? – ReneS Nov 23 '09 at 02:03
  • It is worth noting that Java uses the OS's threads for JSE and JEE. This means you are at the mercy of how your OS operates. For better control of context switching you could use a real-time operating system. However, your problem may be that your OS is trying to be fair to the threads in your process, and you have more threads than cores, so it has to swap them around. I would suggest you have fewer threads than cores; then it should switch much less. – Peter Lawrey May 11 '10 at 20:35

3 Answers


Green threads are gone (maybe Solaris still supports them, but I doubt it). Also, Java does not switch threads; the OS does that. The only thing Java does is signal to the OS, via OS functions, that a thread is idle/waiting/blocked. So if your program hits any synchronisation point, or does Thread.wait/sleep, it signals that it does not need the CPU anymore.

Besides that, the OS maintains time slices and will take the CPU away from a thread, even though it could still run, when other threads are waiting for the CPU.
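A small sketch of mine (not from the original post) that makes the first point visible: a thread blocked in wait() has handed the CPU back to the OS, the JVM reports it as WAITING, and it consumes no time slices until it is notified:

```java
// Demonstrates that a thread parked in Object.wait() is reported as WAITING
// by the JVM, i.e. the OS has been told it no longer needs the CPU.
public class WaitDemo {

    // Start a thread, let it park itself in wait(), observe its state,
    // then wake it and let it finish.
    static Thread.State parkAndObserve() throws InterruptedException {
        final Object lock = new Object();
        Thread t = new Thread(() -> {
            synchronized (lock) {
                try {
                    lock.wait(); // hands the CPU back to the OS
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        t.start();

        // poll until the thread has actually parked itself
        while (t.getState() != Thread.State.WAITING) {
            Thread.sleep(10);
        }
        Thread.State observed = t.getState();

        synchronized (lock) {
            lock.notifyAll(); // wake it; the OS schedules it again
        }
        t.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(parkAndObserve()); // prints WAITING
    }
}
```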

Can you publish some more code here?

ReneS
  • Just forgot to mention: memory allocation is always a moment when the OS might push your thread off the CPU... – ReneS Nov 19 '09 at 01:29
  • Is there any way to increase the length of OS time slices? – Paul Morrison Nov 19 '09 at 16:15
  • Yes, depends on your OS and what mode it is running in. For instance, Ubuntu Server runs different slices than Ubuntu Desktop (ask at serverfault.com). But try to bind your process to one CPU first and check whether it runs better. – ReneS Nov 19 '09 at 19:02

I'm a bit embarrassed - it suddenly occurred to me this afternoon that maybe the network whose performance I was worried about was just too simple, as I only had two processes, and two processors. So Windows may have been trying too hard to keep the processors balanced! So I wondered what would happen if I gave Windows lots of processes.

I set up two networks:

a) 50 Generate components feeding 50 Discard components - i.e. highly parallel network - so that's 100 threads in total

b) 50 Generate components feeding 1 Discard component - i.e. highly "funnelled" network - so that's 51 threads

I ran each one 6 times with a connection capacity of 10, and 6 times with a connection capacity of 100. Each run generated 50 * 20,000 = 1,000,000 information packets, and ran for about 1 minute.

Here are the averages of the 4 cases:

a) with connection capacity of 10 - 59.151 secs.
a) with connection capacity of 100 - 52.008 secs.
b) with connection capacity of 10 - 76.745 secs.
b) with connection capacity of 100 - 60.667 secs.
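The "funnelled" network (b) can be re-created roughly as follows. This is my own approximation using a bounded queue, not the actual JavaFBP components; the Generate and Discard roles, the POISON sentinel, and the class name are all stand-ins. Varying the capacity argument reproduces the 10-vs-100 comparison:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Rough re-creation of network (b): many generator threads funnelled into
// one discard thread through a single bounded channel.
public class FunnelBenchmark {
    static final Integer POISON = -1; // sentinel: one generator has finished

    public static long run(int generators, int packetsEach, int capacity)
            throws InterruptedException {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(capacity);
        long start = System.nanoTime();

        Thread[] gens = new Thread[generators];
        for (int i = 0; i < generators; i++) {
            gens[i] = new Thread(() -> {  // a "Generate" component
                try {
                    for (int p = 0; p < packetsEach; p++) {
                        channel.put(p);   // blocks only when channel is full
                    }
                    channel.put(POISON);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            gens[i].start();
        }

        int finished = 0;                 // the single "Discard" component
        while (finished < generators) {
            if (channel.take().equals(POISON)) {
                finished++;
            }
        }
        for (Thread t : gens) {
            t.join();
        }
        return (System.nanoTime() - start) / 1_000_000; // elapsed millis
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("capacity 10:  " + run(50, 20_000, 10) + " ms");
        System.out.println("capacity 100: " + run(50, 20_000, 100) + " ms");
    }
}
```

Absolute timings will differ from the figures above, of course; the point is only that the capacity effect should be visible with any bounded-channel implementation.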

So it looks like the connection capacity does make a difference! And it looks like JavaFBP performs reasonably well... I apologize for being a bit hasty - but maybe this made us all think a bit more deeply about multithreading on a multicore machine... ;-)

Apologies again, and thanks to everyone who contributed thoughts on this topic!

Paul Morrison

Sorry if this is totally bogus, but I am pretty sure that Java hasn't used green threads since Java 1.1. At least Wikipedia says so, too.

That limits you to using priorities - but in most cases I couldn't achieve observable performance improvements with those either.
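For reference, setting a priority is one line; my understanding is that it is only a hint, and on Windows the ten Java levels reportedly map onto fewer native priorities, which may be why changes are often invisible:

```java
// Thread priority is only a scheduling hint to the OS; the Java level is
// stored on the Thread object regardless of what the OS does with it.
public class PriorityDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            // busy work would go here
        });
        worker.setPriority(Thread.MAX_PRIORITY); // 10; default NORM_PRIORITY is 5
        worker.start();
        worker.join();
        System.out.println(worker.getPriority()); // prints 10
    }
}
```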

Marcel Jackwerth