6

Let's say some blocking I/O is done in Java, such as a long-running db query. Is there in general a way in Java for a database driver to tell the JVM scheduler that the call has left the JVM and is now being processed by some external system? The JVM could then assign the thread that served the db query to some other operation until the reply from the db has arrived. This way the blocking db query would effectively become non-blocking.

I just wonder whether this can be done on the JVM in general. I have been doing Java for many years now, but I admittedly don't know what the Java scheduler is doing in such a situation.

OlliP

3 Answers

7

> let's say some blocking I/O is done in Java such as a long running db query. Is there in general a way in Java that some Java database driver can tell the JVM scheduler that the call has left the JVM and is now being processed by some external system?

Uh, no. The whole point of threads is that if they block, a different thread can then be scheduled to take over the processor or other resources. You don't want the JVM somehow reusing the same thread, which holds all of the JDBC state, stack frame, variables, etc. You do want it using the same processor and other system resources for a different thread's tasks. Remember that a thread has relatively low overhead. On modern systems, you can start running into problems when you have thousands of them in a JVM, mostly because each one has a fixed stack space area allocated.

The way we optimize this as programmers is to use multiple threads, thread pools, database connection pools, etc. Then as queries block, other threads and queries can be working in parallel to maximize system throughput.
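As a rough illustration of the thread-pool approach, here is a minimal sketch. It assumes a hypothetical `fakeQuery` method that only sleeps in place of a real JDBC call; the class name and the pool size are likewise made up for the example. While one pool thread is blocked, the OS is free to run the others.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BlockingPoolSketch {

    // Hypothetical stand-in for a blocking JDBC call; it only sleeps.
    static String fakeQuery(int id, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return "result-" + id;
    }

    // Runs `queries` blocking calls on a pool of `poolSize` threads and
    // returns the results in submission order.
    static List<String> runAll(int queries, int poolSize, long millis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < queries; i++) {
                final int id = i;
                tasks.add(() -> fakeQuery(id, millis));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks the caller until every task has finished.
            for (Future<String> f : pool.invokeAll(tasks)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("query failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> results = runAll(4, 4, 200);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(results + " in about " + elapsedMs + " ms");
    }
}
```

With four threads and four 200 ms "queries", the elapsed time should be much closer to one query than to four, although the exact figure depends on the machine.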

Gray
  • 2
    Absolutely true, it'd be a horrible, horrible idea - although with some interesting possibilities. So anybody surprised that people actually implemented something quite similar? [Fibers](http://en.wikipedia.org/wiki/Fiber_(computer_science)). Only ever saw them in one of Raymond Chen's articles though. To the best of my knowledge nobody ever did anything like it for Java/the JVM though. – Voo Jan 06 '14 at 15:06
  • 1
    I'm not sure I was able to make the point understood. If a thread is blocked because of some db query, it is lost to the thread pool. So if your thread pool has 10 threads and you started 10 db queries and thereafter submitted a new task, that new task will not be handled until the first db query has returned. That's the point of having a database driver that can yield; see e.g. this question on Stack Overflow: http://stackoverflow.com/questions/17953269/go-routine-blocking-the-others-one – OlliP Jan 06 '14 at 15:12
  • @OliverPlow a blocking operation means the thread yields. – Peter Lawrey Jan 06 '14 at 15:16
  • 2
    @Oliver The main point is: No, Java doesn't do anything like that, and the only way to implement something in that vein would involve lots of additional complexity for not much gain. Asynchronous IO on a higher level (i.e. by the driver/framework) is the way to go here. – Voo Jan 06 '14 at 15:17
  • 2
    If you are worried about threads in your thread pool blocking @OliverPlow then you should use a thread pool that is dynamically sized. Pool size optimization is non-trivial, but you can't get away from a thread needing to hold on to its stack and other state. If you were somehow able to separate the thread from this state, you wouldn't have very much left in terms of thread overhead. – Gray Jan 06 '14 at 15:18
  • Well, dynamically sizeable thread pools allow you to have, say, 1000 concurrent db queries instead of only 10 or 100. But they reach their limit when you have 10,000 concurrent db queries. – OlliP Jan 06 '14 at 15:44
  • Indeed @OliverPlow. Chances are you are going to reach the end of your database server capabilities long before 10k concurrent queries however. – Gray Jan 06 '14 at 15:47
  • @Gray. This is the point of my initial question. If the thread that served the db query can be detached and used for other things until the result of the db query has arrived, then you have a chance to get out of this lock-in. It just struck me that since JDK5 (or maybe later?) Sun introduced Java NIO (New I/O). This is what Apache MINA must make use of to achieve non-blocking network I/O. I believe Java NIO is "hardwired" into the JVM, so there is still no way to access the JVM scheduler as in Go. – OlliP Jan 06 '14 at 15:53
  • Sure @OliverPlow. The issue is that the reason why threads take up so much space is mostly the stack -- and that, unfortunately, still needs to be stored somewhere so the thread can return to it. If you want to write user-level code using NIO or other mechanisms so that one stack context can handle multiple queries then fine, but the JVM does not have the luxury of doing this. – Gray Jan 06 '14 at 15:57
  • @Gray. Okay, that basically answers my initial question :-). That's a pity, but that seems to be the given situation ... – OlliP Jan 06 '14 at 16:01
2

Unless you have an embedded database, you will be waiting for network IO, which in turn could be waiting for disk IO. If you have an embedded database, you might still have to wait for disk IO.

> The JVM could then assign the thread that served the db query for some other operation until the reply from the db has arrived.

In most JVMs, the JVM has nothing to do with thread scheduling; instead it uses native threads and the OS does the real work. When you perform a blocking IO operation, the OS can schedule another thread to run on the same CPU.

> This way the blocking db query would effectively become non-blocking

Only an operation which returns without waiting is non-blocking. Scheduling a new thread doesn't make an operation non-blocking.
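To make the distinction concrete, here is a small sketch using `CompletableFuture` from the standard library. The `startQuery` method and the latch that stands in for a slow external system are assumptions made for the example, not part of any real driver. The call is non-blocking for the caller because it returns before a result exists; note that a worker thread still blocks behind the scenes, so this alone does not remove the blocking, it only moves it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;

class NonBlockingSketch {

    // Starts work on another thread and returns at once. The work waits on
    // `release`, which stands in for a slow external system (an assumption).
    static CompletableFuture<String> startQuery(CountDownLatch release) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                release.await(); // the worker thread blocks here, not the caller
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted", e);
            }
            return "rows";
        });
    }

    // Returns true if the caller got control back while the result was pending.
    static boolean callerWasNotBlocked() {
        CountDownLatch release = new CountDownLatch(1);
        CompletableFuture<String> pending = startQuery(release);
        boolean returnedEarly = !pending.isDone(); // no result can exist yet
        release.countDown();                       // let the "database" answer
        pending.join();                            // only now wait for the result
        return returnedEarly;
    }

    public static void main(String[] args) {
        System.out.println("caller not blocked: " + callerWasNotBlocked());
    }
}
```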

> I admittedly don't know what the Java scheduler is doing in such a situation.

Most likely because there is no such thing (in most JVMs). ;)

Peter Lawrey
0

First, I/O operations are issued from the JDBC driver, so if you want to use non-blocking I/O, you need an async driver like adbcj.

Second, a thread does not consume that much memory, so the savings are not worth the effort.

Alexei Kaigorodov
  • Well, if you have 5000 threads, your system resources may run out even on a very well-equipped server. For an application that can be reached through the Internet like eBay, Amazon & Co., even 5000 threads is not sufficient ... In any case the performance loss due to the context switches caused by some thousand threads is considerable. – OlliP Jan 06 '14 at 15:33
  • adbcj is an interesting library. AFAIKS it achieves non-blocking behavior through non-blocking network calls, making use of Apache MINA. Apparently, the JVM must provide some non-blocking network I/O that Apache MINA is built upon. – OlliP Jan 06 '14 at 15:36
  • Can your database handle 5000 simultaneous requests? More likely, throughput levels off at around 20-50 parallel requests, so there is no sense in issuing more requests at once. This can be done with a separate thread pool with 50 threads. – Alexei Kaigorodov Jan 06 '14 at 16:26
  • Might be, I don't know. DB queries were just used as a sample blocking I/O operation as they are known to be blocking. It is about getting around blocking calls in general. – OlliP Jan 06 '14 at 16:37
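
The "separate thread pool" suggestion from the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name is hypothetical, the blocking call is simulated with `Thread.sleep`, and the pool size is a parameter rather than the 50 mentioned above. A fixed pool caps how many blocking calls are in flight at once; further requests wait in the pool's queue instead of each occupying a thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedDbPoolSketch {

    // Submits `requests` simulated blocking calls to a pool of `poolSize`
    // threads and returns the highest number that were in flight at once.
    static int peakConcurrency(int requests, int poolSize, long millis) {
        ExecutorService dbPool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                futures.add(CompletableFuture.runAsync(() -> {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(millis); // stand-in for the blocking call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                    }
                }, dbPool));
            }
            // Wait until every submitted request has completed.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
            return peak.get();
        } finally {
            dbPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak in flight: " + peakConcurrency(200, 50, 20));
    }
}
```

The reported peak can never exceed the pool size, which is the property that keeps the database from being flooded.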