Java fat client slow when connecting to localhost, fast with remote

Question

I'm having problems with a (usually) latency-bound desktop Java application connecting to a custom database server.

While it's working on a remote host (Windows XP) it's fast (a big form opens in under 2 seconds). When it's running on the same host as the database (using X11vnc and NX) it is very slow (the same form opens in around 20 seconds). The server is running SuSE Linux Enterprise Server 10.

What I checked:

iptables are clean (no rules in filter, raw, mangle or nat, all tables on ACCEPT)
routing is normal (just default route and local network)
brtables are not even installed
tc is clean
ping latency to localhost is around 0.007ms for both 64 byte and 1500 byte packets, latency to remote host is around 0.8ms
loopback throughput is around 500MiB/s (tested with netcat)
different java VMs (both 1.5 and 1.6)

While looking in atop there doesn't seem to be any bottlenecks:

CPU utilisation of the java and database processes is very low (java: ~20%, database: <5%); on remote access it is moderate (around 30% for the database). The server is a quad-core 2.66GHz, the client is a Core 2 Duo 2.33GHz, and the system is otherwise idle
there are hardly any disk reads/writes during the long query (in total about 5-10 reads)

The only thing that differs between the remote and local run is network utilisation: the local process pulls data at about 1200kbps, while the remote one does it at about 15Mbps.

I'm currently working on duplicating the problem with my hardware, so any tips on those lines are welcome too.

EDIT: Changing the lo interface MTU from the default 16k to 1500 fixes the issue. The issue has been duplicated on Debian lenny 64bit.

dude i bet its the x11 and vnc. try running it on the machine physically. — PHGamer, Oct 16 '10 at 03:10
Please try to do the following: export the DISPLAY to your machine (it can be any X server running, I use X from cygwin because it's easy and fast to install) and start the application. This is very basic but it can prove that the vnc is not the cause of this. (After installing X in cygwin you have to issue "xhost +" to allow remote connection to your X server). — Paul, Oct 16 '10 at 18:58
The VNC and NX are, with 99.99% certainty, not the cause of this. As I said, the problem was first encountered when running the application through our custom Java terminal solution (something à la Citrix XenApp, but integrated with JVM); to rule this out, we ran it through VNC and NX. Moreover the application doesn't draw *anything* on the screen before the form is ready, and the connection is *fast*. — Hubert Kario, Oct 16 '10 at 21:08

DerfK · Answer 1 · 2010-10-16T03:34:00.167

1

All I've got for you are more debugging ideas in no particular order...

is it really loopback that's the problem? Have you tried connecting to the network IP address of the server rather than 127.0.0.1? (this uses the loopback interface on my system, but it should rule out odd DNS issues)
is the system load high during this time (io issues can cause system load to be higher than all the individual processes together would appear to be)?
Check netstat's information for each process, if client's Recv-Q is high, the client process isn't reading from its socket on a regular basis.
Try watching tcpdump -i lo, it might help figure out whether there's an obvious pattern to the packets being transmitted.
Does "vmstat 1" show drastically different behavior for the remote access vs the local access versions (i.e. is holding the dataset in RAM in both the db server and the java client forcing you to swap)?
Try increasing the MTU on the loopback device, mine defaults to 16436. This won't help much if you do lots of little bitty packets. Little bitty packets seem to have problems of their own. I'm not a Java programmer, so I don't know how one would do it, but try setting TCP_NODELAY (setsockopt system call) on the connection. This one seems like it has a cargo cult following, but supposedly if the communication is one-way, the client will respond with TCP ACKs more regularly and keep the data flowing.
Another thing to try tweaking: echo 1 > /proc/sys/net/ipv4/tcp_low_latency
While playing with setsockopt, see what happens if you increase the sending and receiving buffers in the client?
You're not using some sort of ancient 2.2 kernel are you? There was apparently some sort of huge loopback bug fixed back in 2.3.x according to the 2.4 TODO
Maybe there's a bug in the client (or in Java), have you run the exact same client on a separate Linux system with the same java runtime?
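
For the TCP_NODELAY and buffer-size suggestions above, a minimal Java sketch might look like the following. It uses modern Java (7+) syntax rather than the 1.5/1.6 VMs from the question. The class name `SocketTuning`, the helper `tune`, and the 1 MiB buffer size are illustrative assumptions, and the throwaway loopback listener only stands in for the real database server.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SocketTuning {
    /** Applies the suggested options to a socket; call this before connecting. */
    static Socket tune(Socket socket, int bufferBytes) throws IOException {
        socket.setTcpNoDelay(true);               // TCP_NODELAY: disable Nagle's algorithm
        socket.setReceiveBufferSize(bufferBytes); // SO_RCVBUF (a hint to the OS)
        socket.setSendBufferSize(bufferBytes);    // SO_SNDBUF (a hint to the OS)
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the database server: a listener on an
        // ephemeral loopback port that never needs to accept() for this demo.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = tune(new Socket(), 1 << 20)) {
            client.connect(server.getLocalSocketAddress());
            // Print what the OS actually granted; it may differ from the request.
            System.out.println("TCP_NODELAY: " + client.getTcpNoDelay());
            System.out.println("SO_RCVBUF:   " + client.getReceiveBufferSize());
            System.out.println("SO_SNDBUF:   " + client.getSendBufferSize());
        }
    }
}
```

The buffer sizes are only hints, so it is worth printing the values the OS reports back. The receive buffer is set before `connect` because the `Socket` documentation requires that for windows larger than 64K.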

edited Oct 16 '10 at 03:34

answered Oct 16 '10 at 03:08

DerfK

1. checked also by IP, both lo and eth 2. system load is low during that time (<1) 3. will have to do that, thanks for suggestion 4. yes, I later thought about doing full tcpdump, loading to wireshark and comparing 5. same as 3. 6. unlikely, it's normal ethernet with 1500 MTU, no jumbo frames 7. same as 3 8. don't think it matters, will have to check 9. no, it's the official SLES10 kernel, quite recent 2.6 10. no, but this is the only system we experience it on (and we have many similar setups, as in terminal service and database on same Linux server) – Hubert Kario Oct 16 '10 at 16:18
I've looked at the app today, there is a little bit over 6000 in the *Send-Q* for the server process. Do I understand correctly, that this would indicate that the application doesn't read from the socket? – Hubert Kario Oct 18 '10 at 17:18
Send-Q is "The count of bytes not acknowledged by the remote host". If the client's Recv-Q is also 6000 then the client isn't reading the data. If the client's Recv-Q is 0, then the client has read the data but the TCP ACK packet has not been returned to the server for some reason. – DerfK Oct 19 '10 at 01:21
After a bit of debugging it became even weirder: If I set the MTU to 1500 on `lo`, the application is fast, but when it's left at the default 16k it's slow. Packet capture shows that there's a ~40ms delay before sending ACK for packets 4096 bytes in size (DB sends those 6k in two packets, 4k sized and 2k sized), but if the 4k is fragmented it is fast. The application isn't doing anything fancy, just a java Socket wrapped in a BufferedInputStream (I tried setting the internal buffer even to 1MiB, without any changes). – Hubert Kario Oct 19 '10 at 19:00
@Hubert This is one of those cases that I would make a startup script set the MTU to 1500 and call it a day. The behavior sounds kind of like broken path MTU discovery (decrease MTU fixes problem, except that broken MTU usually completely blocks the packets), but I can't imagine why it would happen on the loopback interface. Is the machine hosting any kind of VPN tunnel? Did some joker change the loopback interface's IP address to something other than 127.0.0.1? Perhaps whatever is mucking with the packet was configured in iproute2 and only visible with the `ip` or `tc` tools? – DerfK Oct 21 '10 at 01:36
No, the machine has only two interfaces: eth0 and lo. Lo has 127.0.0.1 IP and I checked the routes with `ip route ls`. I don't think there's a problem with MTU discovery, there's just much longer delay before sending ACK for 4096 byte packets (I tried setting the TCP_NODELAY in java app, no effect). Oh and I duplicated the issue in Debian Lenny 64bit. – Hubert Kario Oct 21 '10 at 11:45

score 1 · Answer 2 · answered Oct 16 '10 at 05:07

This will sound strange, but it worked a while ago for me.
Check that the /etc/hosts file has your hostname on the line with localhost, or add a line pointing it to 127.0.0.1 or to the address the server listens on.

answered Oct 16 '10 at 05:07

Paul

score 0 · Answer 3 · answered Oct 12 '10 at 10:17

To eliminate X11vnc and NX as possible causes of delay, I would write a console mode non-GUI Java test program that performs database lookups or test transactions and time that app running on a PC and on the server (e.g. using SSH/Putty to invoke it).
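
A console-mode timing test along these lines could be sketched as below. Since the real database protocol is custom and not shown in this thread, the sketch times round trips against a hypothetical in-process echo server on loopback. `RoundTripTimer`, `timeRoundTrip`, `measureAgainstEcho`, and the 6000-byte payload are all illustrative assumptions; a real test would replace the echo server and the dummy request with an actual query.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class RoundTripTimer {
    /** Sends the request, reads exactly replyLength bytes, returns elapsed nanoseconds. */
    static long timeRoundTrip(Socket socket, byte[] request, int replyLength) throws IOException {
        OutputStream out = socket.getOutputStream();
        DataInputStream in = new DataInputStream(socket.getInputStream());
        byte[] reply = new byte[replyLength];
        long start = System.nanoTime();
        out.write(request);
        out.flush();
        in.readFully(reply); // blocks until the whole reply has arrived
        return System.nanoTime() - start;
    }

    /** Times `rounds` round trips against a stand-in echo server on loopback. */
    static long[] measureAgainstEcho(int payloadBytes, int rounds) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread echo = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    InputStream in = peer.getInputStream();
                    OutputStream out = peer.getOutputStream();
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = in.read(buf)) > 0) {
                        out.write(buf, 0, n); // send back whatever was received
                    }
                } catch (IOException ignored) {
                    // the sketch simply stops echoing on any I/O error
                }
            });
            echo.setDaemon(true);
            echo.start();

            long[] nanos = new long[rounds];
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                byte[] request = new byte[payloadBytes];
                for (int i = 0; i < rounds; i++) {
                    nanos[i] = timeRoundTrip(client, request, payloadBytes);
                }
            }
            return nanos;
        }
    }

    public static void main(String[] args) throws IOException {
        long[] nanos = measureAgainstEcho(6000, 10);
        for (int i = 0; i < nanos.length; i++) {
            System.out.printf("round trip %d: %.3f ms%n", i, nanos[i] / 1e6);
        }
    }
}
```

Running the same program once on the server itself and once from a PC (invoked over SSH) would show whether the slowdown survives without any GUI or remote-display layer in the picture.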

A long shot: I'd also check reverse DNS resolution in case the JDBC drivers are using that (e.g. for logging), if DNS isn't properly configured, the software may conceivably hang up waiting for DNS resolution timeouts. How is the DBMS location configured in the Java app?
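
To check whether name resolution is slow on the server, a small Java sketch can time the forward and reverse lookups directly. The class name `DnsTiming` and the helper `elapsedMillis` are illustrative assumptions; the host to test is taken from the first command-line argument and defaults to `localhost`.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class DnsTiming {
    /** Milliseconds elapsed since the given System.nanoTime() reading. */
    static long elapsedMillis(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }

    public static void main(String[] args) throws UnknownHostException {
        String host = args.length > 0 ? args[0] : "localhost";

        // Forward lookup: name -> address
        long start = System.nanoTime();
        InetAddress address = InetAddress.getByName(host);
        System.out.println("forward lookup of " + host + " -> "
                + address.getHostAddress() + " took " + elapsedMillis(start) + " ms");

        // Rebuild the address from raw bytes so it carries no host name,
        // which forces getCanonicalHostName() to do a real reverse lookup.
        InetAddress bare = InetAddress.getByAddress(address.getAddress());
        start = System.nanoTime();
        String name = bare.getCanonicalHostName();
        System.out.println("reverse lookup -> " + name
                + " took " + elapsedMillis(start) + " ms");
    }
}
```

If either lookup takes seconds rather than milliseconds when run on the server, resolver configuration would be worth a closer look. If both are fast, DNS can probably be ruled out.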

answered Oct 12 '10 at 10:17

RedGrittyBrick

1. X11vnc and NX were used only after a custom terminal solution was found to be slow too. The application doesn't draw anything on the screen while it's fetching data and the connection to it is fast. 2. We're not using any kind of JDBC solution, it's a full custom database software, it keeps only a single connection through the whole session. – Hubert Kario Oct 12 '10 at 15:19
In that case I would try one of – RedGrittyBrick Oct 12 '10 at 15:27
... contacting support for the custom database software (or give fuller details here) or try to create a *minimal* test program that reproduces the problem. If the DBMS has a JDBC driver I would also try to write a test client that uses plain JDBC to see if the problem lies in the non-JDBC comms layer you are using. – RedGrittyBrick Oct 12 '10 at 15:33
Problem is, *I am* this program's support... It's hard to give fuller details here, it's a NoSQL (document-oriented paradigm kind) in-house solution with transaction, checkpoint and schema enforcement (just the type, no primary/foreign key) support. But yes, it looks like we will have to reduce the application to just the query engine and then look at performance, thing is: it's non-trivial to do. It really just looks like the round-trip time for the local packets is longer than for the remote packets (!?) and I have no idea what can cause that. – Hubert Kario Oct 15 '10 at 22:00

score 0 · Answer 4 · answered Oct 16 '10 at 07:35

Have you tried connecting to 127.0.0.1 instead of localhost? This does several things. Avoiding crazy DNS issues is one of them, but also many clients see "localhost" and then decide not to use the network at all and use some local socket instead. It could be this automatic technology switch that is killing your app. Several programs do this, including major ones like MySQL. Using the loopback address by IP forces them to actually use the network socket.

answered Oct 16 '10 at 07:35

Caleb

yes, tried connecting to `localhost`, `127.0.0.1` and local ip address, same result – Hubert Kario Oct 16 '10 at 16:04
Hum. How about using your client software on ANOTHER computer connecting back to that one? In other words is this specific to the database server on that machine or a general problem with using the client and server on the same machine? – Caleb Oct 16 '10 at 21:45
It's connected to MTU, when it's large, the application is slow, when MTU is small, the application is fast. – Hubert Kario Oct 19 '10 at 19:09

score 0 · Answer 5 · answered Oct 21 '10 at 05:39

My favorite tool for problems like these is strace. You may be able to simply strace the client and see it pause after doing something (a blocking call like connect, or read). If there is some kind of event loop that obscures the pause you can try to filter those syscalls out or you can turn on timing options to strace and save the entire output in the file. Another trick is to wait just until it "finishes" whatever was holding it up and hit ^C so you can look at what happened to break it out of its stupor.

score 0 · Accepted Answer · answered Nov 16 '10 at 09:59

It turned out to be a bug in the application. With a small MTU, TCP fragmented the packets and the bug was not exposed. With the high MTU on loopback, the bug manifested itself.

answered Nov 16 '10 at 09:59

Hubert Kario

Any chance to still elaborate on the actual bug about this? Heard about similar experience related to our implementation. – Jouni Aro Dec 21 '17 at 14:04
@JouniAro sorry, wasn't working on code of that application so I don't remember the details any more – Hubert Kario Jan 01 '18 at 13:41
OK, tx. In our case, the app seems to configure the chunk size to a fixed length and if that's smaller than MTU, slowness occurs. For loopback the MTU is often bigger so it manifests easier with that one. – Jouni Aro Jan 03 '18 at 14:15
@JouniAro in our case it was the exact opposite - with large MTU it was slow and fast with small MTU. AFAICT it looks like a different issue. – Hubert Kario Jan 06 '18 at 11:11
Sorry, I wrote a bit unclear: the same happens here. If MTU is large (larger than the chunk size) it is slow. – Jouni Aro Jan 08 '18 at 08:30
