
It seems highly implausible, even ridiculous, to me.

I log in by SSH to the CentOS server. I FTP to another server and begin a large file transfer out (uploading to the other server).

After some time (usually about 3 - 30 seconds), I get kicked off SSH.

I have never had a problem on any other server where I got kicked out of SSH because of FTP bandwidth.

root@my_server [/home/my_folder/public_html]# ftp xx.xxx.xx.xx
Connected to xx.xxx.xx.xx.
220 (vsFTPd 2.3.5)
530 Please login with USER and PASS.
530 Please login with USER and PASS.
KERBEROS_V4 rejected as an authentication type
Name (xx.xxx.xx.xx:root): buttle_butkus
331 Please specify the password.
Password:
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd speedtest
250 Directory successfully changed.
ftp> put 2gb_file.tar.gz
local: 2gb_file.tar.gz remote: 2gb_file.tar.gz
227 Entering Passive Mode (xx,xxx,xx,xx,xxx,xxx).

Connection closed by host

Disconnected

Server hosting company said that it's because "all of the bandwidth is being used by FTP". I have never heard of such a thing, but before I tell them they're crazy I wanted to ask the experts here.

They also suggested I change the UserBandwidth option in pure-ftpd.conf to 15. That's 15 kilobytes/s on a 100mbps connection. I guess I need to leave about 99.9mbps for SSH??

I did change it, but it didn't affect upload speed anyway (still very fast). Perhaps the setting only affects outside users logging into the server by FTP. I'm logged in by SSH, and FTP-ing out.

thanks.

EDIT: Adding part of tshark captured traffic (ip addresses partially obscured):

544 111.343171 xx.xx.195.90 -> xx.xx.230.39 SSH [TCP Retransmission] Encrypted response packet len=60
545 117.486826 xx.xx.195.90 -> xx.xx.230.39 SSH [TCP Retransmission] Encrypted response packet len=60
546 125.464960  xx.xx.94.90 -> xx.xx.195.90 TCP [TCP Previous segment lost] ssh > csd-monitor [SYN, ACK] Seq=498204672 Ack=2219931860 Win=8192 [TCP CHECKSUM INCORRECT] Len=0
547 129.774351 xx.xx.195.90 -> xx.xx.230.39 SSH [TCP Retransmission] Encrypted response packet len=60
548 154.348760 xx.xx.195.90 -> xx.xx.230.39 SSH [TCP Retransmission] Encrypted response packet len=60
549 203.497887 xx.xx.195.90 -> xx.xx.230.39 SSH [TCP Retransmission] Encrypted response packet len=60
550 279.243718 xx.xx.195.90 -> xx.xx.230.39 SSH Encrypted response packet len=952
Buttle Butkus
  • Dumb question on my part. Is there something like 'ServerAliveInterval' configured in your ssh_config file? This could be used like a "heartbeat" to keep the ssh session going –  Apr 03 '13 at 02:09
  • Does sound like bollocks to me. – Tom O'Connor Apr 03 '13 at 10:53
  • It's not bandwidth. Run `tshark -w capture.pcap port 21 or port 22 &` before you run the FTP, then log back in after the failure and run `tshark -r capture.pcap` to play back the actual traffic on those ports. Might give you some clue as to what's happening. – SmallClanger Apr 27 '13 at 01:07
  • Should I put a nohup in front of tshark? – Buttle Butkus Apr 27 '13 at 01:17
  • @SmallClanger I ran the command as you specified and it worked but not sure what to make of the output. Also, tshark kind of messes up the terminal, with a counter overwriting your commands in the terminal so you can't see what you just typed (although after pressing enter, everything runs fine). – Buttle Butkus Apr 27 '13 at 02:14
  • Add it in to your original post. If you can be selective just the area around where connection drops. Look for a packet marked FIN or RST on port 21. – SmallClanger Apr 27 '13 at 07:17
  • @SmallClanger I added it in. I didn't see any FIN or RST, so I just included the last few lines (550 lines total were captured). – Buttle Butkus Apr 28 '13 at 02:15
  • You can open more than one terminal, you know. – Michael Hampton Apr 28 '13 at 02:52
  • Or you could use the *screen* command, and learn how to enter and exit screens. – Jacob Apr 28 '13 at 03:35
  • @MichaelHampton Actually, not in this case. All terminal sessions get booted. I guess you were suggesting doing that to run the tshark command and have it keep running? I imagine nohup would work. – Buttle Butkus Apr 28 '13 at 04:08
  • @Jacob yes that works but I'm not really looking for a way around the problem (except perhaps temporarily) - the server is behaving weird and I think it might be important to fix that. – Buttle Butkus Apr 28 '13 at 04:10
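
For anyone repeating the capture suggested in the comments, here is a rough sketch of a variant that should survive the SSH drop and keep the packet counter off the terminal. The function names and the capture.pcap filename are illustrative only. The -Y display-filter flag is an assumption about the installed tshark version: newer releases use -Y, older ones used -R.

```shell
# start_capture: capture FTP control (port 21) and SSH (port 22) traffic
# in the background. nohup keeps tshark running after the SSH session
# dies; -q suppresses the running packet counter.
start_capture() {
    out=${1:-capture.pcap}
    nohup tshark -q -w "$out" -f 'port 21 or port 22' >/dev/null 2>&1 &
    echo "$!"   # PID of the capture, so it can be stopped later with kill
}

# find_teardown: replay a saved capture and show only packets with the
# FIN or RST flag set (assumes a tshark that accepts -Y).
find_teardown() {
    tshark -r "${1:-capture.pcap}" -Y 'tcp.flags.fin == 1 or tcp.flags.reset == 1'
}

# Typical use (run as root):
#   pid=$(start_capture capture.pcap)
#   ... run the FTP transfer, get disconnected, log back in ...
#   kill "$pid"
#   find_teardown capture.pcap
```

Passive-mode FTP data connections use other ports, so this filter only sees the FTP control channel and the SSH session.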

1 Answer


The solution appears to have been found, but I can't be sure. The server host's customer service is horrendous. But, for now, the problem is no longer occurring. I am able to transfer large files and not get booted from SSH terminal after 20-30s.

They changed the network duplex mode to FULL, from HALF.

Check out this wikipedia article on duplex mismatch: http://en.wikipedia.org/wiki/Duplex_mismatch

According to the article, duplex mismatch causes a huge traffic jam on the network. The server host customer service previously blamed bandwidth, saying that the SSH connection was being dropped because the files I was FTP-ing out were too large. They said that I should pay for more bandwidth to fix the problem. That obviously would not have helped. No matter how large the bandwidth, duplex mismatch would have overwhelmed it. I'm still not entirely clear on why that would kill the terminal session, but probably the traffic jam due to the FTP transfer holds up SSH packets that keep the session alive. But it usually drops within 20-30 seconds, which seems a bit too soon for that.

I suggested to customer service that they screwed up some settings when they installed a 100 mbps port in December (previously it was 10 mbps). I'm not sure if the problem was occurring before that, since we rarely need to transfer files out of there by FTP. Here is their response:

As per your request we had upgraded this server port manually to 100 mbps on 29th Dec 2012. It was correctly installed. The network mods can be changed at any time using the command "mii-tool". Did you made any changes with your networking setting or did you restarted server forcefully? I see that server had restarted 9 days before. Was that a safe restart attempt? may be that recent restart attempt changed the setting.

Do you have any other server facing this same issue? then just run the following command and make sure your server is running Full duplex mode

***** ethtool eth0 | grep -i duplex
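
The check the host suggests can be wrapped in a small helper that warns when a link has come up at half duplex. This is only a sketch: duplex_of and check_duplex are made-up names, eth0 is simply the interface from the host's reply, and the parsing assumes the usual "Duplex: Full" or "Duplex: Half" line in ethtool output.

```shell
# duplex_of: read `ethtool <iface>` output on stdin and print the value
# of its "Duplex:" line (assumed format: "<tab>Duplex: Full").
duplex_of() {
    awk -F': *' 'tolower($1) ~ /duplex/ { print $2; exit }'
}

# check_duplex: report the duplex mode of an interface (default eth0,
# taken from the host's reply; adjust for your system).
check_duplex() {
    iface=${1:-eth0}
    duplex=$(ethtool "$iface" 2>/dev/null | duplex_of)
    case $duplex in
        Full) echo "$iface: full duplex" ;;
        Half) echo "$iface: HALF duplex, possible mismatch with the switch port" ;;
        *)    echo "$iface: duplex unknown (is ethtool installed, and are you root?)" ;;
    esac
}

# Typical use (run as root):
#   check_duplex eth0
```

A mismatch means one end of the link runs full duplex while the other runs half, so a clean result on the server alone does not rule it out; the switch port setting has to match too.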


Buttle Butkus