65

We have a server on Amazon EC2 running SSH on the standard port (22).

I placed my public key in the <username>/.ssh/authorized_keys file.

The funny thing is that yesterday it was working great! But today, I don't know what happened: I just can't log in.

ssh -vvvv servername

is stuck on

debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY

I got someone to confirm that my public key is there.

I added a new public key from another computer (Windows 7 + PuTTY) and I was able to log in. This other computer is on the same LAN, which means that the external IP is the same.

My private key works for other servers but not with this one.

Jakuje
bakytn
  • I generated NEW keys and stored new pubkey..the same thing! ha! – bakytn Dec 08 '10 at 12:48
  • fyi, your problem has nothing to do with pubkey authentication: the DH key exchange (`SSH2_MSG_KEX_DH_GEX_REPLY`) happens much earlier in the connection. – user1686 Dec 08 '10 at 13:46
  • Thank you for the information. BTW guys, the problem has resolved itself. I didn't do anything, I just tried to log in and it worked. hah – bakytn Dec 08 '10 at 16:01
  • Take a look at: http://serverfault.com/questions/592059/debug1-expecting-ssh2-msg-kex-dh-gex-group/697350#697350 – dgaavl Jun 08 '15 at 10:49
  • Bad network latency? Many dropped packets? It's just a normal message. – Korjavin Ivan Sep 13 '11 at 06:23
  • Probably it is. I can't reproduce it in any way now, so it might have been on my side. – bakytn Sep 14 '11 at 04:47
  • IMO this question should not be closed. In any case, I think it was just a problem with my VPN. I simply reconnected my Ethernet, restarted my VPN, and it worked. – PJ Brunet Nov 06 '21 at 21:30
  • FWIW I flagged the question because it's closed and I think it should be opened. I doubt it's an Amazon issue, maybe remove that as a tag. – PJ Brunet Nov 06 '21 at 21:37

9 Answers

95

Lowering the network interface MTU solves it. This is a bug in Ubuntu 14.04.

This worked for me:

sudo ip li set mtu 1200 dev wlan0

OR

sudo ifconfig wlan0 mtu 1200
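
Rather than guessing a value, one way to look for an MTU that gets through is to probe the path with pings that are not allowed to fragment. This is only a minimal sketch, assuming Linux iputils `ping` (where `-M do` sets the don't-fragment flag) and IPv4. The helper names `icmp_payload_for_mtu` and `probe_mtu` are made up for illustration, and `servername` is a placeholder:

```shell
# ICMP payload size that yields an IPv4 packet of exactly the given MTU:
# 20 bytes of IPv4 header + 8 bytes of ICMP header = 28 bytes of overhead.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Succeeds if one unfragmented echo request of that total size gets a reply.
# Usage: probe_mtu HOST MTU
probe_mtu() {
    ping -c 1 -W 2 -M do -s "$(icmp_payload_for_mtu "$2")" "$1" >/dev/null 2>&1
}

# Try candidate sizes from largest to smallest (uncomment to run):
# for mtu in 1500 1400 1200; do
#     probe_mtu servername "$mtu" && { echo "MTU $mtu passes"; break; }
# done
```

A failed probe is not conclusive, since some networks drop ICMP entirely, but a size that passes is a reasonable starting point for the `mtu` value above.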

ssh fails to connect to VPN host - hangs at 'expecting SSH2_MSG_KEX_ECDH_REPLY'

shgnInc
  • `sudo ip li set mtu 1400 dev eno1` worked for me on Ubuntu 16.04. – Márcio Jul 20 '17 at 17:07
  • Thank you so much. I've been unable to SSH or remote desktop into one particular box for weeks. HTTP works and adjacent machines work fine. I've had to hop from other machines to get in. – duckbrain Sep 30 '19 at 17:20
  • If you get an error that "wlan0" does not exist, use `iwconfig` to find your interface name. This command also fixed the issue for me. – wormhit Mar 16 '20 at 16:09
  • For me it was `sudo ip li set mtu 1500 dev `. `1200` or `1400` didn't work. Besides, the failure was only observed under VPN connection. – schneiderfelipe Jul 31 '20 at 18:41
  • This trick also worked on 20.04 while connecting to CentOS 7. Can't believe the bug still exists – NeilWang Oct 01 '20 at 06:57
  • Awesome, I had the same issue to SSH a server from ***Ubuntu 20.04*** and Palo Alto's `globalprotect` through ***4G*** and a ***Palo Alto VPN***, and this solved it. Also, I confirm a MTU of `1200` for me: thank you very much! – xCovelus Dec 23 '20 at 09:04
  • In my case, 1500 mtu works with Suddenlink & TMobile, but with Visible(Verizon) I need 1200 as you recommended. How crazy. – PJ Brunet Jun 08 '21 at 04:08
  • This workaround helped me access a server. Let's see if this bug ever gets resolved, but I guess it's not a high priority issue. – icedwater Mar 23 '22 at 11:16
  • The value 1400 is not necessarily "one size fits all". In my case I took the MTU values from the PC of a colleague who was using different VPN clients and had working connections. For extra kudos: on Windows you can retrieve the MTU values with `netsh interface ipv4 show subinterfaces`. – Lohmar ASHAR Feb 21 '23 at 10:30
39

In my case, I don't have permission to lower the MTU, and manually specifying the cipher did not work.

I am able to connect after shortening the MACs list by specifying one, e.g.:

ssh -o MACs=hmac-sha2-256 <HOST>
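
To avoid typing the option on every connection, the same setting can live in a per-host stanza in the client config. A sketch, assuming the host alias `servername` from the question. It writes to a scratch file (the path is arbitrary) so nothing in `~/.ssh/config` is overwritten:

```shell
# Scratch config file; the stanza can be copied into ~/.ssh/config once it works.
cfg="${TMPDIR:-/tmp}/ssh_mac_workaround.conf"

cat > "$cfg" <<'EOF'
Host servername
    MACs hmac-sha2-256
EOF

# Connect using only this file (uncomment to run):
# ssh -F "$cfg" servername
```

Note that `-F` makes ssh read only the named file, so other settings from your usual config do not apply during the test.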
Lacek
  • I knew it's not gonna be the MTU. If someone messes with the MTU on the server side it can affect network throughput. The problem must be some version difference of OpenSSH and how they prefer certain cipher and MAC function combinations. – Csaba Toth Apr 15 '19 at 23:24
  • I had this problem while connecting through ZeroTier. This solved it for me without having to change the MTU settings. – Mark Tomlin Dec 26 '20 at 16:33
  • This is great. I was able to add MACs to the .ssh/config file for future connections and everything works nicely now. CsabaToth, there definitely is a difference in versions in my case (OpenSSH 8.6 vs 7.4). – Etienne Bruines Apr 26 '21 at 08:35
  • I have permission to do anything on my laptop but the advice with mtu didnt work for me, but this one here worked. Thanks! – MilMike Dec 08 '21 at 09:56
  • This, along with other SSH settings such as Ciphers and KexAlgorithms are great tools when supporting devices like AdTran TA devices that have varied (and often stupid) SSH support, and it may be out of your control to upgrade or change the device's SSH settings. – peelman Dec 28 '22 at 00:17
31

I had the exact same problem accessing a dedicated server at the online.net datacenter.

There is no problem after a reboot and no need to change the MTU: the SSH connection works for 1-3 weeks, then this exact bug appears, blocking at KEXINIT, and it is no longer possible to connect to the SSH server.

It could be some kind of sshd bug, but it must be triggered by something in the network that happens after 1-3 weeks. I reproduced this exact problem many times with many different servers on this network. Some say it could be related to a Cisco bug, possibly involving some DPI options.

The problem never happened with other servers I manage in other datacenters, which have the exact same distro, config and sshd version.

If you don't want to reboot every 10 days because the datacenter firewalls (or other network tweaks) are doing weird stuff:

First, connect with one of these client-side workarounds:

Workaround 1, lowering your local, client-side MTU:

ip li set mtu 1400 dev wlan0

(1400 should be enough, but you can try lower values if needed)

Workaround 2, specifying the cipher for the SSH connection:

ssh -c aes256-gcm@openssh.com host

(or try any other available cipher)

Both of these client-side workarounds worked for me: I could connect and save my uptime. But you want to fix this server-side, permanently, so you don't have to ask every client to tweak their MTU locally.

On Gentoo I just added:

mtu_eth0="1400"

in /etc/conf.d/net

(the same MTU option should be available somewhere in your preferred distro's network config file)

I set the MTU to 1400, but 1460 is probably enough in most cases.

Another workaround that may help is to use the following iptables rules to manage fragmentation:

# /sbin/iptables -I OUTPUT -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

# /sbin/ip6tables -I OUTPUT -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

(but I personally haven't needed this one so far)
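
The TCPMSS target can also pin a fixed value with `--set-mss` instead of deriving it from the path MTU. A sketch of the arithmetic, assuming no IP or TCP options (a 20-byte IPv4 or 40-byte IPv6 header plus a 20-byte TCP header). `mss_for_mtu` is a hypothetical helper, and the rules are only printed, not applied:

```shell
# Largest TCP payload per packet for a given MTU.
# Usage: mss_for_mtu MTU [4|6]   (IP version, default 4)
mss_for_mtu() {
    if [ "${2:-4}" = "6" ]; then
        echo $(( $1 - 60 ))   # 40 (IPv6) + 20 (TCP)
    else
        echo $(( $1 - 40 ))   # 20 (IPv4) + 20 (TCP)
    fi
}

# Print (not run) rules that pin the MSS for an MTU of 1400.
echo "/sbin/iptables -I OUTPUT -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss $(mss_for_mtu 1400 4)"
echo "/sbin/ip6tables -I OUTPUT -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss $(mss_for_mtu 1400 6)"
```

Printing the rules first makes it easy to review the computed values before running them as root.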

Also note that the symptom of this problem can also be:

debug1: SSH2_MSG_KEXINIT sent

not just

debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY

Edit, March 2016:

  • Lowering the MTU to 1400 on the server almost always works, but I recently had a case where the MTU was already lowered to 1400 on the server and the problem reappeared; the client also had to lower its MTU to 1400.

  • The problem also appeared on web login forms, which waited for the page to reload until reporting "the server has reset the connection". This was also fixed after the client set the MTU to 1400.

Related links:

https://bugs.launchpad.net/ubuntu/+source/openssh/+bug/1254085

http://www.held.org.il/blog/2011/05/the-myterious-case-of-broken-ssh-client-connection-reset-by-peer/

https://nowhere.dk/articles/natty-narwhal-problems-connecting-to-servers-behind-cisco-firewalls-using-ssh

https://stackoverflow.com/questions/2419412/ssh-connection-stop-at-debug1-ssh2-msg-kexinit-sent

http://www.1-script.com/forums/ssh/ssh-hang-after-ssh2-msg-kexinit-sent-10616-.htm

http://www.snailbook.com/faq/mtu-mismatch.auto.html

neofutur
  • This can happen especially when you have an unusually small MTU on the client end, e.g. when you want to use OpenVPN on a double-NAT network. – Dennis Nolte Dec 21 '15 at 18:12
  • I used default MTU values before having this problem; lowering the MTU was the solution, not the problem. Please explain your comment. – neofutur Dec 24 '15 at 01:55
10

I had the same problem after I upgraded my Ubuntu client machine. I solved it by shortening the "Ciphers" line in /etc/ssh/ssh_config. It also works if you specify the cipher on the command line (e.g. ssh -c <cipher> username@hostname).

Tip from here:

https://bugs.launchpad.net/ubuntu/+source/openssh/+bug/708493/comments/39

rui
8

Changing the KexAlgorithms option worked for me, and might be an option where you don't have the system rights to change MTU settings. This might also be one for the OpenSSH crew to address. For example:

ssh -o KexAlgorithms=ecdh-sha2-nistp521 fu@bar.com
SimonSC
  • You can add this configuration to your ~/.ssh/config file as described here: https://www.seei.biz/ssh-fails-to-connect-with-debug1-expecting-ssh2_msg_kex_ecdh_reply/ – rtribaldos Aug 18 '21 at 17:43
2

I started having this issue today, on Windows (ssh distributed with Git) and Ubuntu.

It appears to be a bug in OpenSSH; there is an issue on Launchpad.

On Windows it worked for me after forcing the 3des-cbc cipher, and on Ubuntu after forcing the key.

0

We solved it by commenting out the Ciphers line in /etc/ssh/ssh_config.

-2

It seems clear that the option negotiation causes the issue, because after I altered the order in which PuTTY negotiates the key exchange, the problem was solved.

rfg
-4

Correct me if I'm wrong:

  • Check the permissions on your ~/.ssh/authorized_keys file; they should be 600.

  • Check the logs in /var/log/secure, /var/log/messages or /var/log/auth.

chocripple
  • The `authorized_keys` permission has nothing to do with the error, since the client is stuck during the earlier protocol negotiation. Checking the server-side logs may help, but this line is rather a comment - downvote. – try-catch-finally Jan 25 '16 at 21:44