5

I have a WG "server" on ubuntu 18.04.6 LTS, hosted in the oracle free tier.

I've installed wireguard using well-known https://github.com/angristan/wireguard-install script. Then I've generated several configs for my desktops, phones, etc. It connects and runs perfectly, but sometimes it just freezes for no reason. There's no connectivity issues or something like that. Logs on client side says something like that on Win desktop:

2022-06-21 03:01:01.845: [TUN] [win] Keypair 17 created for peer 1
2022-06-21 03:01:01.846: [TUN] [win] Sending keepalive packet to peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:03:01.822: [TUN] [win] Sending handshake initiation to peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:03:01.884: [TUN] [win] Receiving handshake response from peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:03:01.884: [TUN] [win] Keypair 16 destroyed for peer 1
2022-06-21 03:03:01.884: [TUN] [win] Keypair 18 created for peer 1
2022-06-21 03:03:01.884: [TUN] [win] Sending keepalive packet to peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:05:02.058: [TUN] [win] Sending handshake initiation to peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:05:02.106: [TUN] [win] Receiving handshake response from peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:05:02.106: [TUN] [win] Keypair 17 destroyed for peer 1
2022-06-21 03:05:02.106: [TUN] [win] Keypair 19 created for peer 1
2022-06-21 03:05:02.106: [TUN] [win] Sending keepalive packet to peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:06:21.302: [TUN] [win] Retrying handshake with peer 1 (SERVER_IP:SERVER_PORT) because we stopped hearing back after 15 seconds
2022-06-21 03:06:21.302: [TUN] [win] Sending handshake initiation to peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:06:26.423: [TUN] [win] Handshake for peer 1 (SERVER_IP:SERVER_PORT) did not complete after 5 seconds, retrying (try 2)
2022-06-21 03:06:26.423: [TUN] [win] Sending handshake initiation to peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:06:31.471: [TUN] [win] Handshake for peer 1 (SERVER_IP:SERVER_PORT) did not complete after 5 seconds, retrying (try 3)
2022-06-21 03:06:31.473: [TUN] [win] Sending handshake initiation to peer 1 (SERVER_IP:SERVER_PORT)
2022-06-21 03:06:36.517: [TUN] [win] Handshake for peer 1 (SERVER_IP:SERVER_PORT) did not complete after 5 seconds, retrying (try 4)

or on iphone:

2022-06-21 21:23:40.061830: [NET] peer(5RLe…eMBc) - Sending keepalive packet
2022-06-21 21:23:55.063406: [NET] peer(5RLe…eMBc) - Sending keepalive packet
2022-06-21 21:24:10.064855: [NET] peer(5RLe…eMBc) - Sending keepalive packet
2022-06-21 21:24:15.581989: [NET] Network change detected with satisfied route and interface order [en0, utun3, pdp_ip0]
2022-06-21 21:24:15.585825: [NET] DNS64: mapped SERVER_IP to itself.
2022-06-21 21:24:15.586117: [NET] peer(5RLe…eMBc) - UAPI: Updating endpoint
2022-06-21 21:24:15.587259: [NET] Routine: receive incoming v4 - stopped
2022-06-21 21:24:15.587273: [NET] Routine: receive incoming v6 - stopped
2022-06-21 21:24:15.587645: [NET] UDP bind has been updated
2022-06-21 21:24:15.587713: [NET] peer(5RLe…eMBc) - Sending keepalive packet
2022-06-21 21:24:15.588106: [NET] Routine: receive incoming v6 - started
2022-06-21 21:24:15.588220: [NET] Routine: receive incoming v4 - started
2022-06-21 21:24:25.367681: [NET] peer(5RLe…eMBc) - Sending handshake initiation
2022-06-21 21:24:29.810482: [NET] peer(5RLe…eMBc) - Retrying handshake because we stopped hearing back after 15 seconds
2022-06-21 21:24:30.442990: [NET] peer(5RLe…eMBc) - Handshake did not complete after 5 seconds, retrying (try 2)
2022-06-21 21:24:30.443269: [NET] peer(5RLe…eMBc) - Sending handshake initiation
2022-06-21 21:24:35.470291: [NET] peer(5RLe…eMBc) - Handshake did not complete after 5 seconds, retrying (try 2)
2022-06-21 21:24:35.470610: [NET] peer(5RLe…eMBc) - Sending handshake initiation
2022-06-21 21:24:40.744565: [NET] peer(5RLe…eMBc) - Handshake did not complete after 5 seconds, retrying (try 2)
2022-06-21 21:24:40.744847: [NET] peer(5RLe…eMBc) - Sending handshake initiation
2022-06-21 21:24:45.466608: [NET] peer(5RLe…eMBc) - Retrying handshake because we stopped hearing back after 15 seconds

If I reconnect WG client, it immediately connects and everything is ok.

Any advices? I tried to experiment with PersistentKeepAlive param (on both sides!) that doesn't change anything.

My server cfg:

[Interface]
Address = 10.66.66.1/24,fd42:42:42::1/64
ListenPort = SERVER_PORT
PrivateKey = M?????Uyg4r3mo=

PostUp = iptables -I FORWARD -i ens3 -o wg0 -j ACCEPT; iptables -I FORWARD -i wg0 -j ACCEPT; iptables -t nat -A POSTROUTING -o ens3 -j MASQUERADE; ip6tables -A FORWARD -i wg0 -j ACCEPT; ip6tables -t nat -A POSTROUTING -o ens3 -j MASQUERADE; sudo iptables -I INPUT -i ens3 -p udp --dport SERVER_PORT -m state --state NEW,ESTABLISHED -j ACCEPT
PostDown = iptables -D FORWARD -i ens3 -o wg0 -j ACCEPT; iptables -D FORWARD -i wg0 -j ACCEPT; iptables -t nat -D POSTROUTING -o ens3 -j MASQUERADE; ip6tables -D FORWARD -i wg0 -j ACCEPT; ip6tables -t nat -D POSTROUTING -o ens3 -j MASQUERADE; sudo iptables -D INPUT -i ens3 -p udp --dport SERVER_PORT -m state --state NEW,ESTABLISHED -j ACCEPT

### Client iphone
[Peer]
PublicKey = 0+V???????4HnM=
PresharedKey = s???????amJCxJyqcE=
AllowedIPs = 10.66.66.2/32,fd42:42:42::2/128

### Client mac
[Peer]
PublicKey = Tet4??????mI=
PresharedKey = Ld???r8=
AllowedIPs = 10.66.66.3/32,fd42:42:42::3/128

My client cfg

[Interface]
PrivateKey = 4Bp????=
Address = 10.66.66.2/32,fd42:42:42::2/128
DNS = 8.8.8.8,1.1.1.1

[Peer]
PublicKey = 5R?????c=
PresharedKey = sY????E=
Endpoint = SERVER_IP:SERVER_PORT
AllowedIPs = 0.0.0.0/0,::/0

some stats

root@oraclevpn:~# wg show all
interface: wg0
  public key: 5R?????c=
  private key: (hidden)
  listening port: SERVER_PORT

peer: 0+?????nM=
  preshared key: (hidden)
  endpoint: 666.666.666.666:11111
  allowed ips: 10.66.66.2/32, fd42:42:42::2/128
  latest handshake: 2 minutes, 2 seconds ago
  transfer: 533.52 MiB received, 5.18 GiB sent
Greg Askew
  • 35,880
  • 5
  • 54
  • 82
yegorov-p
  • 61
  • 1
  • 4

1 Answers1

0

If all your clients are having same issue I'd check a few things on server.

  1. Usual suspect, full system update and upgrade

  2. Time sync issues, and/or set correct time zone

    timedatectl

Should have exact same time (min/sec) as your clients.

  1. Install VM Tools sudo apt install open-vm-tools

  2. Same issue happens at 3am vs 3pm? Maybe Oracle free server is over provisioned and causing this issue.

  3. Run htop and see if any services are using up too many resources on your server. Can you distro upgrade (do-release-upgrade) to Ubuntu 20.04?

Let us know if this helped?

Czar
  • 111
  • 1
  • 1
    I don't think it's a server thing. It would affect all clients together, right? But I gave 5 clients on different platform, and only one or two of them can randomly stop working, other are totally fine. I've updated system and checked time sync - everything is ok – yegorov-p Jun 21 '22 at 18:58