This looks like a strange problem to me. I set up a site-to-site VPN with OpenVPN (over UDP). Most servers and workstations can reach the other site, but a few servers cannot. My environment is as follows:
Site A
Subnet 192.168.11.0/24
Gateway to Internet: 192.168.11.1
OpenVPN server: 192.168.11.211 (LAN), 10.0.0.11 (tun)
Site B
Subnet 192.168.1.0/24
Gateway to Internet: 192.168.1.1
OpenVPN server: 192.168.1.211(LAN), 10.0.0.1(tun)
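To make the addressing explicit, here is a minimal Python sketch of the layout above. The names SITES, site_of and expected_next_hop are only illustrative (they are not OpenVPN settings), and it assumes that cross-site traffic should go through the local OpenVPN server's LAN address.

```python
import ipaddress

# Addressing copied from the description above; the dict layout and
# function names are illustrative assumptions, not OpenVPN settings.
SITES = {
    "A": {"subnet": ipaddress.ip_network("192.168.11.0/24"),
          "vpn_lan": ipaddress.ip_address("192.168.11.211")},
    "B": {"subnet": ipaddress.ip_network("192.168.1.0/24"),
          "vpn_lan": ipaddress.ip_address("192.168.1.211")},
}

def site_of(addr):
    """Return the site name whose subnet contains addr, or None."""
    ip = ipaddress.ip_address(addr)
    for name, info in SITES.items():
        if ip in info["subnet"]:
            return name
    return None

def expected_next_hop(src, dst):
    """LAN address of the VPN server src should use to reach dst.

    Returns None when both hosts are on the same site or when either
    address is outside the two known subnets.
    """
    s, d = site_of(src), site_of(dst)
    if s is None or d is None or s == d:
        return None
    return str(SITES[s]["vpn_lan"])
```

For example, expected_next_hop("192.168.1.61", "192.168.11.103") returns "192.168.1.211", which is the gateway a host in Site B would need in its route towards Site A.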
Most servers can reach the other site. For example, from my workstation 192.168.11.103 I can ping 192.168.1.60:
PING 192.168.1.60 (192.168.1.60): 56 data bytes
92 bytes from 192.168.11.1: Redirect Host(New addr: 192.168.11.211)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 27df 0 0000 40 01 c4d6 192.168.11.103 192.168.1.60
64 bytes from 192.168.1.60: icmp_seq=0 ttl=62 time=50.677 ms
64 bytes from 192.168.1.60: icmp_seq=1 ttl=62 time=46.558 ms
64 bytes from 192.168.1.60: icmp_seq=2 ttl=62 time=25.199 ms
However, I cannot reach 192.168.1.61:
PING 192.168.1.61 (192.168.1.61): 56 data bytes
92 bytes from 192.168.11.1: Redirect Host(New addr: 192.168.11.211)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 722c 0 0000 40 01 7a88 192.168.11.103 192.168.1.61
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
I did set a static route on 192.168.1.61:
192.168.11.0 255.255.255.0 192.168.1.211 1
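As a sanity check on that entry, here is a small Python sketch that parses a route line in the "network netmask gateway metric" layout shown above and tests whether a destination falls inside it. The function name route_covers is hypothetical, and the field order is assumed from the line as I pasted it.

```python
import ipaddress

def route_covers(route_line, dest):
    """Check whether dest is inside the network of a static route line.

    route_line is assumed to be 'network netmask gateway metric'.
    Returns (covered, gateway).
    """
    network, netmask, gateway, _metric = route_line.split()
    net = ipaddress.ip_network(f"{network}/{netmask}")
    return ipaddress.ip_address(dest) in net, gateway

covered, gateway = route_covers(
    "192.168.11.0 255.255.255.0 192.168.1.211 1", "192.168.11.103"
)
print(covered, gateway)
```

For my workstation this reports that the route covers 192.168.11.103 and points at 192.168.1.211, so the entry itself looks right on paper.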
and I can even ping back from 192.168.1.61
>ping 192.168.11.103
Pinging 192.168.11.103 with 32 bytes of data:
Reply from 192.168.11.103: bytes=32 time=114ms TTL=62
Reply from 192.168.11.103: bytes=32 time=102ms TTL=61
Reply from 192.168.11.103: bytes=32 time=102ms TTL=61
Reply from 192.168.11.103: bytes=32 time=23ms TTL=61
Note that this server runs a Chinese-language OS; its output is shown here translated into English.
I ran traceroute on both ends and got the following:
$ traceroute 192.168.1.60 //weird, you can see above that this server can be reached by ping.
traceroute to 192.168.1.60 (192.168.1.60), 64 hops max, 52 byte packets
1 192.168.11.1 (192.168.11.1) 1.931 ms 1.858 ms 2.591 ms
2 10.0.0.1 (10.0.0.1) 25.274 ms 23.538 ms 23.927 ms
3 * * *
4 * * *
5 * * *
6 * * *
//truncated.
For the server I cannot reach, traceroute gives the same result:
traceroute 192.168.1.61
traceroute to 192.168.1.61 (192.168.1.61), 64 hops max, 52 byte packets
1 192.168.11.1 (192.168.11.1) 3.193 ms 2.823 ms 1.988 ms
2 10.0.0.1 (10.0.0.1) 24.394 ms * 23.029 ms
3 * * *
4 * * *
5 * * *
6 * * *
//truncated.
I then tried another server that I can reach (64 bytes from 192.168.1.30: icmp_seq=13339 ttl=126 time=65.530 ms) and got a complete trace:
traceroute 192.168.1.30
traceroute to 192.168.1.30 (192.168.1.30), 64 hops max, 52 byte packets
1 192.168.11.1 (192.168.11.1) 6.746 ms 1.813 ms 2.417 ms
2 10.0.0.1 (10.0.0.1) 22.344 ms 22.521 ms 23.538 ms
3 192.168.1.30 (192.168.1.30) 22.876 ms * 102.062 ms
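The TTL values in the ping replies above may also be a clue, so here is a rough Python sketch for reading them. It assumes the replying host started from one of the common default TTLs (64, 128 or 255), which is only a heuristic, and the name estimate_hops is made up for illustration.

```python
# Common default initial TTLs; treating these as the only candidates
# is an assumption, not something the ping output guarantees.
COMMON_INITIAL_TTLS = (64, 128, 255)

def estimate_hops(observed_ttl):
    """Guess (initial_ttl, router_hops) from a TTL seen in a reply.

    Picks the smallest common default that is >= observed_ttl and
    treats the difference as the number of routers on the return path.
    """
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial, initial - observed_ttl
    raise ValueError("TTL out of range: %r" % observed_ttl)
```

Under that assumption, ttl=62 from 192.168.1.60 suggests an initial TTL of 64 and two router hops, while ttl=126 from 192.168.1.30 suggests an initial TTL of 128 and also two router hops. That would mean the two hosts use different TTL defaults but sit the same distance away.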
Because I can ping back from 192.168.1.61, I also ran tracert from there, and the result looks good to me:
>tracert 192.168.11.103
Tracing route to mymachine [192.168.11.103]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms 192.168.1.211
2 20 ms 21 ms 20 ms 10.0.0.11
3 23 ms 26 ms 24 ms mymachine [192.168.11.103]
Trace complete.
As mentioned above, servers and workstations can reach most servers on the other site; the problem affects only specific servers. In this case, it seems to be confined to 192.168.1.60 and 192.168.1.61. However, I cannot find any difference between these two servers and the others.
I compared the ifconfig output and the iptables rules on both OpenVPN servers; they are almost the same (apart from the directions being mirrored, of course).
I am looking for ideas on how to troubleshoot this problem.
Any advice would be appreciated.
Regards,
Kyle