I am doing experiments on my Proxmox server. The purpose of the experiments is to establish reliable, fault-tolerant communication between two PCs that control industrial equipment. But I am puzzled by the results of the experiments.
Experiment 1
The network layout is:
+-------------------------------------------+
| ens21 x | SRV1
| | | 172.16.1.1
| br0 |
| | |
| bond0.10 * - - - + - - - - - * bond0.20|
| | | |
| ens19 x...................x ens20 |
+-------------------------------------------+
| |
vlan10 | | vlan20
| |
+-------------------------------------------+
| eth3.10 x x eth4.20 | SW1
| |
| eth1.10 x x eth2.20 |
+-------------------------------------------+
| |
| |
other |
vlan10 bridges | vlan20
or |
switches |
| |
+-------------------------------------------+
| eth3.10 x x eth4.20 | SW2
| |
| eth1.10 x x eth2.20 |
+-------------------------------------------+
| |
vlan10 | | vlan20
| |
+-------------------------------------------+ SRV2
| ens19 x...................x ens20 | 172.16.1.2
| | | |
| bond0.10 * - - - + - - - - - * bond0.20|
| | |
| br0 |
| | |
| ens21 x |
+-------------------------------------------+
Note:
x: NIC
*: Bonding interface
....: Bonding connection
- or | separated by spaces: Bridging connection
- SRV1 is a Debian VM. It has three interfaces: ens19, ens20 and ens21; ens21 is reserved for other VMs. I bond ens19 and ens20 into bond0, which is in broadcast mode. br0, with an IP of 172.16.1.1, is a bridge over bond0.10, bond0.20 and ens21. SRV2 is similar to SRV1; the IP of its br0 is 172.16.1.2.
Here is my configuration of SRV1 (/etc/network/interfaces):
auto bond0
iface bond0 inet manual
    up ifconfig $IFACE promisc
    up ifconfig bond0 0.0.0.0 up
    bond-slaves ens19 ens20
    #bond-miimon 100
    bond-downdelay 200
    bond-updelay 200
    #arp_interval 100
    #arp_ip_target 172.16.1.2
    #bond-mode active-backup
    bond-mode broadcast
    #bond-mode balance-alb
    #pre-up echo 100 > /sys/class/net/bond0/bonding/arp_interval
    #pre-up echo +172.16.1.2 > /sys/class/net/bond0/bonding/arp_ip_target

auto bond0.10
iface bond0.10 inet manual
#iface bond0.10 inet static
#    address 192.168.100.11
#    netmask 255.255.255.0
#    vlan-raw-device bond0

auto bond0.20
iface bond0.20 inet manual
#iface bond0.20 inet static
#    address 192.168.200.12
#    netmask 255.255.255.0
#    vlan-raw-device bond0

auto ens21
iface ens21 inet manual
    up ifconfig $IFACE promisc

auto br0
iface br0 inet static
    #bridge_ports bond0 ens21
    bridge_ports bond0.10 bond0.20 ens21
    address 172.16.1.1
    broadcast 172.16.255.255
    netmask 16
    bridge_stp off
    bridge_fd 0
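Once the interfaces are up, the bond state can be sanity-checked from /proc/net/bonding/bond0 (and the bridge membership with `bridge link`). Below is a minimal parsing sketch assuming the usual output format of the kernel bonding driver; the sample text is a hypothetical excerpt, not captured from SRV1:

```shell
# Hypothetical excerpt of /proc/net/bonding/bond0; on a live SRV1 you would
# read the real file instead: sample=$(cat /proc/net/bonding/bond0)
sample='Bonding Mode: fault-tolerance (broadcast)
MII Status: up
Slave Interface: ens19
MII Status: up
Slave Interface: ens20
MII Status: up'

# Extract the mode and slave list to confirm broadcast mode with both NICs enslaved.
mode=$(printf '%s\n' "$sample" | awk -F': ' '/^Bonding Mode/ {print $2}')
slaves=$(printf '%s\n' "$sample" | awk -F': ' '/^Slave Interface/ {print $2}' | xargs)
echo "mode: $mode"
echo "slaves: $slaves"
```

If the mode line does not read broadcast, or a slave is missing, the storm behaviour discussed below would have a much more mundane explanation.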
- SW1 is an OpenWrt VM. SW1 has four ports (eth1~eth4). I create two bridges: br-lan10 is over eth1.10 and eth3.10, and br-lan20 is over eth2.20 and eth4.20. SW2 is similar to SW1.
/etc/config/network on SW1:
config interface 'eth1_10'
    option proto 'none'
    option ifname 'eth1.10'
    option auto '1'

config interface 'eth2_20'
    option proto 'none'
    option ifname 'eth2.20'
    option auto '1'

config interface 'eth3_10'
    option proto 'none'
    option ifname 'eth3.10'
    option auto '1'

config interface 'eth4_20'
    option proto 'none'
    option ifname 'eth4.20'
    option auto '1'

config interface 'lan10'
    option proto 'static'
    option type 'bridge'
    option ifname 'eth1.10 eth3.10'

config interface 'lan20'
    option type 'bridge'
    option proto 'none'
    option auto '1'
    option ifname 'eth2.20 eth4.20'
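For reference, the same lan10 bridge can also be created from the OpenWrt shell with uci instead of editing the file by hand. This is a sketch of the equivalent commands for the stanza above (it assumes the older ifname-based bridge syntax that this config already uses):

```shell
# uci equivalent of the 'lan10' stanza above (sketch, ifname-style syntax)
uci set network.lan10=interface
uci set network.lan10.proto='static'
uci set network.lan10.type='bridge'
uci set network.lan10.ifname='eth1.10 eth3.10'
uci commit network
/etc/init.d/network reload
```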
- Between SW1 and SW2, there may be other VMs acting as switches.
When I ping from SRV1 to SRV2, I get delays of about 40 ms and no duplicate packets:
root@SRV1:~# ping 172.16.1.2 -c 5
PING 172.16.1.2 (172.16.1.2) 56(84) bytes of data.
64 bytes from 172.16.1.2: icmp_seq=1 ttl=64 time=37.7 ms
64 bytes from 172.16.1.2: icmp_seq=2 ttl=64 time=44.0 ms
64 bytes from 172.16.1.2: icmp_seq=3 ttl=64 time=36.9 ms
64 bytes from 172.16.1.2: icmp_seq=4 ttl=64 time=46.1 ms
64 bytes from 172.16.1.2: icmp_seq=5 ttl=64 time=45.8 ms
--- 172.16.1.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 14ms
rtt min/avg/max/mdev = 36.864/42.085/46.071/3.986 ms
I also find that the CPU usage of the Proxmox host and of SRV1 is almost 98% and 86% respectively, and the monitored traffic volume rises rapidly from 4 KB to about 120 MB.
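That growth pattern looks like loop amplification: every broadcast frame that circles a forwarding loop gets re-flooded, so volume grows roughly geometrically. As a toy calculation (pure illustration, assuming one doubling per loop traversal, which is a simplification), very few iterations take 4 KB past the ~120 MB observed:

```shell
# Toy illustration, not a measurement: count the doublings needed to grow
# from 4 KB to the ~120 MB of traffic observed during the storm.
kb=4
n=0
while [ "$kb" -lt $((120 * 1024)) ]; do
  kb=$((kb * 2))
  n=$((n + 1))
done
echo "$n doublings: $kb KB (~$((kb / 1024)) MB)"
```

Fifteen doublings are enough, which is why the counters explode almost instantly once a loop exists.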
Experiment 2
I make the following changes:
- On SRV1 and SRV2, br0 is now over bond0 and ens21 instead of the VLAN subinterfaces.
/etc/network/interfaces on SRV1:
auto br0
iface br0 inet static
    bridge_ports bond0 ens21
    #bridge_ports bond0.10 bond0.20 ens21
    address 172.16.1.1
    broadcast 172.16.255.255
    netmask 16
    bridge_stp off
    bridge_fd 0
- On SW1, br-lan10 is over eth1 and eth3.10, and br-lan20 is over eth2 and eth4.20.
SW2 has similar configuration.
Here is the /etc/config/network on SW1:
config interface 'lan10'
    option proto 'static'
    option type 'bridge'
    option ifname 'eth1 eth3.10'

config interface 'lan20'
    option type 'bridge'
    option proto 'none'
    option auto '1'
    option ifname 'eth2 eth4.20'
This time, the whole system works fine: I get low latency, although each reply arrives in multiple copies:
root@SRV1:~# ping 172.16.1.2 -c 5
PING 172.16.1.2 (172.16.1.2) 56(84) bytes of data.
64 bytes from 172.16.1.2: icmp_seq=1 ttl=64 time=0.989 ms
64 bytes from 172.16.1.2: icmp_seq=1 ttl=64 time=1.00 ms (DUP!)
64 bytes from 172.16.1.2: icmp_seq=1 ttl=64 time=1.05 ms (DUP!)
64 bytes from 172.16.1.2: icmp_seq=1 ttl=64 time=1.06 ms (DUP!)
<other output omitted here>
64 bytes from 172.16.1.2: icmp_seq=5 ttl=64 time=0.825 ms
--- 172.16.1.2 ping statistics ---
5 packets transmitted, 5 received, +12 duplicates, 0% packet loss, time 10ms
rtt min/avg/max/mdev = 0.811/1.022/1.310/0.143 ms
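The summary line can be turned into an average number of received copies per echo request. Here is a small parsing sketch over the summary printed above (the figures are copied from that output, not re-measured):

```shell
# Parse the ping summary from Experiment 2 and compute the average number
# of copies of each reply that arrived (originals plus duplicates).
summary='5 packets transmitted, 5 received, +12 duplicates, 0% packet loss, time 10ms'
tx=$(printf '%s\n' "$summary" | awk '{print $1}')
dup=$(printf '%s\n' "$summary" | grep -o '+[0-9]*' | tr -d '+')
awk -v t="$tx" -v d="$dup" 'BEGIN { printf "copies per packet: %.1f\n", (t + d) / t }'
```

With 5 requests and 12 duplicates, each reply arrives on average 3.4 times, consistent with the broadcast bond sending every frame out of both slaves.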
Question
What I expected before doing these two experiments was:
Experiment 1: no broadcast storm would occur, because the interfaces and connections from ens19 of SRV1 to ens19 of SRV2 are all in VLAN 10, while those from ens20 of SRV1 to ens20 of SRV2 are all in VLAN 20.
Experiment 2: a broadcast storm would occur, because there is a loop (ens19@SRV1 -- ens19@SRV2 -- ens20@SRV2 -- ens20@SRV1 -- ens19@SRV1). But I got the opposite results.
Could anyone please tell me why the network has a broadcast storm in Experiment 1 but not in Experiment 2?
Thanks a lot!