
Sorry if this question has already been asked; I could not find it. I have this setup:

+---------------------------------------------------------------------------------------------+
|HOST                                                                                         |
|                                                                                             |
| +-------------------------------------------------+                                         |
| | UBUNTU-VM                                       |                                         |
| |                                                 |                                         |
| | +-------------------+                           |                                         |
| | |UBUNTU-LXC         |                           |                   +------------------+  |
| | |       10.0.0.3/24 |  10.0.0.1/24              |                   |OTHER VM          |  |
| | |               eth0-----lxcbr0----------eth0-----------br0----------eth0              |  |
| | |                   |           192.168.100.2/24|  192.168.100.1/24 |192.168.100.3/24  |  |
| | +-------------------+                           |                   +------------------+  |
| +-------------------------------------------------+                                         |
+---------------------------------------------------------------------------------------------+

When I ping 192.168.100.3 from my UBUNTU-LXC, the source IP address is automatically changed to 192.168.100.2 by UBUNTU-VM. It behaves like a NAT, whereas I really want my UBUNTU-LXC to talk using its own IP address. Is there any way to do this?

Edit: this information may be relevant:

  • I am using KVM + libvirt to set up my VMs
  • Here is how I create my interface in UBUNTU-VM:

<interface type='bridge'>                                                    
  <mac address='52:54:00:cb:aa:74'/>                                         
  <source bridge='br0'/>                                                     
  <model type='e1000'/>                                                      
 <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</interface>                                                                 
little-dude

2 Answers


Change the definition of the libvirt network so that it does not do NAT.

From the VM host, run virsh net-list.

Then run virsh net-edit on the network the VM is attached to and remove the NAT forwarding.

Matthew Ife
  • I have only `default` network. I removed the ``, killed my VMs, exited virsh, and restarted everything but that did not help. – little-dude Jun 02 '14 at 21:31

Finally solved it. Here is how. Sorry for this very long post, but I spent a lot of time on this and I think some people might be interested in a detailed solution.

The setup

+---------------------------------------------------------------------------------------------+
|HOST                                                                                         |
|                                                                                             |
| +-------------------------------------------------+                                         |
| | UBUNTU-VM                                       |                                         |
| |                                                 |                                         |
| | +-------------------+                           |                                         |
| | |UBUNTU-LXC         |                           |                   +------------------+  |
| | |       10.0.0.3/24 |  10.0.0.1/24              |                   |OTHER VM          |  |
| | |               eth0-----lxcbr0----------eth0-----------br0----------eth0              |  |
| | |                   |           192.168.100.2/24|  192.168.100.1/24 |192.168.100.3/24  |  |
| | +-------------------+                           |                   +------------------+  |
| +-------------------------------------------------+                                         |
+---------------------------------------------------------------------------------------------+

1. Removing the NAT on UBUNTU-VM

My packets leave UBUNTU-VM with source address 192.168.100.2 because of the default iptables rule that is created when I start my container:

root@UBUNTU-VM# iptables -nL -t nat     
Chain PREROUTING (policy ACCEPT)                    
target     prot opt source               destination

Chain INPUT (policy ACCEPT)                         
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)                        
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)                   
target     prot opt source               destination
MASQUERADE  all  --  10.0.3.0/24         !10.0.3.0/24 

This rule basically says "if the packet comes from subnet 10.0.3.0/24 and its destination is in another subnet, rewrite the source IP to the address of the outgoing interface". So if I delete this rule, I should be able to ping the outside using my container IP address. Let's remove this rule:

root@UBUNTU-VM# iptables -D POSTROUTING 1 -t nat
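Deleting by position (rule 1) only works while the MASQUERADE rule is the first one in the chain. A minimal sketch of deleting it by its specification instead, assuming the rule has exactly the source and destination match shown in the listing above and no other options. The `run` helper and the `APPLY` and `SUBNET` variables are hypothetical conveniences for this example: by default the command is only printed.

```shell
#!/bin/sh
# Subnet taken from the iptables listing above; adjust if lxcbr0 uses another range.
SUBNET="${SUBNET:-10.0.3.0/24}"

# Hypothetical helper: print the command by default,
# execute it only when APPLY=1 (needs root on UBUNTU-VM).
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Delete the masquerade rule by matching its specification
# instead of relying on its position in the POSTROUTING chain.
remove_masquerade() {
    run iptables -t nat -D POSTROUTING -s "$SUBNET" ! -d "$SUBNET" -j MASQUERADE
}

remove_masquerade
```

Running it as-is prints the iptables command; running it with APPLY=1 as root applies it.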

Now, if I ping 192.168.100.1 from my LXC container (10.0.3.233), here is what happens:

root@HOST# tcpdump -i br0 -n
12:51:56.174009 IP 10.0.3.233 > 192.168.100.1: ICMP echo request, id 498, seq 1, length 64
12:51:56.174072 ARP, Request who-has 10.0.3.233 tell 192.168.100.1, length 28

The ICMP requests now come from my LXC IP address :) However, the host (br0) seems to be unable to answer.

2. Adding a static route on the HOST

root@HOST# ip route add 10.0.0.0/8 via 192.168.100.2

Now the next hop for the 10.0.0.0/8 range is eth0 on UBUNTU-VM (192.168.100.2). Let's try a ping:

root@HOST# tcpdump -i br0 -n
14:14:33.885982 IP 10.0.3.233 > 192.168.100.1: ICMP echo request, id 660, seq 14, length 64
14:14:34.884054 ARP, Request who-has 10.0.3.233 tell 192.168.100.1, length 28

It still does not work. Unfortunately, I have no explanation for this. Worse, why is br0 making an ARP request for an IP that is not even in its subnet? At the very least, I would expect the ICMP request to be silently ignored, but answering with an ARP request is just weird.
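For reference, a route is needed on the host in the first place because the container address is outside br0's subnet. A small sketch in plain shell arithmetic that checks this, using the addresses from the captures above. The function names `ip_to_int` and `in_cidr` are made up for this example; it illustrates the subnet check only and does not explain the ARP behaviour.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP NETWORK/PREFIX: succeeds if IP belongs to the network.
in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    prefix=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
    [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
}

# The container is not on-link for br0 (192.168.100.0/24)...
in_cidr 10.0.3.233 192.168.100.0/24 && echo "on-link via br0" || echo "needs a route"
# ...but it is covered by the 10.0.0.0/8 route added in step 2.
in_cidr 10.0.3.233 10.0.0.0/8 && echo "covered by 10.0.0.0/8 via 192.168.100.2"
```

On a real host, `ip route get 10.0.3.233` shows which route the kernel actually selects.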

3. Configuring libvirt

3.1. Current config

br0 is a bridge I configured on the host manually, using netctl. In my UBUNTU-VM template I have this :

<interface type='bridge'>                                                    
   <mac address='52:54:00:cb:aa:74'/>                                         
   <source bridge='br0'/>                                                     
   <model type='e1000'/>                                                      
   <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</interface>     

When UBUNTU-VM is started, libvirt creates a tap device (vnetN) for the VM's interface and attaches it to the bridge.

root@HOST# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.fe0000000001       no              vnet1
                                                        vnet2

For some reason, this does not work (edits/comments would be appreciated).

The solution was to configure a routed network instead of just a bridged network.

3.2. Define a network

Create an XML template for your network:

<network>                                                       
  <name>vms</name>                                              
  <uuid>f3e18be1-41fe-4f34-87b4-f279f4a02254</uuid>             
  <forward mode='route'/>                                       
  <bridge name='br0' stp='on' delay='0'/>                       
  <mac address='52:54:00:86:f3:04'/>                            
  <ip address='192.168.100.1' netmask='255.255.255.0'>          
  </ip>                                                         
  <route address='10.0.0.0' prefix='8' gateway='192.168.100.2'/>
</network>                                                      

Note the static route stanza (the route element). Then define the network and start it:

virsh # net-define vms.xml
virsh # net-start vms
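The same steps can also be run non-interactively from the host shell. A minimal sketch, assuming the template above is saved as vms.xml in the current directory. Note that libvirt networks are defined with net-define (plain define is for domains). The `run` helper, the `APPLY` variable and the `setup_network` function are hypothetical conveniences; by default the commands are only printed.

```shell
#!/bin/sh
# Hypothetical helper: print the command by default,
# execute it only when APPLY=1 (needs libvirt privileges on the host).
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Define, start, and autostart a libvirt network, then list all networks.
setup_network() {
    xml_file=$1
    net_name=$2
    run virsh net-define "$xml_file" &&
    run virsh net-start "$net_name" &&
    run virsh net-autostart "$net_name" &&
    run virsh net-list --all
}

setup_network vms.xml vms
```

net-autostart is not part of the original steps; it only makes the network come up again after the host reboots.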

3.3. Edit the VM

The interface should now look like this :

<interface type='network'>                                                   
  <mac address='52:54:00:cb:aa:74'/>                                         
  <source network='vms'/>                                                    
  <model type='e1000'/>                                                      
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</interface>                                                                 

Final test

After restarting the VM and the container, I can finally ping br0 using the LXC container IP:

root@HOST# tcpdump -i br0 -n
14:24:00.349856 IP 10.0.3.233 > 192.168.100.1: ICMP echo request, id 468, seq 16, length 64
14:24:00.349900 IP 192.168.100.1 > 10.0.3.233: ICMP echo reply, id 468, seq 16, length 64

Remaining questions

  • Why the ARP requests in step 2?
  • Why does my setup not work unless I let libvirt handle the bridge and the routing itself? My manual config (creating the bridge with netctl, and adding the static route with ip route add) is very similar to what libvirt does: a bridge, with two vnet interfaces attached to it, and a static route... Is libvirt doing some black magic here?
  • Will I be able to scale the number of containers with this setup? (That is my final goal.)

Sources that helped

  • libvirt networking doc
  • I will edit and add the other links when I have enough reputation (it requires 10...)
little-dude
  • Other links: [Routed subnets without NAT for libvirt managed virtual machines in Fedora](https://www.berrange.com/posts/2009/12/13/routed-subnets-without-nat-for-libvirt-managed-virtual-machines-in-fedora/) --- [linux networking and containers](https://blog.flameeyes.eu/2010/09/linux-containers-and-networking) --- [giving-dockerlxc-containers-a-routable-ip-address](http://blog.codeaholics.org/2013/giving-dockerlxc-containers-a-routable-ip-address/): the last one is actually an alternative solution that uses macvtaps; it also works, but it adds more complexity. – little-dude Jun 03 '14 at 22:14