Xen networking is tricky

I did a lot of playing around with Xen recently, and getting networking configured to my liking was a chore. Here are two of the problems that I had, and some of what I did about them.

My goal was to “do NAT”, with one domU (call him “router”) acting as the gateway, and a bunch more domUs accessing the Internet via “router”. The instructions for this are simple on an ordinary Linux system. I’ve been playing with VMware ESX Server as well, and ESX only supports Ethernet-level virtual networking. For consistency, I decided to continue using Xen’s bridge-based networking (I’ve never had any luck getting NAT configured using the stock Xen NAT scripts anyway). I gave domU router two virtual NICs: one connected to xenbr0 (the default bridge), and another connected to a new bridge I created (call it “isolated”), which connects the other domUs to router.

To connect router’s first NIC to the outside world, I added the second physical NIC on my server to xenbr0 (the server has two; I let eth0 be dom0’s connection to the world, and eth1 be router’s connection).

ifconfig eth1 down
ifconfig eth1 hw ether fe:ff:ff:ff:ff:ff
ifconfig eth1 -arp
brctl addif xenbr0 eth1
ifconfig eth1 up

Create the new bridge:

brctl addbr isolated
ifconfig isolated up

I had to recompile the domU kernel to enable the necessary packet mangling, which I did following the instructions in The Perfect Xen 3.1.0 Setup for Debian Etch. The other noteworthy commands are the existing Xen makefile targets that configure, build, and install the guest kernel:

make linux-2.6-xenU-config
make linux-2.6-xenU-build
make linux-2.6-xenU-install
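
For reference, the NAT rules inside router might look roughly like the sketch below. This is not a copy of my setup: it assumes that inside router eth0 is the NIC on xenbr0 and eth1 is the NIC on “isolated”, and nat_rules is a made-up helper name. It only prints the commands, so you can read them over before piping them to sh as root.

```shell
#!/bin/sh
# Sketch only: print the commands that turn a Linux box into a NAT gateway.
# Assumed names: $1 = outside interface (on xenbr0), $2 = inside interface
# (on the "isolated" bridge). Nothing is executed by this function.
nat_rules() {
    wan="$1"
    lan="$2"
    cat <<EOF
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o $wan -j MASQUERADE
iptables -A FORWARD -i $lan -o $wan -j ACCEPT
iptables -A FORWARD -i $wan -o $lan -m state --state ESTABLISHED,RELATED -j ACCEPT
EOF
}

# Review the output, then apply with: nat_rules eth0 eth1 | sh
nat_rules eth0 eth1
```

The MASQUERADE target and the state match both depend on connection tracking support in the kernel, which is why the kernel options matter so much here.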

After all this, I expected things to work. I tried the obligatory pings, and nothing came back. Nothing, as in, not even a “destination host unreachable”. WTF. I fired up tcpdump in my domUs and started trying to track down the problem. Let’s call the players Guest A, Router, and Remote. The idea was for a ping to go A -> Router -> Remote -> Router -> A. There are four arrows there. It turned out that traffic was getting from A to Router, and Router was mangling the packets correctly, and sending them on to Remote. Remote was replying back to Router, but Router was not sending the packets back to A! I call this problem 3/4 of a network since three of the four necessary network links were working as expected. The problem turned out to be that I still hadn’t enabled sufficient domU kernel options to do NAT correctly. I didn’t keep track of exactly what I changed, but I believe some of the connection tracking options needed to be enabled.
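
Since I didn’t write down which options I changed, here is a hedged sketch of how one might check a kernel config file for the usual NAT suspects. The option names in the list are an assumption for 2.6.18-era kernels (newer kernels renamed several of them, for example to CONFIG_NF_CONNTRACK), and missing_nat_options is a made-up helper name.

```shell
#!/bin/sh
# Sketch: list NAT-related options that are neither built in (=y) nor
# modular (=m) in a kernel config file. The option names are assumed for
# 2.6.18-era kernels; adjust the list for your kernel version.
missing_nat_options() {
    config="$1"
    for opt in CONFIG_IP_NF_CONNTRACK CONFIG_IP_NF_IPTABLES \
               CONFIG_IP_NF_NAT CONFIG_IP_NF_TARGET_MASQUERADE; do
        grep -Eq "^${opt}=(y|m)\$" "$config" || echo "$opt"
    done
}

# Example (the path is a guess; point it at your domU kernel's .config):
# missing_nat_options /boot/config-"$(uname -r)"
```

If the function prints nothing, every option in the list is enabled; anything it does print is a candidate for the next kernel rebuild.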

I rebuilt my kernel and tried again. Now ping worked, but TCP and UDP did not. WTMF. This turns out to be a packet checksum issue with the way Xen’s virtual network devices behave. Basically, there is no point in computing checksums for virtual network segments, since the normal forms of corruption do not apply, so Xen skips them. A fine optimization, but it makes the TCP and UDP stacks very unhappy. The solution here is to run the command:

ethtool -K eth0 tx off

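This turns off transmit checksum offloading on eth0, so checksums are computed in software before packets are handed to the virtual device. To effect this automatically in Debian, you can manipulate your /etc/network/interfaces file thusly: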

iface eth0 inet static
post-up ethtool -K eth0 tx off
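
To check that the setting stuck (say, after a reboot), ethtool -k (lowercase k) prints the current offload settings. Below is a small sketch of such a check. It assumes the output contains a line of the form "tx-checksumming: off", and tx_offload_is_off is a made-up helper name.

```shell
#!/bin/sh
# Sketch: read `ethtool -k IFACE` output on stdin and succeed only if
# transmit checksum offload is reported as off. Assumes a line of the
# form "tx-checksumming: off" (possibly indented) in that output.
tx_offload_is_off() {
    grep -Eq '^[[:space:]]*tx-checksumming:[[:space:]]*off'
}

# Example: ethtool -k eth0 | tx_offload_is_off && echo "tx offload is off"
```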

The Xen Wiki has an entry about this problem.

It is also helpful to look at the Xen Networking page of the Xen Wiki. I deem the information on that page necessary, but not sufficient to configure networking in Xen to my satisfaction.

