Dec 21 '23

Route contention when running docker and a VPN

Things have improved since the original post. It may have been updating docker or a change to work’s VPN config, but boy was it frustrating before. Both the VPN and docker need to route traffic away from the outside internet and into their own system.

  • The VPN ideally wants to set a default route so everything goes to it, except for a few specific ranges. Exceptions include the IP to the VPN server that the virtual network runs over and maybe IPs on your local network.
  • Docker wants to route a range of addresses internally, so you can communicate with containerized apps.

Firstly, what are these routes?

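# What are the routes? (net-tools; prints the table format shown below)
route -n
# or the iproute2 equivalent, which prints one route per line
ip route show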

# on windows
route print

You’ll likely see something like this:

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.254   0.0.0.0         UG    100    0        0 enp5s0
192.168.1.0     0.0.0.0         255.255.255.0   U     100    0        0 enp5s0

These routes are defaults set up by the system, before a VPN or docker runs. The first is the default route, which says route traffic destined to anywhere (Destination 0.0.0.0, Genmask 0.0.0.0) to the router’s IP on my local network which is 192.168.1.254, found through the physical network device “enp5s0”.

The second says that communications destined for the local network 192.168.1.* (* due to the 0 in the Genmask/subnet mask, which matches the mask the router assigned this machine) go straight out of enp5s0 without passing through a gateway: a Gateway of 0.0.0.0 and no G flag mean the destination is directly reachable on that link. It takes precedence over the default route because it is the more specific match.

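Note that the Metric only decides between routes to the same destination and mask, where the lower number wins. A Metric of 100 leaves room for another route to the same destination to override these two with a lower number.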
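To make the selection rule concrete, here is a minimal Python sketch of how I understand the lookup: the most specific matching prefix wins, and the metric only breaks ties between routes of the same prefix length. The `Route` tuple and the `make_route` and `pick_route` helpers are names I made up for illustration, and the real kernel lookup has more to it (multiple tables, policy rules), so treat this as a model rather than the actual algorithm.

```python
import ipaddress
from typing import List, NamedTuple, Optional


class Route(NamedTuple):
    network: ipaddress.IPv4Network
    gateway: str  # "0.0.0.0" means on-link: no gateway, deliver directly
    metric: int
    iface: str


def make_route(destination: str, genmask: str, gateway: str,
               metric: int, iface: str) -> Route:
    # ipaddress accepts "address/netmask", so Destination + Genmask map directly.
    network = ipaddress.ip_network(f"{destination}/{genmask}")
    return Route(network, gateway, metric, iface)


def pick_route(routes: List[Route], address: str) -> Optional[Route]:
    ip = ipaddress.ip_address(address)
    matching = [r for r in routes if ip in r.network]
    if not matching:
        return None
    # Longest prefix first; the metric only breaks ties between equal prefixes.
    return min(matching, key=lambda r: (-r.network.prefixlen, r.metric))


# The two rows from the table above.
routes = [
    make_route("0.0.0.0", "0.0.0.0", "192.168.1.254", 100, "enp5s0"),
    make_route("192.168.1.0", "255.255.255.0", "0.0.0.0", 100, "enp5s0"),
]

lan = pick_route(routes, "192.168.1.42")  # matches both rows; the /24 is more specific
wan = pick_route(routes, "8.8.8.8")       # only the default route matches
print(lan.network, lan.gateway)  # 192.168.1.0/24 0.0.0.0
print(wan.network, wan.gateway)  # 0.0.0.0/0 192.168.1.254
```

In this model a second default route with a lower metric would win over the existing one, while a more specific route would win regardless of its metric.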

Docker

Docker allows you to “containerize” apps, running them in separate individual environments. This is really nice for repeatability: you can run the same container on a different system and expect it to still work. There are some security arguments too, but it’s no magic bullet.

The problem is that at some point you need to connect to the app running in the container. Typically this is done through networking. In an attempt to make the UX seamless, docker routes IP address ranges from the real network into its virtual bridge network(s) that connect the containers. Ideally the default would be a tiny range that is rarely used, or docker would even ask you. Instead, docker’s default is a giant range, 172.17.0.0/16 to 172.31.0.0/16, which is often already in use on a large VPN.

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
...
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0

Note the Metric of 0 is the highest priority.

After starting docker containers with their own networks, a few extra bridge adapters get created:

172.18.0.0      0.0.0.0         255.255.0.0     U     0      0        0 br-d931e01f7003
172.19.0.0      0.0.0.0         255.255.0.0     U     0      0        0 br-2a456e1bfa07

Removing routes by hand

A few times I’ve seen docker unable to start, saying there are no more routes available. Typically configuring docker clears this up, but that’s not easy to understand and get right.

I also used to see docker not removing temporary adapters when stopping the containers or the service. That is, you can break your system by starting a container and stopping it, and the docker daemon won’t fix anything. Clearing up the mess docker made was hard. Honestly, a reboot is just easier. Anyway, the manual way is to delete the routes and adapters it made. You could be a chump and delete routes manually, but there can be a lot.

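# E.g. with iproute2 (the prefix length and "dev" keyword are required)
sudo ip route del 172.19.0.0/16 dev br-5094d9589bea
# or the net-tools equivalent
sudo route del -net 172.19.0.0 netmask 255.255.0.0 dev br-5094d9589bea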
...
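If there are a lot of these, the commands can be generated rather than typed. Below is a rough Python sketch that reads text in the one-route-per-line format printed by `ip route show` and prints an `ip route del` command for each route on a docker-looking device. The function name `docker_route_deletions` is my own, the sample input is made up to match the routes in this post, and the "docker0 or br- prefix" test is an assumption that could also match bridges docker did not create. It only prints commands, so you can review them before running anything.

```python
import shlex
from typing import List


def docker_route_deletions(ip_route_output: str) -> List[str]:
    """Return one `ip route del` command per route on a docker bridge device."""
    commands = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        if "dev" not in fields:
            continue
        dev_index = fields.index("dev")
        if dev_index + 1 >= len(fields):
            continue  # malformed line: "dev" with no device name after it
        prefix = fields[0]
        device = fields[dev_index + 1]
        # Assumption: docker's bridges are named docker0 or br-<network id>.
        if device == "docker0" or device.startswith("br-"):
            commands.append(
                f"sudo ip route del {shlex.quote(prefix)} dev {shlex.quote(device)}"
            )
    return commands


# Made-up sample in the `ip route show` format, not captured from a real machine.
sample = """\
default via 192.168.1.254 dev enp5s0 proto dhcp metric 100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.19.0.0/16 dev br-5094d9589bea proto kernel scope link src 172.19.0.1 linkdown
192.168.1.0/24 dev enp5s0 proto kernel scope link src 192.168.1.20 metric 100
"""

for command in docker_route_deletions(sample):
    print(command)
```

On a real machine you would feed it the output of `ip route show` instead of the sample string.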

A faster shotgun approach is to delete all routes 💪. Be careful with this one as it really is all routes. You’ll need to disconnect and reconnect your loopback/LAN/WiFi/VPN connection(s) right after so that the regular default routes are re-created.

# Careful. This deletes everything! Don't run unless you're physically at the machine.
sudo ip route flush table 0

# Re-create default routes for loopback and other adapters (needs root)
sudo ip link set lo down
sudo ip link set lo up
# ... other adapters

Failing to restart lo results in weird “bind: Cannot assign requested address” issues from local port forwarding, or “Error: Failed to find a free local port for dynamic forwarding” from vscode.

VPNs

VPNs have a similar issue. They want to route all traffic through to their internal interface. After connecting, you will likely see this:

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         0.0.0.0         0.0.0.0         U     0      0        0 tun0
0.0.0.0         192.168.1.254   0.0.0.0         UG    100    0        0 enp5s0
...
192.168.1.0     0.0.0.0         255.255.255.0   U     100    0        0 enp5s0

This says route all traffic to tun0 with metric 0, skipping the default route for enp5s0 with metric 100. The VPN server is also allowed to push arbitrary additional routes to be added after connecting.

These can end up conflicting with docker’s routes or vice versa, depending on the order added — starting the VPN or docker service first. See Docker not working with a VPN due to network issues.
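A quick way to spot this kind of clash is to compare the ranges directly. The sketch below uses Python’s `ipaddress` module to list every docker bridge range that overlaps a VPN route. The docker ranges are the ones shown earlier in this post. The VPN routes are hypothetical, since I don’t have a real list of what a server pushes, and `find_conflicts` is a helper name I invented.

```python
import ipaddress
from typing import Dict, List, Tuple

# Bridge routes from the docker tables earlier in this post.
docker_routes = {
    "docker0": "172.17.0.0/16",
    "br-d931e01f7003": "172.18.0.0/16",
    "br-2a456e1bfa07": "172.19.0.0/16",
}

# Hypothetical routes pushed by a VPN server; substitute your own.
vpn_routes = ["10.0.0.0/8", "172.16.0.0/12"]


def find_conflicts(docker: Dict[str, str],
                   vpn: List[str]) -> List[Tuple[str, str, str]]:
    """Return (docker interface, docker range, VPN range) for each overlap."""
    conflicts = []
    for iface, cidr in docker.items():
        docker_net = ipaddress.ip_network(cidr)
        for vpn_cidr in vpn:
            if docker_net.overlaps(ipaddress.ip_network(vpn_cidr)):
                conflicts.append((iface, cidr, vpn_cidr))
    return conflicts


conflicts = find_conflicts(docker_routes, vpn_routes)
for iface, docker_cidr, vpn_cidr in conflicts:
    print(f"{iface}: {docker_cidr} overlaps VPN route {vpn_cidr}")
```

With these made-up VPN routes all three bridges collide with 172.16.0.0/12, which spans 172.16.* to 172.31.*. Since each docker /16 is more specific than the /12, I’d expect a VPN host at, say, 172.18.5.10 to be swallowed by the bridge instead of reaching the VPN.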

Tangentially, I’m yet to understand how traffic from the virtual network to 192.168.1.254 doesn’t just route through tun0 again — if you know please comment.

Solutions

The root of the problem seems to be an allocation issue. Docker needs to pick a range of IPs to use, but it can’t know ahead of time which might suddenly be used by a VPN or other local network. Similarly, the VPN doesn’t know that you may start the docker service after connecting. Really, it’s up to the user to configure what ranges to use.

Personally I think the meta cause of these problems is docker trying to be “automagic” about making networking work. There is some fundamental complexity in what it’s doing and this may just need to be taught.

Configure docker

Edit /etc/docker/daemon.json and add something like the following from Stack Overflow. I’m still pretty confused by what these actually do. I expect something here tells docker what default ranges to forward. Somewhere it needs a real IP to give its virtual interface to route IPs to. It also needs to assign virtual IPs within its internal network(s). Why does docker need so many bridge adapters? 🤷 Reading the docs might help.

{
   "bip": "192.168.1.5/24",
   "fixed-cidr": "192.168.1.5/25",
   "default-address-pools":[
     {"base":"192.168.2.5/24","size":28}
   ]
}
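As far as I can tell from the docs, “bip” sets the address and subnet of the docker0 bridge, “fixed-cidr” narrows the part of that subnet containers on docker0 get their IPs from, and “default-address-pools” lists the ranges that other docker networks are carved from, each one a subnet of prefix length “size”. Here is a small Python sketch that works out what the config above amounts to under that reading. The `strict=False` normalisation of values like 192.168.1.5/25 is my own assumption about how the host bits are treated, and the LAN range is the one from my routing table earlier.

```python
import ipaddress
import json

config = json.loads("""
{
   "bip": "192.168.1.5/24",
   "fixed-cidr": "192.168.1.5/25",
   "default-address-pools":[
     {"base":"192.168.2.5/24","size":28}
   ]
}
""")

# "bip": the bridge's own address plus the subnet it lives in.
bridge = ipaddress.ip_interface(config["bip"])
print("docker0 address:", bridge.ip, "subnet:", bridge.network)

# "fixed-cidr": the slice of that subnet used for container addresses.
# strict=False drops the host bits (assumption: docker does something similar).
container_range = ipaddress.ip_network(config["fixed-cidr"], strict=False)
print("container range:", container_range)

# "default-address-pools": each base is split into subnets of length "size".
pool = config["default-address-pools"][0]
base = ipaddress.ip_network(pool["base"], strict=False)
subnets = list(base.subnets(new_prefix=pool["size"]))
print(len(subnets), "networks of", subnets[0].num_addresses, "addresses, first:", subnets[0])

# Does the bridge subnet collide with the home LAN from the routing table?
lan = ipaddress.ip_network("192.168.1.0/24")
print("bridge overlaps LAN:", bridge.network.overlaps(lan))
```

Under those assumptions the pool yields 16 networks of 16 addresses each, and the bridge subnet is exactly the 192.168.1.0/24 range my home network already uses.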

[Edit] maybe use a “bip” of .2.5? I hit yet another conflict with my home network and the above recently. See what I mean, re. frustrating.

   "bip": "192.168.2.5/24",
   "fixed-cidr": "192.168.2.5/25",