This repository has been archived by the owner on Sep 26, 2021. It is now read-only.

"subnet sandbox join failed for "10.0.0.0/24": error creating vxlan interface: operation not supported" #2753

Closed
orbitalmedia opened this issue Jan 6, 2016 · 22 comments

Comments

@orbitalmedia

I've run into a problem running docker-compose on a swarm that was set up by following the "Get started with multi-host networking" article on docker.com, except that I used the generic driver:

docker-machine create --driver generic \
  --generic-ip-address $HOST1 \
  --generic-ssh-user root \
  --swarm \
  --swarm-discovery="consul://$(docker-machine ip consul):8500" \
  --engine-opt="cluster-store=consul://$(docker-machine ip consul):8500" \
  --engine-opt="cluster-advertise=eth0:2376" \
  node-b

Everything seems fine on each of the hosts:

Configuring swarm...
Checking connection to Docker...
Docker is up and running!

However, when I run this command (using only containers listed on Docker Hub), it fails:

docker-compose --x-networking --x-network-driver=overlay up -d

ERROR: Cannot start container 4f55c34c5687bc810aaafd58f22d0a60a118d353bc4209993881265e25d171a8: subnet sandbox join failed for "10.0.0.0/24": error creating vxlan interface: operation not supported

@dgageot
Member

dgageot commented Jan 6, 2016

Which kernel version is running on the host?
See moby/moby#14145.

@orbitalmedia
Author

4.1.5-x86_64-linode61

@matevarga

Same here, but I'm not using compose, just plain Docker with the overlay network driver. Kernel:
Linux vagrant-ubuntu-trusty-64 4.2.0-23-generic #28~14.04.1-Ubuntu SMP Thu Dec 31 13:40:42 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

After restarting the VM, things work.

@matevarga

Same here on Linode: it is a kernel issue. Even though Linode reports that the box runs a 4.x kernel, it only works once you install the distribution's official kernel (>= 3.16) AND set the Linode up to boot from GRUB 2.
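
For reference, the ">= 3.16" comparison mentioned above can be sketched in shell. This is only an illustration: the function name kernel_at_least is made up here, and it relies on `sort -V` from GNU coreutils. As the Linode case shows, passing the version check does not guarantee that the kernel was built with VXLAN support.

```shell
# Hypothetical helper: succeeds if kernel release $1 is at least version $2.
# Only the part before the first "-" is compared,
# e.g. "4.2.0" from "4.2.0-23-generic".
kernel_at_least() {
    release_num="${1%%-*}"
    lowest=$(printf '%s\n%s\n' "$2" "$release_num" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

if kernel_at_least "$(uname -r)" 3.16; then
    echo "kernel version is at least 3.16"
else
    echo "kernel version is older than 3.16"
fi
```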

@dgageot
Member

dgageot commented Feb 12, 2016

Closing. This is a kernel issue.

@dgageot closed this as completed Feb 12, 2016

@gabrielhao

Sorry guys, I have the same issue here, but I don't understand the reason for it. Could somebody explain in a bit more detail?

@matevarga

Your kernel is probably too old or otherwise unsupported.

@gabrielhao

Well, this is where I get confused. The kernel version is not old, but I still get this error.
The kernel version is 4.4.0-x86_64-linode63 and the OS is Ubuntu 14.04.
Here is the docker info output:

Containers: 6
 Running: 0
 Paused: 0
 Stopped: 6
Images: 37
Server Version: 1.10.2
Storage Driver: devicemapper
 Pool Name: docker-8:0-65539-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 107.4 GB
 Backing Filesystem: ext4
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 2.264 GB
 Data Space Total: 107.4 GB
 Data Space Available: 21.93 GB
 Metadata Space Used: 3.138 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.144 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.77 (2012-10-15)
Execution Driver: native-0.2
Logging Driver: json-file
Plugins: 
 Volume: local
 Network: overlay bridge null host
Kernel Version: 4.4.0-x86_64-linode63
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 991.7 MiB
Name: localhost
ID: S4LH:BUBO:ZCVX:QDGD:UWAI:GIFD:DL2I:KGP2:HDDL:WLQ3:WPO4:ZB34
WARNING: No swap limit support
Cluster store: consul://xxx.xxx.xxx:8500/network
Cluster advertise: xxx.xxx.xxx:2375

@matevarga

Yep, Linode's 4.4 kernel has this problem. Install a signed kernel (like lts-wily) from the Ubuntu repo, make sure that Linode actually boots that kernel (somewhere in the VM settings), then you're good to go.

apt-get install -y linux-signed-generic-lts-wily

Dashboard -> Edit profile -> Kernel -> Grub2
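
After the reboot, one quick way to confirm that the Linode actually booted the distribution kernel is to look at the release string. A minimal sketch, assuming Linode-supplied kernels carry a "-linode" suffix as in the releases quoted in this thread (4.1.5-x86_64-linode61, 4.4.0-x86_64-linode63); the function name is made up for illustration:

```shell
# Hypothetical helper: succeeds if release string $1 looks like a
# Linode-supplied kernel (assumption: those contain "-linode").
is_linode_kernel() {
    case "$1" in
        *-linode*) return 0 ;;
        *)         return 1 ;;
    esac
}

if is_linode_kernel "$(uname -r)"; then
    echo "still running a Linode kernel: $(uname -r)"
else
    echo "running a non-Linode kernel: $(uname -r)"
fi
```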

@gabrielhao

Thanks a lot. This really solved the problem. :)
But please allow me to ask: what is the reason behind this? Does Linode's 4.4 kernel differ from the signed kernel in some way that causes this problem?
I've read some issues on GitHub. Normally this is caused by an old kernel (< 3.16), so why does 4.4 still have it?

@matevarga

I don't know, unfortunately.

@PhilLogan

Apologies if it's a silly question, but I also have this problem. What entries are needed in /etc/apt/sources.list for the apt-get install to work? I haven't had any luck with the things I've tried.

@PhilLogan

Never mind, I found out about the trusty/trusty-updates repositories.

@WydD

WydD commented Jul 12, 2016

For those who experience this issue even with recent kernels:
We had the same problem with the kernels supplied by OVH (with all the right drivers embedded), which don't have CONFIG_VXLAN set.
So check the kernel config if you have access to it, and recompile the kernel while making sure that CONFIG_VXLAN and CONFIG_VETH are enabled, either built in or as a module.
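
A minimal sketch of that config check, assuming the kernel config is available as a plain-text file. The path /boot/config-$(uname -r) is common on Debian/Ubuntu-style systems but may be missing on provider-built kernels, and the function name is made up for illustration:

```shell
# Hypothetical helper: succeeds if option $1 is built in (=y) or built
# as a module (=m) in the plain-text kernel config file $2.
kernel_opt_enabled() {
    grep -Eq "^$1=(y|m)$" "$2"
}

cfg="/boot/config-$(uname -r)"   # assumption: config shipped under /boot
if [ -r "$cfg" ]; then
    for opt in CONFIG_VXLAN CONFIG_VETH; do
        if kernel_opt_enabled "$opt" "$cfg"; then
            echo "$opt: enabled"
        else
            echo "$opt: not enabled"
        fi
    done
else
    echo "no readable kernel config at $cfg"
fi
```

If the kernel only exposes its config as /proc/config.gz (which requires CONFIG_IKCONFIG_PROC), decompress it to a temporary file first and pass that path instead.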

@thehonestcto

Thanks @matevarga. Same here today on Linode with CentOS 7. I didn't have to install the kernel myself though, just rebuilt the machines and set them to GRUB 2 in Linode's panel before first starting them.

@tiangolo

tiangolo commented Nov 3, 2017

I've been struggling with Docker Swarm in Linode for about 2 days, so, here are my instructions on how to solve it for anyone else that arrives here.

Here are the instructions with screenshots, because I think it's quite easy to get lost in the procedure and I hope others can avoid all the struggle.

  • You start with a standard Linode. Click the "edit" link in the "Linode Profile":

[screenshot: selection_001]

  • In the settings there is a dropdown to select the kernel. By default it is set to a Linode kernel (that is what causes the problem with Docker Swarm):

[screenshot: selection_002]

  • Select the kernel GRUB 2:

[screenshot: selection_003]

  • Save the changes:

[screenshot: selection_004]

  • Your new profile will say (GRUB 2) at the end. You can now restart your Linode. There's no need to install anything, re-deploy, etc.:

[screenshot: selection_005]

  • After rebooting, it should work.
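
To confirm the result independently of Docker, one option is to ask the kernel directly for a VXLAN interface. The following is a sketch, not something from the steps above: the function name, interface name, and VNI are arbitrary placeholders, and it has to run as root. On a kernel without VXLAN support the `ip link add` call is expected to fail, much like the Docker error in this issue.

```shell
# Hypothetical helper: try to create, then delete, a throwaway VXLAN
# interface. Must run as root. The name "vxlan-probe" and VNI 4242 are
# arbitrary placeholders; 4789 is the IANA-assigned VXLAN UDP port.
vxlan_supported() {
    if ip link add vxlan-probe type vxlan id 4242 dstport 4789 2>/dev/null; then
        ip link delete vxlan-probe
        return 0
    fi
    return 1
}
```

Usage, as root: `vxlan_supported && echo "VXLAN works" || echo "VXLAN not available"`.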

@johnhidey

Thanks for the find, @tiangolo. Worked perfectly.

@easyguyme

Thanks @tiangolo, this worked perfectly for me.

@bard

bard commented Dec 15, 2017

In case anyone follows @tiangolo's instructions on an older Linode and is left staring at the GRUB prompt from Lish, see: https://www.linode.com/docs/tools-reference/custom-kernels-distros/run-a-distribution-supplied-kernel/#older-distributions

@zx1986

zx1986 commented Sep 6, 2018

I am facing the https://www.linode.com/docs/platform/manager/how-to-change-your-linodes-kernel/#no-upstream-kernel-installed problem :-(

Ubuntu Server 16.04 LTS

  1. sudo apt update
  2. apt list -a linux-image-generic
  3. sudo apt install linux-image-generic grub2
  4. ls /boot
  5. sudo vim /etc/default/grub
  6. sudo update-grub

/etc/default/grub:

GRUB_DEFAULT=0
GRUB_HIDDEN_TIMEOUT_QUIET=true
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
GRUB_CMDLINE_LINUX="console=ttyS0,19200n8 net.ifnames=0"

GRUB_TERMINAL=serial
GRUB_DISABLE_OS_PROBER=true
GRUB_SERIAL_COMMAND="serial --speed=19200 --unit=0 --word=8 --parity=no --stop=1"
GRUB_DISABLE_LINUX_UUID=true
GRUB_GFXPAYLOAD_LINUX=text
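
Before running update-grub it can help to print the values that are actually in effect in the file. A minimal sketch, assuming simple KEY=value lines like the ones above; the function name is made up for illustration and the parsing is deliberately naive (no shell expansion, no multi-line values):

```shell
# Hypothetical helper: print the last value assigned to key $1 in file $2,
# with one pair of surrounding double quotes removed.
grub_default_value() {
    sed -n "s/^$1=//p" "$2" | tail -n 1 | sed -e 's/^"//' -e 's/"$//'
}

cfg=/etc/default/grub
if [ -r "$cfg" ]; then
    for key in GRUB_TERMINAL GRUB_CMDLINE_LINUX GRUB_SERIAL_COMMAND; do
        printf '%s = %s\n' "$key" "$(grub_default_value "$key" "$cfg")"
    done
fi
```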

References:
https://askubuntu.com/questions/879888/how-do-i-update-kernel-to-the-latest-mainline-version
https://askubuntu.com/questions/119080/how-to-update-kernel-to-the-latest-mainline-version-without-any-distro-upgrade

@OutsourcedGuru

Just noting that 10.0.0.0/24 is an invalid subnet. The first valid subnet within the 10.0.0.0/8 (Class A) network, now sliced with a /24 subnet mask is... 10.0.1.0/24. You have to throw away the top/bottom on the network side just like you do for the top/bottom for the host side of that bitmask. For the same reason, 10.255.255.0/24 is also invalid.

For any given subnet mask there are 2^x - 2 subnets and 2^x - 2 hosts

...where x is the number of bits on that side of the mask. So for /24 that's 24 on the network side and 8 on the host side making 16777214 subnets and 254 hosts. Note the "- 2" part of that calculation on the network side of the bitmask. That means that you have to throw away (you can't issue) those since they mean something to the transport layer of tcp/ip, in this case.

This should make sense to anyone who already knows that you similarly can't bind any 10.x.y.0/24 and 10.x.y.255/24 addresses since they already mean something.
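
The two figures quoted above follow from the formula in the comment (2 to the power x, minus 2) and can be checked with shell arithmetic. This sketch only reproduces the numbers. It does not settle whether the all-zeros subnet is actually unusable, which classless (CIDR) addressing generally does not require. The function name is made up for illustration.

```shell
# Hypothetical helper: 2^bits - 2, the count given by the rule quoted above.
two_pow_minus_two() {
    echo $(( (1 << $1) - 2 ))
}

two_pow_minus_two 24   # 24 bits, as counted in the comment: 16777214
two_pow_minus_two 8    # 8 host bits of a /24: 254
```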

@trajano

trajano commented Jun 2, 2020

I've started getting this on 19.03 Docker Swarm
