kubeadm fails with no ipv4 address on the network interface with defaultroute #1156

Closed
scheuk opened this issue Oct 3, 2018 · 44 comments
Labels: area/ecosystem, kind/bug, priority/important-longterm, sig/network

@scheuk

scheuk commented Oct 3, 2018

Is this a BUG REPORT or FEATURE REQUEST?

Choose one: BUG REPORT

Versions

kubeadm version (use kubeadm version):
kubeadm version: &version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:14:39Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:17:28Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

  • Cloud provider or hardware configuration: Hardware

  • OS (e.g. from /etc/os-release):

VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
  • Kernel (e.g. uname -a):
    Linux node1-lab-a1-01 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • Others:
    Our hosts are set up using IPv4 BGP over IPv6 (RFC 5549: https://tools.ietf.org/html/rfc5549).
    The host IP address is attached to the loopback interface, and FRR BGP announces that IP to the connected TOR switches (spine-and-leaf fabric). There is no IPv4 address on the connected interfaces, but I do have a default route that allows access to the world:

default proto bgp metric 20
	nexthop via 169.254.0.1 dev em1 weight 1 onlink
	nexthop via 169.254.0.1 dev em2 weight 1 onlink
10.101.155.0/24 proto bgp metric 20
	nexthop via 169.254.0.1 dev em1 weight 1 onlink
	nexthop via 169.254.0.1 dev em2 weight 1 onlink
10.101.246.0/24 dev em3 proto kernel scope link src 10.101.246.11
169.254.0.0/16 dev em3 scope link metric 1002
169.254.0.0/16 dev em1 scope link metric 1003
169.254.0.0/16 dev em2 scope link metric 1005
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1

What happened?

Having a similar problem as #982

running

# kubeadm config images pull
unable to select an IP from default routes.
# kubeadm config images pull -v 10
I1003 20:56:35.880474   87226 interface.go:360] Looking for default routes with IPv4 addresses
I1003 20:56:35.880550   87226 interface.go:365] Default route transits interface "em1"
I1003 20:56:35.881831   87226 interface.go:174] Interface em1 is up
I1003 20:56:35.881925   87226 interface.go:222] Interface "em1" has 1 addresses :[fe80::266e:96ff:fe5f:7b48/64].
I1003 20:56:35.881957   87226 interface.go:189] Checking addr  fe80::266e:96ff:fe5f:7b48/64.
I1003 20:56:35.881989   87226 interface.go:202] fe80::266e:96ff:fe5f:7b48 is not an IPv4 address
I1003 20:56:35.882027   87226 interface.go:360] Looking for default routes with IPv6 addresses
I1003 20:56:35.882051   87226 interface.go:376] No active IP found by looking at default routes
unable to select an IP from default routes.

The same error happens when running kubeadm init.

What you expected to happen?

I expect kubeadm to work and pull the images or perform an init.
Maybe there could be a way to specify my host's IP address or interface.

How to reproduce it (as minimally and precisely as possible)?

Anything else we need to know?

@neolit123 added the kind/bug, help wanted, and area/ecosystem labels Oct 3, 2018
@timothysc added this to the v1.13 milestone Oct 4, 2018
@timothysc removed the help wanted label Oct 4, 2018
@timothysc
Member

@bart0sh @kad - Do you have any network setups that are similar to this?

@timothysc assigned timothysc and unassigned liztio Oct 4, 2018
@timothysc added the priority/awaiting-more-evidence label Oct 4, 2018
@timothysc
Member

@scheuk what's the locally resolvable IPv4 address for that host?
Can you add it to /etc/hosts and specify it in the kubeadm config?

/cc @kubernetes/sig-network-bugs

@k8s-ci-robot added the sig/network label Oct 4, 2018
@rosskukulinski

@timothysc do we know if the api-server will work with this kind of environment? I'm just wondering if we get kubeadm to work in this network setup, are we going to run into more problems down the road?

@timothysc
Member

If you have a specific ethernet adapter that wraps the details and that you can bind to, or an /etc/hosts override, I would think it should "just work".

@scheuk
Author

scheuk commented Oct 4, 2018

@timothysc
My local host IP sits on lo (labelled lo:0):

# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 10.101.228.11/32 brd 10.101.228.11 scope global lo:0
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever

It is the 10.101.228.11 IP address.

I've spent the morning making kubeadm init work.
I was able to add this to the config, and init worked like a charm.

api:
  advertiseAddress: 10.101.228.11

However, I'm still stuck: kubeadm still wants an IP from the default route for both the kubeadm config images pull command
and kubeadm token create --print-join-command, which I use to get the join command for adding worker nodes to the cluster.
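
For reference, the config fragment above can be written out as a complete file. This is only a sketch: the `apiVersion` and `kind` values are my assumption for kubeadm v1.11 (they differ in later releases), and `write_kubeadm_config` and the output path are hypothetical names used for illustration.

```shell
#!/bin/sh
# Sketch: write a minimal kubeadm config that pins the advertise address.
# ASSUMPTIONS: apiVersion/kind match kubeadm v1.11 (v1alpha2 MasterConfiguration);
# write_kubeadm_config and the output path are hypothetical.
write_kubeadm_config() {
    addr="$1"   # IPv4 address bound to the loopback interface
    out="$2"    # destination file
    cat > "$out" <<EOF
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
api:
  advertiseAddress: ${addr}
EOF
}

conf_path="${TMPDIR:-/tmp}/kubeadm-example.conf"
write_kubeadm_config 10.101.228.11 "$conf_path"
cat "$conf_path"
```

The resulting file would then be passed with --config, as in kubeadm init --config /path/to/file.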

@scheuk
Author

scheuk commented Oct 4, 2018

@timothysc
Can you explain the /etc/hosts override?

I just attempted to add "10.101.228.11 hostname" to /etc/hosts,
but kubeadm is still looking at the interface with the route.

@timothysc
Member

So this has to do with the specifics of your networking configuration, coupled with the default behavior of the system.

I'm fairly certain if you were to make a bridged adapter to bind the IPv4 address to, vs. being bound to loopback, it would detect properly.

I'm digging right now to see if there is a global override for the NIC that works across all subcommands.

@timothysc
Member

So it would be a combination of --apiserver-advertise-address and --hostname-override. I'm not certain all those options percolate to all the subcommands; still digging.

@timothysc
Member

@scheuk could you run

kubeadm config images pull --config yourconfig.yaml

with your config file specifying advertiseAddress?

@timothysc
Member

You may need to pass in --config for every sub-command because of the way your network is set up.

@mauilion

mauilion commented Oct 4, 2018

you can run kubeadm token create --print-join-command from any host or a pipeline with the --kubeconfig flag.

A possible workaround for the images pull problem, given the config, would be something like:
kubeadm config images list | xargs -n1 -I {} docker pull {}
run prior to init

@scheuk
Author

scheuk commented Oct 4, 2018

@timothysc @mauilion

No go on --kubeconfig:

# kubeadm token create --print-join-command --kubeconfig /etc/kubernetes/admin.conf
unable to select an IP from default routes.

also if I try with --config it says you can't combine those two:

# kubeadm token create --print-join-command --config /etc/kubernetes/kubeadm.conf
can not mix '--config' with arguments [print-join-command]

This worked, I'll update my deployment to do this

# kubeadm config images pull --config /etc/kubernetes/kubeadm.conf

@mauilion

mauilion commented Oct 4, 2018

I mean that you can use kubeadm token create from a machine with a default route, like your laptop, as long as it has access to the running apiserver and an admin-level kubeconfig.

@timothysc
Member

@scheuk

# kubeadm token create --print-join-command --config /etc/kubernetes/kubeadm.conf 
can not mix '--config' with arguments [print-join-command]

is a minor bug that we can fix in 1.13

Are you still blocked?

@scheuk
Author

scheuk commented Oct 4, 2018

@timothysc

I am still blocked.
We use ansible to perform all these steps and I currently execute
kubeadm token create --print-join-command on the first master to get the command to join the worker nodes to the cluster. However, I may be able to temporarily unblock myself by doing what @mauilion says and setting up kubeadm locally (where ansible is run from) to perform that action.

Thanks for all the help so far!

@timothysc
Member

We use ansible to perform all these steps and I currently execute
kubeadm token create --print-join-command on the first master to get the command to join the worker nodes to the cluster.

The output of init contains the command you should execute on the other nodes.

@mauilion

mauilion commented Oct 4, 2018

This is normally done in ansible, as the token is short-lived and it's easier to capture the join command output from token create rather than from init.

@scheuk
Author

scheuk commented Oct 4, 2018

Also it's a little bit harder to parse with all the other output and spacing ;)

@timothysc
Member

timothysc commented Oct 4, 2018

Also it's a little bit harder to parse with all the other output and spacing ;)

I've done so much sed & awk in my life I'm probably too desensitized ;-)
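
For anyone who does want to parse the init output, here is a rough sed sketch. The sample text is hypothetical (the token and hash are placeholders), `extract_join_command` is a made-up helper name, and it assumes the join command is printed on a single line.

```shell
#!/bin/sh
# Sketch: pull the "kubeadm join ..." line out of captured `kubeadm init` output.
# extract_join_command is a hypothetical helper; it reads the output on stdin.
extract_join_command() {
    # Print the first line that starts (after optional whitespace) with
    # "kubeadm join", dropping the leading whitespace.
    sed -n 's/^[[:space:]]*\(kubeadm join .*\)$/\1/p' | head -n 1
}

# Hypothetical sample of captured output; token and hash are placeholders.
sample_output='Your Kubernetes master has initialized successfully!

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 10.101.228.11:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:0000
'
printf '%s\n' "$sample_output" | extract_join_command
```

If the command were wrapped across several lines, this would only capture the first of them, which is one more reason capturing it from token create is simpler.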

@scheuk
Author

scheuk commented Oct 4, 2018

So the local machine where I would run kubeadm from ansible is a Mac.
According to the kubeadm install page, Mac OS X isn't supported.

@kad
Member

kad commented Oct 8, 2018

@kad our network team hasn't quite moved to IPv6 yet, but they have bought into Cumulus Linux routing on the host using the IPv6 link-local addresses and the RFC mentioned above.
Here's a link to how it works: https://docs.cumulusnetworks.com/display/ROH/Routing+on+the+Host
(see the BGP and OSPF Unnumbered Interfaces section).

@scheuk thanks for the clarifications. So, to summarize the whole picture (to check that I understand your setup completely):

  • your host has one (or more?) /32 IPv4 addresses on the lo interface.
  • you don't have unicast IPv6 /128 addresses on lo (analogous to the IPv4 ones).
  • network interfaces use link-local addresses for both IPv4 and IPv6.
  • OSPF/BGP is used to announce real routes to the host, and the host's /32 or /128 addresses are announced back to the TORs.

Is that correct?

If so, can you share one more output, that of ip ro get 8.8.8.8?
(Instead of 8.8.8.8 you can use any unicast IP. I'm trying to understand what the kernel will use as the outgoing source address for the default route and for the routes that you got over OSPF/BGP, outside of your cluster IP range.)

@kad
Member

kad commented Oct 9, 2018

@scheuk you can try the patch from #69578 to see if it works in your setup.
If you need some help, I can provide a built binary with that patch applied.

@scheuk
Author

scheuk commented Oct 9, 2018

@kad your understanding of our setup is correct.
BGP does announce other routes as well, it's configured to pick up local blackhole routes and announce them, but that is for POD connectivity vs host connectivity.

Here's the output of ip ro get on the host:

# ip ro get 8.8.8.8
8.8.8.8 via 169.254.0.1 dev em2 src 10.101.228.11
    cache

I'll attempt to test the patch from #69578 as well and let you know
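
As a side note, the source address in that ip ro get output can be picked out with awk. This is just a sketch; `route_src` is a hypothetical helper name, and the sample input is the output shown above.

```shell
#!/bin/sh
# Sketch: print the source address the kernel would use to reach a destination.
# route_src (hypothetical helper) reads `ip route get <dst>` output on stdin
# and prints the word that follows "src".
route_src() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Sample input copied from the output above; on a live host this would be:
#   ip route get 8.8.8.8 | route_src
printf '%s\n' '8.8.8.8 via 169.254.0.1 dev em2 src 10.101.228.11' '    cache' | route_src
```

On this host, the address it prints is the same one set as advertiseAddress in the config.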

@scheuk
Author

scheuk commented Oct 9, 2018

@kad can you send me a binary? It might take less time than me setting up go and figuring out how to add a patch :)

@kad
Member

kad commented Oct 9, 2018

@kad can you send me a binary? It might take less time than me setting up go and figuring out how to add a patch :)

Try http://orava.kad.name/kubeadm/kubeadm-69578
This kubeadm is built from the master branch, but it should at least be OK for trying in your setup.

@scheuk
Author

scheuk commented Oct 9, 2018

@kad looking good:

# ./kubeadm-69578 config images pull -v 10
I1009 21:53:05.336396   47234 interface.go:384] Looking for default routes with IPv4 addresses
I1009 21:53:05.336485   47234 interface.go:389] Default route transits interface "em1"
I1009 21:53:05.337591   47234 interface.go:196] Interface em1 is up
I1009 21:53:05.337687   47234 interface.go:244] Interface "em1" has 1 addresses :[fe80::266e:96ff:fe5f:7b48/64].
I1009 21:53:05.337721   47234 interface.go:211] Checking addr  fe80::266e:96ff:fe5f:7b48/64.
I1009 21:53:05.337742   47234 interface.go:224] fe80::266e:96ff:fe5f:7b48 is not an IPv4 address
I1009 21:53:05.337768   47234 interface.go:398] Default route exists for IPv4, but interface "em1" does not have unicast addresses. Checking loopback interface
I1009 21:53:05.338779   47234 interface.go:196] Interface lo is up
I1009 21:53:05.338884   47234 interface.go:244] Interface "lo" has 4 addresses :[127.0.0.1/8 10.101.228.11/32 192.0.2.1/24 ::1/128].
I1009 21:53:05.338918   47234 interface.go:211] Checking addr  127.0.0.1/8.
I1009 21:53:05.338958   47234 interface.go:221] Non-global unicast address found 127.0.0.1
I1009 21:53:05.338977   47234 interface.go:211] Checking addr  10.101.228.11/32.
I1009 21:53:05.338995   47234 interface.go:218] IP found 10.101.228.11
I1009 21:53:05.339025   47234 interface.go:250] Found valid IPv4 address 10.101.228.11 for interface "lo".
I1009 21:53:05.339044   47234 interface.go:404] Found active IP 10.101.228.11 on loopback interface
I1009 21:53:05.339186   47234 version.go:156] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.txt
I1009 21:53:05.687024   47234 feature_gate.go:206] feature gates: &{map[]}
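
To make the fallback in that log easier to follow, here is a rough shell illustration of the idea: scan the loopback's addresses and take the first usable IPv4 one. It is a simplified sketch, not the code from the patch; `first_global_ipv4` is a hypothetical name, and it only filters out IPv6, 127/8, and 169.254/16.

```shell
#!/bin/sh
# Sketch of the fallback idea: given the addresses on an interface (in CIDR
# form), print the first IPv4 address that is neither loopback (127/8) nor
# link-local (169.254/16). Illustration only, not the code from the patch.
first_global_ipv4() {
    for cidr in "$@"; do
        addr="${cidr%/*}"            # strip the prefix length
        case "$addr" in
            *:*)       continue ;;   # IPv6, skip
            127.*)     continue ;;   # IPv4 loopback
            169.254.*) continue ;;   # IPv4 link-local
            *.*.*.*)   printf '%s\n' "$addr"; return 0 ;;
        esac
    done
    return 1                         # nothing usable found
}

# Addresses reported for "lo" in the log above.
first_global_ipv4 127.0.0.1/8 10.101.228.11/32 192.0.2.1/24 ::1/128
```

With the addresses from the log, this picks 10.101.228.11, matching the "Found active IP" line.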

@kad
Member

kad commented Oct 9, 2018

Good. So, please comment on the PR :)

@timothysc
Member

/cc @rdodev - regarding cli-arg issue(s).

@timothysc assigned neolit123 and unassigned timothysc Oct 31, 2018
@neolit123
Member

related PR for this is in flight by @kad but reviews are pending:
kubernetes/kubernetes#69578

@timothysc
Member

/assign @rdodev

Let's chat in the morning on this one.

@timothysc assigned liztio and unassigned kad and neolit123 Nov 9, 2018
@rdodev

rdodev commented Nov 9, 2018

# kubeadm token create --print-join-command --config /etc/kubernetes/kubeadm.conf
can not mix '--config' with arguments [print-join-command]

In terms of the CLI, this has already been taken care of, @timothysc:

https://github.com/kubernetes-csi/driver-registrar/blob/87d0059110a8b4a90a6d2b5a8702dd7f3f270b80/vendor/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/validation/validation.go#L375

@timothysc
Member

gr8!

@dancarneiro

Just put the master's IP in place of $(hostname -i).
For example:
kubeadm init --apiserver-advertise-address 192.168.1.2

@rijuchatterjee

Even with kubeadm init --apiserver-advertise-address, kubeadm join does not work. Has anyone been able to join a cluster using --apiserver-advertise-address?

@asher-lab

I fixed mine by not using a 127.0.x.x address for --apiserver-advertise-address and --apiserver-cert-extra-sans.

Try using an address such as 10.0.0.10 instead.

Example:

IPADDR="10.0.0.10"
NODENAME=$(hostname -s)
POD_CIDR="192.168.0.0/16"

sudo kubeadm init --apiserver-advertise-address="$IPADDR" --apiserver-cert-extra-sans="$IPADDR" --pod-network-cidr="$POD_CIDR" --node-name "$NODENAME" --ignore-preflight-errors Swap
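
The point about avoiding 127.0.x.x could be turned into a small check before calling init. This is only a sketch; `check_advertise_addr` is a hypothetical helper and not part of kubeadm.

```shell
#!/bin/sh
# Sketch: refuse a loopback address before passing it to kubeadm init.
# check_advertise_addr is a hypothetical helper, not part of kubeadm.
check_advertise_addr() {
    case "$1" in
        127.*) echo "refusing loopback address: $1" >&2; return 1 ;;
        *)     return 0 ;;
    esac
}

IPADDR="10.0.0.10"
if check_advertise_addr "$IPADDR"; then
    echo "ok to use $IPADDR"
fi
```

It only catches the 127/8 case mentioned above; it does not verify that the address is actually configured on the host.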
