
Ability to use existing VPC subnets #671

Closed · wants to merge 8 commits

Conversation

@cknowles commented Sep 15, 2016

We want to manage our VPC separately, so this adds support for deploying to existing VPC subnets.

Inspired by @eugenetaranov's existing work in #212, since it avoids the manual edit of the generated stack template.

@colhom (Contributor) commented Sep 15, 2016

@c-knowles give me a few days to digest this PR. I haven't figured out where I stand on kube-aws deploying to existing subnets.

@cknowles (Author)

Sure. To use the option, just put the subnet IDs in as per the cluster template changes. Are there other places enforcing that the specified options are valid as a set? For instance, specifying the subnet IDs without specifying the VPC ID probably isn't a combination we'd want to support.
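For reference, a rough sketch of how that could look in cluster.yaml; the key names here (`vpcId`, `subnets[].id`) are assumptions based on this discussion rather than the final schema:

```yaml
# Illustrative only - key names are assumed, not the merged schema.
vpcId: vpc-0123456789abcdef0          # VPC managed outside kube-aws
subnets:
  - availabilityZone: eu-west-1a
    id: subnet-0123456789abcdef0      # existing subnet to deploy into
  - availabilityZone: eu-west-1b
    id: subnet-0123456789abcdef1
```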

The previous workarounds I've seen meant that kube-aws still created the subnets, and the stack template then had to be edited to accommodate the existing route tables.

I had considered #212 and whether to just add that network setup as an option for a typical configuration, i.e. build it directly into kube-aws. However, one major advantage of doing it this way for us is that we can then share the VPC between different systems, which in turn cuts the cost of running several NAT Gateways. So we'd at least need the option to turn off network setup altogether.

What I'm wondering now is whether it would be best to split the network setup out into a separate stack inside kube-aws and then give the user the choice of which one they want, and whether to create one at all. In the first instance we could support the NAT Gateway scenario and the default all-public one that already exists. The network stack could be created first and the IDs exported from it into the cluster.yaml file. I think this would then simplify the K8S stack setup, because it would always take a list of IDs for the VPC no matter which option the user chooses.
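To make the separate network stack idea concrete, here is a rough sketch of what that stack's outputs might look like (plain CloudFormation; the resource and export names are invented for this sketch). The exported subnet IDs would then be copied into cluster.yaml, or later consumed via cross-stack references:

```yaml
# Illustrative network-stack outputs - resource and export names are made up.
Outputs:
  PrivateSubnet0Id:
    Description: First private subnet, shared across systems behind one NAT Gateway
    Value: !Ref PrivateSubnet0
    Export:
      Name: shared-network-PrivateSubnet0Id
  PrivateSubnet1Id:
    Description: Second private subnet
    Value: !Ref PrivateSubnet1
    Export:
      Name: shared-network-PrivateSubnet1Id
```

The Kubernetes stack would then just consume a list of subnet IDs, whichever network option was chosen.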

Chris Knowles added 3 commits September 19, 2016 13:38

- Only check if subnets have a zone or ID; it seems the simplest way to re-use all the existing validation.
- On second thought, availability zone is required to define the scaling group, so make sure we always have it.
@cknowles (Author) commented Sep 20, 2016

I'm having a little trouble checking how well this works on latest master; it seems to have an issue with controller startup. I thought it was my new config at first, but I reverted to the old setup (kube-aws creating the VPC etc.) and the controller logs are saying this over and over:

Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Failed to start install-kube-system.service.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Unit entered failed state.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Failed with result 'exit-code'.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Service hold-off time over, scheduling restart.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Stopped install-kube-system.service.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal curl[9476]:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal curl[9476]:                                  Dload  Upload   Total   Spent    Left  Speed
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal curl[9476]: [149B blob data]
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Starting install-kube-system.service...
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Control process exited, code=exited status=7
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Failed to start install-kube-system.service.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Unit entered failed state.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Failed with result 'exit-code'.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Service hold-off time over, scheduling restart.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Stopped install-kube-system.service.
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Starting install-kube-system.service...
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal curl[9479]:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Sep 20 10:10:28 ip-10-0-0-50.eu-west-1.compute.internal curl[9479]:                                  Dload  Upload   Total   Spent    Left  Speed

@colhom which branch is best to go with if I want to ensure stability? I will try another cluster shortly using the latest published kube-aws just to check if it's a more general problem with the OS updates or similar.

EDIT: Latest master seems to work if releaseChannel: alpha is specified in cluster.yaml
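For anyone hitting the same thing, this was the only change needed to my cluster.yaml; the key is the release channel setting mentioned above:

```yaml
# cluster.yaml - switch the CoreOS release channel used for the nodes
releaseChannel: alpha
```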

@cknowles (Author)

Interestingly, the latest published build of kube-aws (0.8.1) does exactly the same as above on the controller but eventually seems to recover:

...
Sep 21 03:01:43 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Control process exited, code=exited status=7
Sep 21 03:01:43 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Failed to start install-kube-system.service.
Sep 21 03:01:43 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Unit entered failed state.
Sep 21 03:01:43 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Failed with result 'exit-code'.
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: install-kube-system.service: Service hold-off time over, scheduling restart.
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Stopped install-kube-system.service.
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Starting install-kube-system.service...
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:                                  Dload  Upload   Total   Spent    Left  Speed
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]: [158B blob data]
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]: {
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   "major": "1",
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   "minor": "3",
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   "gitVersion": "v1.3.4+coreos.0",
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   "gitCommit": "be9bf3e842a90537e48361aded2872e389e902e7",
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   "gitTreeState": "clean",
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   "buildDate": "2016-08-02T00:54:53Z",
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   "goVersion": "go1.6.2",
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   "compiler": "gc",
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]:   "platform": "linux/amd64"
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal systemd[1]: Started install-kube-system.service.
Sep 21 03:01:44 ip-10-0-0-50.eu-west-1.compute.internal curl[11619]: }

Chris Knowles added 3 commits September 21, 2016 12:21

- If our workers are in private subnets, we may wish to place the controller in a public subnet to access the dashboard without any tunnelling.
@cknowles (Author)

@colhom based on some initial usage of this, perhaps allowing the use of existing route tables would be a better way to go, so we have less chance of conflicting private IPs. We could even support the new CloudFormation cross-stack references as well. Any thoughts on that?
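Purely to illustrate the route table idea (the `routeTableId` key here is hypothetical and not something this PR implements): kube-aws would keep creating the subnets with non-conflicting CIDRs, but attach them to a route table managed outside the stack:

```yaml
# Hypothetical cluster.yaml sketch - routeTableId and the other key names are assumed,
# shown only to illustrate reusing an externally managed route table.
vpcId: vpc-0123456789abcdef0
routeTableId: rtb-0123456789abcdef0   # existing route table with NAT Gateway / IGW routes
subnets:
  - availabilityZone: eu-west-1a
    instanceCIDR: 10.0.1.0/24          # kube-aws still creates the subnets themselves
  - availabilityZone: eu-west-1b
    instanceCIDR: 10.0.2.0/24
```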

@cknowles (Author) commented Oct 7, 2016

Closing this for now. As described in #716, I believe using existing route tables rather than subnets is a better solution.
