This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Add support for customization of network topologies #284

Merged

Conversation

@mumoshu (Contributor) commented Jan 26, 2017

This change allows us to define private and public subnets at the top level of cluster.yaml and choose them for worker/controller/etcd nodes and the controller load balancer.

Thanks to @neoandroid and @Sasso for submitting pull requests #169 and #227 respectively, which formed the basis of this feature.

Let me also add that several resources, including subnets, NAT gateways, and route tables, can now be reused by specifying id or idFromStackOutput.
Thanks to @icereval for his PR #195, which first introduced the idea of the Identifier type to add support for existing AWS resources in a universal way.

A minimal config utilizing this feature would look like:

clusterName: mycluster
externalDNSName: mycluster.example.com
hostedZoneId: yourhostedzoneid
keyName: yourkeyname
kmsKeyArn: "arn:aws:kms:<region>:<account>:key/<id>"
region: <region>
createRecordSet: true
experimental:
  waitSignal:
    enabled: true
subnets:
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
- name: private2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
- name: public1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.3.0/24"
- name: public2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.4.0/24"
controller:
  subnets:
  - name: private1
  - name: private2
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: private1
  - name: private2
worker:
  subnets:
  - name: public1
  - name: public2

This will create 2 private subnets and 2 public subnets. The private ones are used by etcd and controller nodes, and the public ones by worker nodes and the controller load balancer.

It is flexible enough to use different private subnets for etcd and controllers:

subnets:
- name: etcdPrivate1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
- name: etcdPrivate2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
- name: controllerPrivate1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.3.0/24"
  private: true
- name: controllerPrivate2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.4.0/24"
  private: true
- name: public1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.5.0/24"
- name: public2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.6.0/24"
controller:
  subnets:
  - name: controllerPrivate1
  - name: controllerPrivate2
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: etcdPrivate1
  - name: etcdPrivate2
worker:
  subnets:
  - name: public1
  - name: public2

It even supports using existing subnets by specifying subnet IDs:

subnets:
- name: private1
  id: subnet-12345a
  availabilityZone: ap-northeast-1a
  private: true
- name: private2
  id: subnet-12345b
  availabilityZone: ap-northeast-1c
  private: true
- name: public1
  id: subnet-12345c
  availabilityZone: ap-northeast-1a
- name: public2
  id: subnet-12345d
  availabilityZone: ap-northeast-1c
controller:
  subnets:
  - name: private1
  - name: private2
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: private1
  - name: private2
worker:
  subnets:
  - name: public1
  - name: public2

Or you can import subnet IDs from other CloudFormation stack(s):

subnets:
- name: private1
  idFromStackOutput: myinfrastack-privateSubnet1
  availabilityZone: ap-northeast-1a
  private: true
- name: private2
  idFromStackOutput: myinfrastack-privateSubnet2
  availabilityZone: ap-northeast-1c
  private: true
- name: public1
  idFromStackOutput: myinfrastack-publicSubnet2
  availabilityZone: ap-northeast-1a
- name: public2
  idFromStackOutput: myinfrastack-publicSubnet3
  availabilityZone: ap-northeast-1c
controller:
  subnets:
  - name: private1
  - name: private2
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: private1
  - name: private2
worker:
  subnets:
  - name: public1
  - name: public2

@mumoshu (Contributor, Author) commented Jan 26, 2017

Let me also add that several resources, including subnets, NAT gateways, and route tables, can now be reused by specifying id or idFromStackOutput.
Thanks to @icereval for his PR #195, which first introduced the idea of the Identifier type to add support for existing AWS resources in a universal way.

@codecov-io commented Jan 26, 2017

Codecov Report

Merging #284 into master will decrease coverage by 2.45%.

@@            Coverage Diff             @@
##           master     #284      +/-   ##
==========================================
- Coverage   57.53%   55.08%   -2.45%     
==========================================
  Files           6        6              
  Lines        1288     1387      +99     
==========================================
+ Hits          741      764      +23     
- Misses        449      505      +56     
- Partials       98      118      +20
Impacted Files Coverage Δ
config/config.go 64.63% <35.24%> (-8.06%)

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4cefe47...1b75cbe. Read the comment docs.

@redbaron (Contributor) commented Jan 26, 2017

May I propose a slightly different structure, which matches the underlying CF structure more closely?

  1. as subnet names must be unique, the top-level subnets: should be turned into a dictionary
  2. any reference to a subnet in stack-template.json should be done by its name: { "Ref": "Subnet0" }
  3. subnets which have no id field are created by kube-aws
  4. subnets WITH id fields are passed as typed default params in the CF template. You'll get free validation as a bonus:
"Parameters": {
  "Subnet1": {
     "Type": "AWS::EC2::SubnetId", 
     "Default": "subnet-xxyyzz"
  }
}

Therefore cluster.yaml would look something like this:

subnets:
  Subnet0:
    availabilityZone: us-west-2a
    CIDR: 10.10.10.0/24
  Subnet1:
    id: subnet-xxyyzz
  5. To keep backwards compatibility, the old 'subnets' key in cluster.yaml can be converted to the new one internally based on its type: if it is a list, then convert it into a map with 'Subnet0', 'Subnet1', 'Subnet2' keys (sketched below).
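To illustrate that conversion (a sketch only; the Subnet0/Subnet1 keys are just the auto-generated names proposed above):

```yaml
# today's list form:
subnets:
- availabilityZone: us-west-2a
  instanceCIDR: "10.0.1.0/24"
- id: subnet-xxyyzz
  availabilityZone: us-west-2c
---
# internally converted into the proposed map form:
subnets:
  Subnet0:
    availabilityZone: us-west-2a
    instanceCIDR: "10.0.1.0/24"
  Subnet1:
    id: subnet-xxyyzz
    availabilityZone: us-west-2c
```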

@mumoshu (Contributor, Author) commented Jan 26, 2017

@redbaron thanks as always for your comments!

  • IMHO 1. should be an optional structure for defining subnets because name isn't a required param. I'm considering extending the feature to provide a less verbose way of doing the same setup as above, something like:
subnets:
- availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
- availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
- availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.3.0/24"
- availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.4.0/24"
controller:
  private: true
  loadBalancer:
    private: false
etcd:
  private: true
worker:
  private: false

i.e. we can omit names if we're not interested in them. This extension doesn't seem to fit well with the subnets-as-a-hash syntax, does it?

  • I believe 2. and 3. are already implemented as you've proposed, and therefore 4. could be implemented on top of this PR easily
    1. definitely seems to be the way to go for me too. However, not to make this PR too large, would you mind creating another issue for it?
  • I believe 1. and 6. should be implemented both/at once if 1. is necessary

@redbaron (Contributor) commented Jan 26, 2017

  1. should be an optional structure for defining subnets because name isn't a required param.

That is what I am proposing to change. You always refer to a subnet by name, so it is explicit what is created where, and it perfectly matches what happens in CF; that simplifies the kube-aws code because it doesn't need to translate one representation into another.

By listing subnets explicitly, you can have controller and etcd in the following setup pick subnets which might even be different from each other but both private. Or some subnets may already exist and some may not.

controller:
  private: true 
etcd:
  private: true

If we mandate a name for every subnet, whether it is created by kube-aws or already exists and is referenced by its ID, then the rest of stack-template.json can unconditionally use { "Ref": "<subnet_name>" }. It will work for both cases, because the CF Ref syntax for referencing resources within a template and for referencing input params is conveniently the same.

@mumoshu (Contributor, Author) commented Jan 26, 2017

Thanks again!

Regarding the latter part of your comment, I'm already aware of the convenience and efficiency of utilizing cfn parameters after reading through @icereval's great work, and I've already introduced parts of it via this PR. I'm open to a future PR to achieve what you suggested, hence my suggestion to open another issue dedicated to that 👍

I'll try to sort out all the issues eventually, but it is definitely impossible to manage them alone. Merging everyone's work and wishes so that they work together, while adding tests and refactoring to keep it maintainable, already took me several days!
Gradual changes made not only by the maintainer but by multiple contributors are the only way to sustainably keep improving kube-aws, IMHO.
Raising separate issues for problems which can be addressed independently is the first step toward that.
I know contributors like you are already helping kube-aws and me a lot, but I have to ask for even more.
Fire and motion with me!

@mumoshu (Contributor, Author) commented Jan 26, 2017

By listing subnets explicitly, you can have controller and etcd in the following setup pick subnets which might even be different from each other but both private. Or some subnets may already exist and some may not.

Yes, that is why I made it possible to reference subnets by name.
However, I'd still like to support both cases. The one with subnets whose names are omitted is definitely for demo purposes. The one with explicitly named subnets is for production purposes (hence implemented first in this PR). Thoughts?

@redbaron (Contributor) commented:

The one with explicitly named subnets is for production purposes (hence implemented first in this PR). Thoughts?

My concern about introducing names without making them keys in a hashmap is that it can naturally lead to duplicates, and you need yet another piece of Go code to validate and report an error if there are dupes. If you make all subnet names keys in a hashmap, then you get uniqueness for free; the data structure itself enforces the desired property. Apart from that, it all looks fine to me given the current state of the code.

In the background I am preparing a kube-aws overhaul which takes a different approach to the way it translates cluster.yaml into a CF template. I'll present it as a separate branch for discussion and collaboration once it achieves feature parity.

@mumoshu (Contributor, Author) commented Jan 27, 2017

@redbaron

My concern about introducing names without making them keys in a hashmap is that it can naturally lead to duplicates, and you need yet another piece of Go code to validate and report an error if there are dupes.

Certainly.

If you make all subnet names keys in a hashmap, then you get uniqueness for free; the data structure itself enforces the desired property

Definitely. Hmm, this is a bit of a hard decision for me to make alone at this stage. Let me leave comments requesting confirmation in the related GitHub issues.
Basically, if no one needs:

controller:
  private: true 
etcd:
  private: true

for demo purposes, we can safely switch to utilizing hashes as you've suggested.
Personally, I'd also prefer that if possible.

Btw,

Apart from that it all looks fine by me given the current state of the code.

Thanks for the kind words!

I'm also looking forward to seeing your overhaul work!
Will it include support for node pools defined inside the top-level cluster.yaml, which is what I'm going to tackle next?

@mumoshu (Contributor, Author) commented Jan 27, 2017

I've updated the description of this PR to give an overview of all the changes and improvements made.

cc @neoandroid @icereval @c-knowles @Sasso

@whereisaaron (Contributor) commented:

This all looks cool and flexible @mumoshu, thanks for your work! For my use cases it is considerably more flexibility than I need. The one capability here I'd definitely find useful, though, and will use if this goes ahead, is the ability to deploy private clusters with only the API and Ingress/Service load balancers public.

controller:
  loadBalancer:
    private: false

The cluster.yaml file is small enough that I remake it for every minor release. I don't want to miss some change or new feature, so I init every time and re-do the customization of cluster.yaml. Therefore I don't mind what cluster.yaml syntax gets changed between minor (x.y) releases.

@neoandroid (Contributor) commented:

@whereisaaron you can already achieve that scenario, since #169 was merged

@mumoshu (Contributor, Author) commented Jan 29, 2017

Hi @whereisaaron, thanks for the request!
If I understand correctly, I believe your case mentioned in #284 (comment) can already be achieved after this PR with:

subnets:
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
- name: private2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
- name: public1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.3.0/24"
- name: public2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.4.0/24"
controller:
  subnets:
  - name: private1
  - name: private2
  loadBalancer:
    private: false
   # Setting `loadBalancer.private` to false leads kube-aws to make it an `internet-facing` lb while choosing public subnets for the lb like
   # subnets:
   # - name: public1
   # - name: public2
etcd:
  subnets:
  - name: private1
  - name: private2
worker:
  subnets:
  - name: private1
  - name: private2

As can be seen in the above example, explicitly choosing private subnets in the subnets keys for worker, controller, and etcd while setting loadBalancer.private to false results in your desired setup.

@mumoshu (Contributor, Author) commented Jan 29, 2017

@whereisaaron Regarding your comment #169 (comment), would it be enough for your use-case to add an option to disable creation of a NAT gateway and a route to it like:

- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
  natGateway:
    create: false
- name: private2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
  natGateway:
    create: false

?

@mumoshu (Contributor, Author) commented Jan 29, 2017

@whereisaaron Ah, or you may even need to do something like:

- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
  natGateway:
    create: false
  # this, in combination with `natGateway.create=false`, implies that the route table already has a route to an existing NAT gateway
  routeTable:
    id: <id of your route table mentioned in https://github.com/coreos/kube-aws/pull/169#issuecomment-275895730>
- name: private2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
  natGateway:
    create: false
  # this, in combination with `natGateway.create=false`, implies that the route table already has a route to an existing NAT gateway
  routeTable:
    id: <id of your route table mentioned in https://github.com/coreos/kube-aws/pull/169#issuecomment-275895730>
# public subnets used for the external elb for api
- name: public1
  # snip
- name: public2
  # snip

natGateway.create is not yet implemented but it can be added easily if necessary.

routeTable.id for a subnet is already implemented in this PR but would you prefer the top-level routeTableId to be used as the default value for all the subnets (like before)?

Update:

Instead of:

natGateway:
    create: false

I made it:

natGateway:
    preconfigured: true

so that the zero value (= false) can be the default for the setting (which is now preconfigured), which makes the implementation a bit cleaner.
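In other words (a minimal sketch), omitting natGateway altogether is now equivalent to spelling out the default:

```yaml
# these two subnet definitions behave the same: leaving `natGateway` out
# keeps `preconfigured` at its zero value (false), i.e. kube-aws manages the NAT gateway
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
---
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
  natGateway:
    preconfigured: false
```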

@mumoshu (Contributor, Author) commented Jan 30, 2017

Replying to my own comment above.

would you prefer the top-level routeTableId to be used as the default value for all the subnets (like before)?

This should not be allowed, because then we wouldn't have a clear way to differentiate public and private route tables:

  • public route table(s) should be provided to/managed by kube-aws to place the external Kubernetes API load balancer in
  • private route table(s) should be provided to/managed by kube-aws to place private worker/etcd/controller nodes in

as implied by what @whereisaaron mentioned in #284 (comment).
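To illustrate (a sketch; the rtb-* IDs are placeholders), the per-subnet routeTable.id introduced in this PR keeps the two kinds of route tables clearly separated:

```yaml
subnets:
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
  routeTable:
    id: rtb-private1   # private route table routing through a NAT gateway
- name: public1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.3.0/24"
  routeTable:
    id: rtb-public1    # public route table routing through an Internet gateway
```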

@cknowles (Contributor) commented:

@mumoshu as per your comment, I think the above looks fine. My understanding of this change is that it adds more flexibility to subnet customisation. Currently, using a slightly older 0.9.2 release, it is possible to change the route table associated with subnets generated by kube-aws and therefore possible to make them private/public. I'm not sure whether #169 removed that and it's re-added here, or whether this just adds more flexibility; either way is fine. I'm still a bit concerned about how we'll ensure all these different options are well tested, but I don't know how we can deal with that other than cutting options and perhaps providing some network CF examples out of the box.

@whereisaaron (Contributor) commented:

Thanks for looking into this @mumoshu. It is no problem to specify the route table per subnet, so if routeTable.id is already in place for subnets then I can still ensure our region-level or VPC-level NAT instances are used, rather than these new per-deployment ones. That's great 🎉

So I think the only missing piece to retain the previous capabilities is to have some way to not have any NAT instances created, since they serve no purpose if you have specified routeTable.id and workerSecurityGroupIds for VPC-level or region-level NAT instances. If the proposed natGateway.create option would do that, then we should be sorted.

One question about the proposed natGateway.create: I think controller, etcd, and worker subnets all use the same NAT instances. So perhaps natGateway.create is effectively only a global/top-level yes/no choice rather than per-subnet?

@c-knowles creating private subnets with a specified route table is still possible in v0.9.3-rc.5 and it looks like this new way of doing things still retains or re-adds that capability via routeTable.id. The change, as I understand it, is that enabling private subnets now also triggers the creation of NAT instances. That's a great default for people creating a VPC and a first cluster. But if you have a few clusters you can't afford to create VPCs and/or NAT instances for every one, since the AWS limit is only 5 (in both cases).

@c-knowles as soon as there is an alpha/beta/rc release I'll get on with testing private subnet deployments and report back.

@mumoshu (Contributor, Author) commented Jan 30, 2017

@c-knowles @whereisaaron Thanks again!

The intention of this PR is to provide enough flexibility to, e.g., create worker/etcd/controller nodes in private/public subnet(s) and create an API load balancer in private/public subnet(s), plus reuse existing AWS resources including:

  • NAT gateways
  • subnets
  • route tables

where necessary, but I've begun to believe that I unintentionally broke your use-cases (= everything in private subnets with a smaller config?), not in functionality but in configuration syntax.
That isn't intended, at least for now.

You had been using a configuration like the one below to put all the nodes and LBs into private subnets, i.e. nothing (neither nodes nor LBs) in public subnets, right?

routeTableId: rtb-for-privatesubnet-withpreconfigurednat
mapPublicIPs: false
controllerLoadBalancerPrivate: true

If so, I'm considering improving this PR so that the above is translated to something like:

subnets:
- private: true
  natGateway:
    preconfigured: true
  routeTable:
    id: rtb-for-privatesubnet-withpreconfigurednat
controller:
  loadBalancer:
    private: true

// However, to be honest, such a translation could soon be deprecated and removed if it turns out to be too hard to maintain. Sorry!

Does it make sense to you two?

@mumoshu (Contributor, Author) commented Jan 30, 2017

@whereisaaron I've implemented natGateway.preconfigured instead of natGateway.create with the bool value inverted. Please see my updated comment at #284 (comment) for more explanation!

So I think the only missing piece to retain the previous capabilities is to have some way to not have any NAT instances created, since they serve no purpose if you have specified routeTable.id and workerSecurityGroupIds for VPC-level or region-level NAT instances. If the proposed natGateway.create option would do that, then we should be sorted.

Yes, I think so too, and it is addressed in my last commit 242783d, which adds subnets[].natGateway.preconfigured: true and requires subnets[].routeTable.id/idFromStackOutput to be specified.
Did I follow you correctly?

@mumoshu (Contributor, Author) commented Jan 30, 2017

@c-knowles @whereisaaron I've updated my comment #284 (comment) several times. Please re-read it if you came here via GitHub email notifications 😃

@mumoshu (Contributor, Author) commented Jan 30, 2017

@whereisaaron

One question about the proposed natGateway.create: I think controller, etcd, and worker subnets all use the same NAT instances. So perhaps natGateway.create is effectively only a global/top-level yes/no choice rather than per-subnet?

My assumption is that one or more subnets may rely on an NGW created by kube-aws while others rely on an NGW preconfigured for existing subnet(s). Providing the flexibility to customize the NGW per subnet supports that as well.
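For example (a sketch; the route table ID is a placeholder), such a mixed setup could look like:

```yaml
subnets:
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
  # no natGateway/routeTable settings: kube-aws creates and manages an NGW for this subnet
- name: private2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
  natGateway:
    preconfigured: true    # rely on an NGW that already exists
  routeTable:
    id: rtb-existingnat    # route table already routing through that NGW
```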

However, I believe it is true in some respects that a setting like natGateway.preconfigured should optionally be settable at the top level of cluster.yaml, so that one doesn't need to write a verbose configuration like #284 (comment) in some cases; hence my previous comment #284 (comment).

@cknowles (Contributor) commented Jan 30, 2017

@whereisaaron @mumoshu thanks, it seems my use case is still accommodated. Specifically, the use case is detailed in #44. Summary:

  1. Existing VPC to share NATs and also connect through to some other components inside the same VPC, such as RDS deployments. Shared NATs are mainly for cost reasons, as there is not much benefit to having separate managed NATs for the different clusters we have.
  2. Private subnets for workers as best practice (none of the services need direct exposure). Ideally generated by kube-aws, as we have a few k8s clusters and adding/removing them in the VPC separately seems like the wrong place.
  3. Controller nodes and ELBs are all still in public subnets (but later I'd like the controllers in private subnets too).

So we end up with the solution where kube-aws creates the subnets and links them to existing route tables, which are easily defined in a shared VPC (some private with NAT and some public with IGW).

The YAML setup looks a bit like this:

subnets:
  - availabilityZone: eu-west-1a
    instanceCIDR: "10.0.1.0/24"
    # private route table to NAT AZ a
    routeTableId: rtb-ID1
  - availabilityZone: eu-west-1b
    instanceCIDR: "10.0.2.0/24"
    # private route table to NAT AZ b
    routeTableId: rtb-ID2
  - availabilityZone: eu-west-1c
    instanceCIDR: "10.0.3.0/24"
    # private route table to NAT AZ c
    routeTableId: rtb-ID3
controllerIP: 10.0.4.50
controllerSubnets:
  - availabilityZone: eu-west-1a
    instanceCIDR: "10.0.4.0/24"
    # public route table
    routeTableId: rtb-ID4
  - availabilityZone: eu-west-1b
    instanceCIDR: "10.0.5.0/24"
    # public route table
    routeTableId: rtb-ID4
  - availabilityZone: eu-west-1c
    instanceCIDR: "10.0.6.0/24"
    # public route table
    routeTableId: rtb-ID4

We have a similar setup with node pools as well; the config is essentially the same. @mumoshu I'm happy for you to break config compatibility to keep kube-aws simple; it's trivial to move config around and we could even write a small version migration script.

@mumoshu (Contributor, Author) commented Jan 30, 2017

@c-knowles @whereisaaron To sync up, let me confirm that you've never tried to do something like:

routeTableId: rtb-for-privatesubnet-withpreconfigurednat
mapPublicIPs: false
# !!!!!
controllerLoadBalancerPrivate: false

which is IMHO not recommended because it shouldn't work if you configured your existing AWS resources properly.

My reasoning here is that:

  • mapPublicIPs: false means you wanted kube-aws to create private subnets for all the nodes, while
  • controllerLoadBalancerPrivate: false means you also wanted it to create a k8s API ELB in public subnets.
  • However, the public subnets required by an external ELB aren't created by kube-aws because you've set mapPublicIPs: false. Am I correct?

Anyway, FYI, this use-case is intended to be newly supported via this PR with configuration like that explained in #284 (comment).

@mumoshu (Contributor, Author) commented Jan 30, 2017

@c-knowles Thanks as always!
Just a quick response to #284 (comment), but what is controllerIP: 10.0.4.50 and what is its intention?

@mumoshu (Contributor, Author) commented Jan 30, 2017

@c-knowles @whereisaaron And thanks for the kind words about following breaking changes in configuration syntax! It will definitely accelerate the development of kube-aws.

@cknowles (Contributor) commented:

@mumoshu for your first comment, I'm not using private controller ELBs at all right now, although if that's the preferred way once I add a bastion host then I will use it (happy to go with best practice). For your second comment, I believe there was previously a bug in kube-aws which meant controllerIP had to be specified if we customised the subnets, so I preferred to be a bit more verbose in the config to avoid further issues; I think that goes away with HA controllers.

@whereisaaron (Contributor) commented:

Wow, thanks for the quick work to add natGateway.preconfigured! As soon as there is a test release I'll recreate some test clusters with the new syntax. It sounds like the existing use case will be covered again now 🎉

This is my current successful use case under 0.9.3. Same settings for the cluster and all node-pools.

vpcId: vpc-12345678
routeTableId: rtb-12345678
mapPublicIPs: false
workerSecurityGroupIds:
  - sg-12345678

I deploy everything with private subnets and then get k8s to create public ELBs for Services and Ingress Controllers that should be exposed. Kubernetes specifically supports private subnet clusters with public ELBs with the kubernetes.io/role/elb subnet tag, which you apply to the (possibly empty) public subnets you want to use for public ELBs. You tag the public subnet you want to use for ELBs in each relevant AZ with:

| Tag | Value |
| --- | --- |
| KubernetesCluster | your-cluster-name |
| kubernetes.io/role/elb | |

When Kubernetes is creating a public ELB, it collects all the public subnets associated with the cluster (tagged KubernetesCluster: cluster-name) and picks the first one in each AZ tagged kubernetes.io/role/elb, or if none are tagged kubernetes.io/role/elb it just picks the first public subnet. (It determines 'public' subnets by checking the route table for an 'igw*' entry.) The relevant code is here.

I keep the controller API ELB private, but you can certainly have controllers in private subnets with a public API ELB. You (or kube-aws) just need to create one public subnet (or one per AZ) for the ELB.

For mapPublicIPs: false deployments with private subnets, kube-aws could leverage the built-in k8s support for this by creating and tagging an empty public subnet (per AZ) for public LoadBalancer Services.
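For example (a sketch in CloudFormation YAML; the logical name, VPC ID, and CIDR are placeholders), such an empty, tagged public subnet could look like:

```yaml
PublicELBSubnet1a:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: vpc-12345678
    AvailabilityZone: eu-west-1a
    CidrBlock: 10.0.100.0/24
    Tags:
    - Key: KubernetesCluster
      Value: your-cluster-name      # associates the subnet with the cluster
    - Key: kubernetes.io/role/elb
      Value: ""                     # presence of the tag is what matters
```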

@mumoshu (Contributor, Author) commented Jan 31, 2017

@whereisaaron @c-knowles To make your migration path extra clear, I've added an additional commit with more validations and backward-compatibility with the older config syntax based on my idea in #284 (comment)

Specifically, I guess your (potential) use-case is covered by this test 80885cb#diff-4fd4a4a9a3755708c6909f7f824f5754R207
Would you mind taking a look at it?
Is the migration path clear enough for you? Do I understand your use-case correctly?

// Please forgive me if I'm being too nervous here, but I really don't want to break your existing use-cases!

@whereisaaron (Contributor) commented Jan 31, 2017

Thanks @mumoshu, as I read it that does cover the two use cases and seems pretty clean.

  1. If you specify mapPublicIPs: false and natGateway.preconfigured: true, you must also specify routeTableId: rtb-12345678, and it is expected, and the user's responsibility, that the route table provides a NAT gateway. kube-aws will apply your route table to the subnets but create no NAT gateways.
  2. If you specify mapPublicIPs: true and natGateway.preconfigured: true, you must also specify routeTableId: rtb-12345678, and it is expected, and the user's responsibility, that the route table provides an Internet gateway. kube-aws will apply your route table to the subnets but create no Internet gateways.

Thank you very much for keeping these use cases in play, even though you want to deprecate them 😢. I would have thought the group of kube-aws users who already use AWS and already have a working VPC with NAT would be fairly large, and I'm surprised you plan to drop them.

In the use case where the user has no VPC and kube-aws is creating one, creating the NAT instances for mapPublicIPs: false is a great new feature for the first install. However, mandating that every subsequent install create yet more NAT gateways seems onerous to me.

So I hope you'll consider keeping a use case where the user has an existing AWS VPC environment to deploy clusters into! 🙏 🤞

@mumoshu (Contributor, Author) commented Jan 31, 2017

Thanks as always @whereisaaron 🙇
Sorry for the confusion if I wasn't very clear about it, but I intend to deprecate confusing configuration syntax(es), not your use-case(s)!

For example, I'm wondering if we could deprecate mapPublicIPs: true in favor of an ideal allSubnetsPublic: true, and mapPublicIPs: false in favor of an ideal allSubnetsPrivate: true.
Then, when both of the ideal flags are set to false, we can be sure that a user really wants to mix private and public subnets, and hence add more validation(s) to save the user from making mistakes.

mapPublicIPs as of today can't be used like that, as its default value is true. What I coded in this PR is that mapPublicIPs: true (which is the default) combined with subnets[].private: true results in private subnets! That's just confusing for anyone to explain, not only me, isn't it?
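Spelled out (a sketch), the confusing combination is:

```yaml
# mapPublicIPs defaults to true, yet this subnet still ends up private
# because of the per-subnet `private` flag
mapPublicIPs: true
subnets:
- availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
```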

Once again,

I would have thought the group of kube-aws users who already use AWS and already have a working VPC with NAT would be fairly large, and I'm surprised you plan to drop them.

I don't intend to break that use-case!

@whereisaaron (Contributor) commented Jan 31, 2017

Oh yeah, I totally misunderstood! Sorry @mumoshu! 💩
// DEPRECATED AND REMOVED IN THE FUTURE refers to the syntax rather than the use case. I am certainly happy to adjust to any new syntax that makes the code or the configuration easier, as often as necessary.
Thanks for your efforts!!

@mumoshu (Contributor, Author) commented Feb 1, 2017

@whereisaaron FYI, for simplicity of cluster.yaml, I've dropped natGateway.preconfigured while retaining support for your use-case!
Please read the newly added comments for the subnets: key in the updated cluster.yaml to find your use-case and the appropriate configuration.

@mumoshu (Contributor, Author) commented Feb 1, 2017

Although the important parts are covered by test/integration/maincluster_test.go and tested automatically against the real AWS service (= cfn stack validations), it's impossible for me to test all the use-cases myself.
I'd appreciate it if you could test yours on your own and report back if there's any issue 🙇

@mumoshu mumoshu changed the title WIP: Add support for customization of network topologies Add support for customization of network topologies Feb 1, 2017
@mumoshu mumoshu added this to the v0.9.4-rc.1 milestone Feb 1, 2017
@mumoshu mumoshu merged commit 9556f3d into kubernetes-retired:master Feb 1, 2017
@mumoshu mumoshu deleted the flexible-network-topology branch February 1, 2017 06:52
mumoshu added a commit to mumoshu/kube-aws that referenced this pull request Feb 15, 2017
This is an implementation of kubernetes-retired#238 from @redbaron, especially what I've described in my comment there (kubernetes-retired#238 (comment)), and an answer to the request "**3. Node pools should be more tightly integrated**" of kubernetes-retired#271 from @Sasso.
I believe this also achieves what was requested by @andrejvanderzee in kubernetes-retired#176 (comment).

After applying this change:

1. All the `kube-aws node-pools` sub-commands are dropped
2. You can now bring up a main cluster and one or more node pools at once with `kube-aws up`
3. You can now update all the sub-clusters including a main cluster and node pool(s) by running  `kube-aws update`
4. You can now destroy all the AWS resources spanning main and node pools at once with `kube-aws destroy`
5. You can configure node pools by defining a `worker.nodePools` array in `cluster.yaml`
6. `workerCount` is dropped. Please migrate to `worker.nodePools[].count`
7. `node-pools/` and hence `node-pools/<node pool name>` directories, `cluster.yaml`, `stack-template.json`, `user-data/cloud-config-worker` for each node pool are dropped.
8. A typical local file tree would now look like:
  - `cluster.yaml`
  - `stack-templates/` (generated on `kube-aws render`)
     - `root.json.tmpl`
     - `control-plane.json.tmpl`
     - `node-pool.json.tmpl`
  - `userdata/`
     - `cloud-config-worker`
     - `cloud-config-controller`
     - `cloud-config-etcd`
  - `credentials/`
     - `*.pem` (generated on `kube-aws render`)
     - `*.pem.enc` (generated on `kube-aws validate` or `kube-aws up`)
  - `exported/` (generated on `kube-aws up --export --s3-uri <s3uri>`)
     - `stacks/`
       - `control-plane/`
         - `stack.json`
         - `user-data-controller`
       - `<node pool name = stack name>/`
         - `stack.json`
         - `user-data-worker`
9. A typical object tree in S3 would now look like:
  - `<bucket and directory from s3URI>`/
    - kube-aws/
      - clusters/
        - `<cluster name>`/
          - `exported`/
            - `stacks`/
              - `control-plane/`
                - `stack.json`
                - `cloud-config-controller`
              - `<node pool name = stack name>`/
                - `stack.json`

Implementation details:

Under the hood, kube-aws utilizes CloudFormation nested stacks to delegate management of multiple stacks as a whole.
kube-aws now creates 1 root stack plus nested stacks: 1 main (currently named "control plane") stack and 0 or more node pool stacks.
kube-aws operates on S3 to upload all the assets required by all the stacks(root, main, node pools) and then on CloudFormation to create/update/destroy a root stack.

An example `cluster.yaml` I've been using to test this looks like:

```yaml
clusterName: <your cluster name>
externalDNSName: <your external dns name>
hostedZoneId: <your hosted zone id>
keyName: <your key name>
kmsKeyArn: <your kms key arn>
region: ap-northeast-1
createRecordSet: true
experimental:
  waitSignal:
    enabled: true
subnets:
- name: private1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.1.0/24"
  private: true
- name: private2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.2.0/24"
  private: true
- name: public1
  availabilityZone: ap-northeast-1a
  instanceCIDR: "10.0.3.0/24"
- name: public2
  availabilityZone: ap-northeast-1c
  instanceCIDR: "10.0.4.0/24"
controller:
  subnets:
  - name: public1
  - name: public2
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: public1
  - name: public2
worker:
  nodePools:
  - name: pool1
    subnets:
    - name: asgPublic1a
  - name: pool2
    subnets: # former `worker.subnets` introduced in v0.9.4-rc.1 via kubernetes-retired#284
    - name: asgPublic1c
    instanceType: "c4.large" # former `workerInstanceType` in the top-level
    count: 2 # former `workerCount` in the top-level
    rootVolumeSize: ...
    rootVolumeType: ...
    rootVolumeIOPs: ...
    autoScalingGroup:
      minSize: 0
      maxSize: 10
    waitSignal:
      enabled: true
      maxBatchSize: 2
  - name: spotFleetPublic1a
    subnets:
    - name: public1
    spotFleet:
      targetCapacity: 1
      unitRootVolumeSize: 50
      unitRootvolumeIOPs: 100
      rootVolumeType: gp2
      spotPrice: 0.06
      launchSpecifications:
      - spotPrice: 0.12
        weightedCapacity: 2
        instanceType: m4.xlarge
        rootVolumeType: io1
        rootVolumeIOPs: 200
        rootVolumeSize: 100
```
mumoshu added a commit to mumoshu/kube-aws that referenced this pull request Feb 16, 2017
kylehodgetts pushed a commit to HotelsDotCom/kube-aws that referenced this pull request Mar 27, 2018
kylehodgetts pushed a commit to HotelsDotCom/kube-aws that referenced this pull request Mar 27, 2018

Add support for customization of network topologies
kylehodgetts pushed a commit to HotelsDotCom/kube-aws that referenced this pull request Mar 27, 2018