Proposal: Machine Declaration #773

ehazlett · 2015-03-12T13:13:23Z

We currently use the store to persist machine information (IP, credentials, name, etc.). This mostly works but has some flaws. First, if the infrastructure is modified (machine name, size, IP, etc.) or removed, the store drifts from reality. The store almost has to be assumed to have drifted as soon as we create a machine. Second, we never update the store with the current machine info.

This proposal would bring together a few concepts that have been brought up before: a machine configuration file and discovery-based machine information. It would also have a docker-compose-like feel, and it pulls some ideas from Terraform. Here is the idea:

- Machines are defined in a configuration file (e.g. docker-machine.yml)
  - Implementation detail: we would still allow (machine create -d ec2 ...) but simply update / create the config
- This configuration file would store credentials (or use env vars, etc.) and defaults for that specific group of machines
  - This would allow for multiple environments potentially spanning the same machines (dev, staging, etc.)
- A new command (e.g. apply) would allow machine to apply the configuration (launching any instances needed, etc.)
  - For example, if the configuration states 5 t1.micro instances in ec2 and there are only 4, a new one is created
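
A minimal sketch of the count reconciliation that an apply command would need, under stated assumptions: the `GroupSpec` fields, the `reconcile_count` helper, and the `name_N` naming scheme are all hypothetical and not part of docker-machine. It only computes what to create or remove; it does not talk to any driver.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GroupSpec:
    """Desired state for one group of identical machines (hypothetical schema)."""
    name: str
    driver: str
    instance_type: str
    count: int


def reconcile_count(spec: GroupSpec, running: list[str]) -> dict[str, list[str]]:
    """Return the machine names to create or remove so `running` matches `spec`."""
    # Assumed naming scheme: <group>_1 .. <group>_N.
    wanted = [f"{spec.name}_{i}" for i in range(1, spec.count + 1)]
    running_set = set(running)
    to_create = [name for name in wanted if name not in running_set]
    to_remove = sorted(running_set - set(wanted))
    return {"create": to_create, "remove": to_remove}


# The configuration asks for 5 t1.micro instances, but only 4 are running.
spec = GroupSpec(name="web", driver="ec2", instance_type="t1.micro", count=5)
actions = reconcile_count(spec, ["web_1", "web_2", "web_3", "web_5"])
print(actions)  # {'create': ['web_4'], 'remove': []}
```

Whether the "remove" list is ever acted on depends on the open question of how declarative apply should be.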

I think this would also play well with the idea of Machine Server. It would be nice to see Machine Server create new node(s) if they crash etc. I think a great integration would be where a Swarm node dies and Machine Server automatically launches a new instance and adds it back to the cluster.

Questions / Thoughts

What should the configuration look like?
We would need an identification mechanism for the instances (labeling, etc)

Huge thanks to @gabrtv for the discussion and idea :)


ehazlett · 2015-03-12T15:21:20Z

/cc @sthulb @nathanleclaire @bfirsh

nathanleclaire · 2015-03-12T22:17:10Z

I'm really in favor of this. "Read in a text file, spit out the system in the desired configuration" is a good goal.

However, I think we need to carefully design this before we start implementing. Some concerns I can think of off the top of my head:

- I create a machine based on my file, then go change some settings remotely. Does machine detect that and update the file? Or simply converge the system back to the original state from the file?
- I'm pretty sure that we will need at least two separate files, like how Terraform has tfvars, so that one can be easily kept out of version control.
- Something we need to think about is, we probably don't want everyone to have lots of per-project VMs sprouting up
- How would this tie in with config if we put that in (docker-machine config set Drivers.DigitalOcean.Image docker to globally set the default DO image, for instance)? The file is just the highest priority I suppose? Or if you do an apply it chucks all the other stuff (env vars, config) out the window and always starts from a clean slate?
- Stuff like re-creating failed hosts starts to get into the turf of tools like Mesos. Where are the boundaries and how do we avoid duplicating effort / reinventing the wheel?
- I like Terraform's "plan-before-you-apply" model. Is that included in scope?
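
To make the drift and plan-before-you-apply questions concrete, here is a rough sketch of a plan step. It assumes the file and the observed infrastructure can both be reduced to a mapping of machine name to settings; the `plan` function and the `+`/`-`/`~` output format are illustrative only, loosely modelled on Terraform's plan output. It takes the "converge back to the file" side of the first question and only reports what would change.

```python
from __future__ import annotations


def plan(desired: dict[str, dict], actual: dict[str, dict]) -> list[str]:
    """Describe the changes needed to converge `actual` onto `desired`, without applying them."""
    steps = []
    # Machines in the file that do not exist yet.
    for name in sorted(desired.keys() - actual.keys()):
        steps.append(f"+ create {name}")
    # Machines that exist but are no longer in the file.
    for name in sorted(actual.keys() - desired.keys()):
        steps.append(f"- destroy {name}")
    # Machines present on both sides: report every setting that drifted.
    for name in sorted(desired.keys() & actual.keys()):
        want, have = desired[name], actual[name]
        for key in sorted(want):
            if have.get(key) != want[key]:
                steps.append(f"~ {name}: {key} {have.get(key)!r} -> {want[key]!r}")
    return steps


desired = {"db": {"size": "large"}, "web": {"size": "tiny"}}
actual = {"db": {"size": "tiny"}, "old": {"size": "tiny"}}
for step in plan(desired, actual):
    print(step)
```

The alternative behaviour (updating the file from the remote state) would reuse the same diff, but write `have` into the file instead of emitting a step.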

Like I said, I'm in favor, let's consider carefully before implementing though.

ehazlett · 2015-03-13T00:57:28Z

Absolutely. As titled "proposal" this is for discussion :)

I create a machine based on my file, then go change some settings remotely. Does machine detect that and update the file? Or simply converge the system back to the original state from the file?

This is what we need to discuss. I can see pros and cons for both ways.

I'm pretty sure that we will need at least two separate files, like how Terraform has tfvars, so that one can be easily kept out of version control.

I am leaning this way too.

Something we need to think about is, we probably don't want everyone to have lots of per-project VMs sprouting up

I don't think we should impose that. Who are we to decide how people use it? For example, there is nothing restricting compose users from spinning up lots of containers. I think we should make people very aware of what it is doing without imposing restrictions or decisions on how they design their infrastructure.

I could see a workflow similar to this:

Ops uses docker-machine.yml to configure a staging environment
Dev uses docker-compose.yml to build their stacks on the environment

At this point, the environment is simply a service that dev consumes and ops can control how it is run. I don't necessarily think there would be a docker-machine.yml per project.

How would this tie in with config if we put that in (docker-machine config set Drivers.DigitalOcean.Image docker to globally set the default DO image, for instance)? The file is just the highest priority I suppose? Or if you do an apply it chucks all the other stuff (env vars, config) out the window and always starts from a clean slate?

I am still not convinced of the "global config file". I can see the advantage of having a single place for all of the things but, as with git, most of us use localized settings and I'm not sure about having several configs in place. I like simplicity, and a "staging" environment that carries all of its own definition is very appealing. What I didn't like about config mgmt. systems was all of the inheritance spread throughout.

Stuff like re-creating failed hosts starts to get into the turf of tools like Mesos. Where are the boundaries and how do we avoid duplicating effort / reinventing the wheel?

I don't think so. Ensuring a host is up, or that action is taken upon failure, would be a huge benefit for machine server and could actually work in tandem with projects like Mesos. For example, instead of some config mgmt tool or a vendor-provided one, you could use machine server to ensure 5 nodes are always up. Mesos would then ensure that the containers that are supposed to be on those nodes are there.

I like Terraform's "plan-before-you-apply" model. Is that included in scope?

Absolutely :)

hairyhenderson · 2015-03-13T01:28:26Z

+1

This makes a lot of sense - it would be nice in a UI sense for docker-machine and docker-compose to have more parallels. For people just starting to use and try to understand the Docker tools, it would probably be a huge help to have this kind of parity.

A few random thoughts:

- Why apply and not up? docker-machine up feels more natural if taking cues from docker-compose
- Having some way to read separate files for secret things definitely makes sense, I think referencing external files makes the most sense (something like: external_config: my_secret_file), so that docker-machine doesn't have to be told about multiple different files, and so that I can share files that other systems might use or control
- It would be useful to be able to set the driver on a per-host basis (e.g. I want 10 hosts in rackspace, 10 hosts in ec2, and 10 hosts in softlayer)
- It would also be useful to be able to set a driver as default for all hosts defined in the file
- In the case where there's an existing set of hosts and I change something in the .yml (like swap out a t1.micro for a t1.small), I think either I should have to add a --yes-i-really-want-to-destroy-a-vm-and-bring-up-a-new-one flag, or there should be a separate command altogether.
- Should there be some extra metadata telling me which config a host listed in docker-machine ls came from?
- It'd be neat if I could set a region attribute to an array of different regions, and machine would spread my instances across each region (i.e. I set instances: 9 and softlayer-region: [ tor01, dal05, sjc01 ], and end up with 3 hosts in each)
- Hosts should be brought up in parallel as much as possible, except that swarm nodes should only come up after their master is available
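
The region-spreading idea can be sketched in a few lines. The `spread` helper below is hypothetical, and handing any remainder to the earliest regions in the list is an assumption, since the example above only covers the evenly divisible case.

```python
from __future__ import annotations


def spread(instances: int, regions: list[str]) -> dict[str, int]:
    """Split `instances` across `regions` as evenly as possible."""
    if not regions:
        raise ValueError("at least one region is required")
    base, extra = divmod(instances, len(regions))
    # Assumption: the first `extra` regions each take one additional host.
    return {region: base + (1 if i < extra else 0) for i, region in enumerate(regions)}


print(spread(9, ["tor01", "dal05", "sjc01"]))
# {'tor01': 3, 'dal05': 3, 'sjc01': 3}
```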

I'm thinking a file might look like this:

# docker-machine.yml
osswarmmaster:
  driver: openstack
  openstack-flavor-name: tiny
  openstack-image-name: Ubuntu 14.04 LTS
  openstack-floatingip-pool: myfloatingips
  swarm-master: true
  swarm-discovery: token://1234
myawesomevm:
  driver: openstack
  openstack-flavor-name: large
  openstack-image-name: Ubuntu 14.04 LTS
  openstack-floatingip-pool: myfloatingips
  instances: 4
  swarm-discovery: token://1234
slbigbox:
  external_file: softlayer-secrets.yml
  driver: softlayer
  softlayer-cpu: 4
  softlayer-disk-size: 100
  softlayer-memory: 8192
  softlayer-region: [ tor01, dal05, sjc01 ]
  instances: 15

# softlayer-secrets.yml
softlayer-user: fred
softlayer-api-key: 1234-5678-9012

The hosts resulting from this could then be named something like:

osswarmmaster
myawesomevm_1, myawesomevm_2, etc...
slbigbox_1, slbigbox_2, etc...
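
As a sketch of the naming rule above: a group without an instances key keeps its bare name, and a group with one gets numbered suffixes. The `expand_names` helper is hypothetical, and the config is written as a plain dict standing in for the parsed YAML so the example has no third-party dependency.

```python
from __future__ import annotations


def expand_names(config: dict[str, dict]) -> list[str]:
    """Expand each group in the config into the host names it would produce."""
    names = []
    for group, options in config.items():
        count = options.get("instances")
        if count is None:
            # A single host keeps the group name as-is.
            names.append(group)
        else:
            names.extend(f"{group}_{i}" for i in range(1, count + 1))
    return names


# Trimmed-down stand-in for the parsed docker-machine.yml.
config = {
    "osswarmmaster": {"driver": "openstack", "swarm-master": True},
    "myawesomevm": {"driver": "openstack", "instances": 4},
}
print(expand_names(config))
# ['osswarmmaster', 'myawesomevm_1', 'myawesomevm_2', 'myawesomevm_3', 'myawesomevm_4']
```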

ehazlett · 2015-03-13T15:46:13Z

@hairyhenderson great feedback! thanks!

Why apply and not up? docker-machine up feels more natural if taking cues from docker-compose

I'm not set on the command names -- I think apply makes sense if we make it declarative: for example, if there are 6 instances with the identifying tag but the definition calls for 5, we remove one to match the definition. However, if we just operate like compose does (it will ignore additional containers, I believe) then up would make sense too.

It would also be useful to be able to set a driver as default for all hosts defined in the file

+1

In the case where there's an existing set of hosts and I change something in the .yml (like swap out a t1.micro for a t1.small), I think either I should have to add a --yes-i-really-want-to-destroy-a-vm-and-bring-up-a-new-one flag, or there should be a separate command altogether.

Yeah, I'm not sure how we would handle this. Perhaps the driver would have to support a "Modify" operation that would do some rolling modification. In the case of EC2, it would simply stop the instance and change the type (assuming we use EBS, which we currently do). However, not all drivers support this so we would have to figure those out. We could also take a cue from Terraform and support in-place modification for the changes that allow it, with a create/destroy routine for those that don't.
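
One way to picture the in-place versus create/destroy decision is a per-driver capability table. Everything in this sketch is an assumption: the `IN_PLACE` table, the setting names, and the `change_strategy` helper are illustrative, and no docker-machine driver exposes such a list today.

```python
from __future__ import annotations

# Hypothetical table: settings each driver could change without replacing the host.
IN_PLACE = {
    "ec2": {"instance_type"},  # stop the EBS-backed instance, change type, start again
}


def change_strategy(driver: str, changed_keys: set[str]) -> str:
    """Pick 'none', 'modify' (in place) or 'recreate' (destroy, then create)."""
    if not changed_keys:
        return "none"
    supported = IN_PLACE.get(driver, set())
    # Every changed setting must be modifiable in place, otherwise replace the host.
    return "modify" if changed_keys <= supported else "recreate"


print(change_strategy("ec2", {"instance_type"}))   # modify
print(change_strategy("softlayer", {"cpu"}))       # recreate
```

A "recreate" result is where a confirmation flag or a separate command would come into play.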

Should there be some extra metadata telling me which config a host listed in docker-machine ls came from?

I'm leaning towards machine only using a single config to show what that environment looks like. We could also have a --config option or similar to specify certain ones (like compose).

It'd be neat if I could set a region attribute to an array of different regions, and machine would spread my instances across each region (i.e. I set instances: 9 and softlayer-region: [ tor01, dal05, sjc01 ], and end up with 3 hosts in each)

Absolutely!!

Hosts should be brought up in parallel as much as possible, except that swarm nodes should only come up after their master is available

+1. Actually, swarm nodes do not need their master to be available -- you can start them all together and when the master is up, it will query the discovery service for what nodes are members.
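
Since nothing has to wait for the master, parallel bring-up can be sketched as a plain fan-out. `create_machine` here is a stand-in that only returns a string; a real implementation would call the driver. Both function names are hypothetical.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor


def create_machine(name: str) -> str:
    """Stand-in for a driver call; a real one would provision the host."""
    return f"{name}: created"


def create_all(names: list[str], max_workers: int = 4) -> list[str]:
    """Create every host concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(create_machine, names))


# Master and nodes are submitted together; no ordering constraint is applied.
results = create_all(["osswarmmaster", "myawesomevm_1", "myawesomevm_2"])
print(results)
```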

thaJeztah · 2015-03-13T19:27:05Z

I'm pretty sure that we will need at least two separate files, like how Terraform has tfvars, so that one can be easily kept out of version control.

+1. In future this could be extended to allow other storage mechanisms than a file.

FYI: the proposal in docker-compose is leaning toward having two separate files as well: docker/compose#846 (a "definition" and a "configuration" file)

sthulb · 2015-03-16T10:16:30Z

I like the concept.

Areas of interest

- What happens if a user updates their file, e.g. changes the size of the VM? Do we perform a migration?
- How do we handle this in a client/server model? Do we store these files on the server and sync them back to the user?
- How do we handle failure?

There are probably a few more behaviour issues to work out.

ehazlett · 2015-03-17T15:31:11Z

@thaJeztah cool thx!

sthulb · 2015-04-08T15:18:22Z

@ehazlett Can we make this actually support compose syntax? So people can get swarms/machines up running containers?

ehazlett · 2015-04-08T15:32:51Z

@sthulb I would love to see that :) I think it would also be a good integration with compose.

errordeveloper · 2015-04-28T17:00:40Z

👍

ghost · 2015-07-10T19:39:08Z

Something I would be concerned about with a declarative file is the handling of sensitive information. Presumably someone will want or need to check their config into source control, and I've seen too many horror stories of people being charged hundreds of dollars due to bots constantly scanning GitHub and other sites for keys. Possible solutions include taking the key from an environment variable, prompting for the key, or keeping the key in an encrypted file (e.g. Ansible Vault) that is unlocked with a password, either prompted for or taken from an environment variable.

hairyhenderson · 2015-07-10T20:54:29Z

@usertaken - very good point. If we take a cue from docker-compose.yml, then we could use an env_file property, which enables users to keep secrets in files but out of source control. Obviously, that adds an extra step in CI builds since secrets need to be written to temporary files, then deleted from those files.

thaJeztah · 2015-07-10T21:16:08Z

Handling of secrets still is a hot topic. If an env-file is supported, tools such as HashiCorp Vault, Keywhiz or Sneaker could be useful.

Also, I asked the Docker security maintainers to write up their thoughts / recommendations here: moby/moby#13490

nathanleclaire · 2015-07-10T22:57:36Z

To be clear, in terms of secrets such as API tokens which might be needed in such a docker-machine.yml file, I would like to support either inheriting them from the environment or keeping them in some other secondary "var" file which is deliberately meant to be kept out of version control. Either way, we should actively discourage having them in whichever file is meant explicitly to be checked into version control.
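
A small sketch of that lookup, under stated assumptions: the `resolve_secret` helper, the mapping from a key such as softlayer-api-key to an environment variable named SOFTLAYER_API_KEY, and the choice to let the environment win over the var file are all hypothetical.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve_secret(key: str, var_file_values: Mapping[str, str],
                   environ: Mapping[str, str] | None = None) -> str:
    """Look up a secret in the environment first, then in the untracked var file."""
    env = os.environ if environ is None else environ
    # Assumed naming rule: softlayer-api-key -> SOFTLAYER_API_KEY.
    env_name = key.upper().replace("-", "_")
    if env_name in env:
        return env[env_name]
    if key in var_file_values:
        return var_file_values[key]
    raise KeyError(f"secret {key!r} not found in environment or var file")


var_file = {"softlayer-api-key": "from-file"}
print(resolve_secret("softlayer-api-key", var_file, environ={}))  # from-file
```

Nothing here reads the version-controlled file, which matches the goal of discouraging secrets there.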

kacole2 · 2015-09-29T17:45:20Z

bringing this back from the dead (July 10th was last response)

i really like this concept:

Ops uses docker-machine.yml to configure a staging environment
Dev uses docker-compose.yml to build their stacks on the environment

I would also like to see a number somewhere in these descriptor files. Like I want 10 of type X and 5 of type Y. The reason is that the underlying infrastructure may need to be tailored to the apps, networking, or storage access. As @ehazlett said before, let's not limit what a user wants to do.

I hope that my PR #1881 shows a working concept of using additional configuration options. Would like to see that functionality added down the road.

nathanleclaire · 2015-09-29T18:19:53Z

@kacole2 I agree that something like count: 5 would be useful and have made moves to support it in the past with flags like --n-instances (never successfully merged), so I'd like to add something of the sort if we implement functionality like this.

Likewise, hopefully the new driver plugin model will also open the doors to extensible functionality in other areas.

kacole2 · 2015-09-30T19:26:30Z

@nathanleclaire where can i learn more about the plugin model? Assuming #1626? would like to contribute where possible to make it a reality.

nathanleclaire · 2015-09-30T20:09:08Z

@kacole2 Yep, that's the proposal, and #1902 is the PR

krasi-georgiev · 2015-12-09T13:14:32Z

is this on hold ?

nathanleclaire · 2015-12-09T19:56:41Z

@vipconsult Sort of. @kunalkushwaha has a POC here: #2422 and we're talking about possibly trying to implement it for 0.6.0 (January), but we can't make any promises -- it's a very big thing to commit to implementing such a feature, and we would need to get feedback from a variety of other Docker teams (for instance, is this encroaching on Compose territory?) and users before making moves.

schmunk42 · 2016-02-28T14:29:33Z

I'd really like to see this.
Why don't you split this into a separate project like docker/docker-compose?

For reference: https://github.com/efrecon/machinery

krasi-georgiev · 2016-02-28T14:54:41Z

good idea

kunalkushwaha · 2016-03-10T02:51:02Z

I think a separate project would be more of a wrapper around docker-machine and compose.
I think a better way would be to add a few features to libcompose, like docker/libcompose#157,
and integrate them with machine.

joelhandwell · 2017-04-09T10:47:34Z

If implementing this feature in a different project makes sense, could implementing it as a Terraform plugin (a docker_machine resource and/or a docker_machine provisioner), or adopting HCL as the configuration language for docker-machine, be considered? I think re-implementing all of Terraform for docker-machine is too wide a scope. And if we go with plain YAML, people will start to complain about the lack of string interpolation, which HCL already implements. Say AWS launches a new EC2 feature and both the Terraform development team and the docker-machine development team work on adopting it. That is nothing but duplicated effort and a waste of open source development resources. If docker-machine project members are freed from cloning a portion of Terraform or HCL, they can focus on swarm compatibility or swarm integration, which are the Docker things. AWS adds around 1000 new features per year and the pace is accelerating every year. The Terraform community has so far kept up with this pace by adopting those features as soon as they are released. Given current development activity, we might need to consider whether keeping up with this pace is a realistic goal for the docker-machine community.

Code for launching 16 Docker Swarm nodes as EC2 instances could look like this:
docker_host.hcl

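# EC2 instances that will host the swarm nodes.
resource "aws_instance" "docker_host" {
  count = 16
  ami = "${data.aws_ami.docker_host.id}"
  instance_type = "t2.medium"
  vpc_security_group_ids = [
    "${aws_security_group.docker_host.id}"
  ]
}

# Proposed docker_machine resource (does not exist yet), one per instance.
resource "docker_machine" "docker_host" {
  count = 16
  swarm = true
  # The first three hosts act as swarm masters.
  swarm-master = "${count.index < 3 ? true : false}"
  # Pair each machine with the instance at the same index.
  aws_instance_id = "${element(aws_instance.docker_host.*.id, count.index)}"
  ssh_key = "${var.docker_host.ssh_key}"
}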

And the command could be terraform apply or docker-machine create --config docker_host.hcl

I hope the Unix philosophy also applies to docker-machine: do the Docker thing, and do it well.

hairyhenderson · 2017-04-09T16:49:27Z

@joelhandwell I totally agree - would love to see this sort of thing. There's some definite crossover with projects like Docker for AWS/Azure/GCP, and Infrakit (see especially https://github.com/docker/infrakit/tree/master/examples/instance/terraform).

To be honest, one of the reasons that I haven't spent much time with Docker Machine lately is because Docker for AWS meets my needs much better. I've been using Terraform to apply the D4AWS CloudFormation template mostly. ¯\_(ツ)_/¯

nathanleclaire mentioned this issue Mar 16, 2015

First attempt at docker-machine share #793

Closed

This was referenced Mar 30, 2015

hooked a PostCreateCheck for checking the configuration setting #892

Closed

Add more store for machine #913

Closed

nathanleclaire mentioned this issue Apr 7, 2015

Proposal: Support configuration of created Docker Engines from Machine CLI #974

Closed

ehazlett mentioned this issue Apr 10, 2015

Machine should manage credentials for drivers #987

Closed

nathanleclaire added the kind/enhancement label Jul 6, 2015

nathanleclaire mentioned this issue Sep 21, 2015

add 3rd party docker extensions #1881

Closed

hairyhenderson mentioned this issue Oct 22, 2015

Discussion: Have some docker-machine commands default to "default" VM? #1783

Closed

nathanleclaire mentioned this issue Nov 19, 2015

[Feature request] Install Swarm on already running machine #2303

Open

This was referenced Nov 26, 2015

(WIP) Machine declaration #2422

Open

Proposal: Alter configuration of created machine #2440

Closed

dgageot added kind/enhancement and removed kind/enhancement labels Jan 29, 2016

joelhandwell mentioned this issue Jan 28, 2017

Config file for docker-machine create #1802

Open

joelhandwell added a commit to joelhandwell/dockerhost that referenced this issue Apr 10, 2017

Mention docker/machine#1709 and docker/machine#773

b51acf2
