This repository has been archived by the owner on Sep 26, 2021. It is now read-only.

Machine create fails with latest Docker #4156

Closed
dminkovsky opened this issue Jun 29, 2017 · 46 comments

Comments

@dminkovsky

dminkovsky commented Jun 29, 2017

Hello

docker-machine version 0.12.0, build 45c69ad

docker-machine create fails now:

docker-machine -D create \
    --driver google \
    --google-project project \
    --google-zone us-east1-d \
    --google-machine-type n1-standard-1 \
    --google-disk-size 20 \
    --google-preemptible \
    build-vm2

The machine is created and Docker is installed, but the daemon won't start. The problem appears to be related to a new version of Docker being installed by a new version of the install script at https://get.docker.com. My installs went from 17.05.0-ce to 17.06.0-ce, and with that change, Docker installs but does not start.

Jun 29 00:50:08 build-vm2 docker[5705]: `docker daemon` is not supported on Linux. Please run `dockerd` directly

or

Jun 29 00:56:12 build-vm2 dockerd[6407]: Error starting daemon: error initializing graphdriver: driver not supported

Unless I change:

/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=google

to

/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=google

in /etc/systemd/system/docker.service.d/10-machine.conf.
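
The manual edit described above can be scripted. The following is a minimal sketch, assuming the drop-in has the same shape as the command line quoted in this post; `fix_machine_conf` is a hypothetical helper name, not something docker-machine provides.

```shell
#!/bin/sh
# Sketch only: rewrite a docker-machine systemd drop-in so that it calls
# dockerd directly and no longer forces the aufs storage driver.
# fix_machine_conf is a hypothetical helper, not part of docker-machine.
fix_machine_conf() {
    conf="$1"
    # Write to a temporary file first so the original is only replaced
    # when sed succeeds.
    tmp="$(mktemp)" || return 1
    sed -e 's|/usr/bin/docker daemon|/usr/bin/dockerd|' \
        -e 's| --storage-driver aufs||' \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example usage on a provisioned host (requires root):
#   fix_machine_conf /etc/systemd/system/docker.service.d/10-machine.conf
#   systemctl daemon-reload && systemctl restart docker
```

Note that docker-machine may regenerate the drop-in on later provisioning commands, so this is a stopgap rather than a fix.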

dminkovsky changed the title from "Machine create fails due to Docker update" to "Machine create fails with latest Docker" on Jun 29, 2017

@thanhhx-zinza

Same problem here

docker-machine create \
    --driver=digitalocean \
    --digitalocean-access-token=XXX \
    --digitalocean-size=2gb \
    machinename

Yesterday the same command worked fine with Docker version 17.05.0-ce.
Today Docker won't start on my new machine (17.06.0-ce).
I've tried multiple times.

@gnomus

gnomus commented Jun 29, 2017

I can confirm this too:

dm create -d digitalocean \
--digitalocean-access-token XXX \
--digitalocean-size 4gb machine

@therealppa

therealppa commented Jun 29, 2017

I'm using this as a workaround:

docker-machine create \
    --driver amazonec2 \
    --engine-install-url=https://web.archive.org/web/20170623081500/https://get.docker.com

or
--engine-install-url=https://releases.rancher.com/install-docker/17.05.sh

@acouette

I have the same issue.

docker version : Docker version 17.06.0-ce
docker-machine version : 0.12.0, build 45c69ad

docker-machine create --driver amazonec2 --amazonec2-region eu-west-1 --amazonec2-instance-type t2.small --amazonec2-access-key XXX --amazonec2-secret-key XXX test-create-machine

Jun 29 12:26:56 ip-172-31-10-149 systemd[1]: Starting Docker Application Container Engine...
Jun 29 12:26:56 ip-172-31-10-149 docker[5234]: `docker daemon` is not supported on Linux. Please run `dockerd` directly

`docker daemon` is not supported on Linux. Please run `dockerd` directly

@gnomus

gnomus commented Jun 29, 2017

I was able to get it working with PR #4128.

Just compile docker-machine with this fix and everything works again.

@dminkovsky
Author

@gnomus super, that's interesting! I wonder why it was working for 17.05.0-ce, though.

@therealppa haha, awesome! I was wondering how I might get the old version of that script, or whether the live script takes params to install an older version. web.archive.org definitely did not occur to me.

@therealppa

@dminkovsky I don't think it will work forever. If you look into the script, it doesn't actually specify the version anywhere. Still, right now it works.

@kurrestahlberg

@therealppa @dminkovsky A longer-term fix is to change line 457 of the script from

$sh_c 'apt-get install -y -q docker-ce'

to

$sh_c "apt-get install -y -q docker-ce=17.05.0~ce-0~$lsb_dist-$dist_version"

Hopefully the fixed version of docker-machine is released soon.
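
To make the pinned package spec above concrete, here is a minimal sketch that builds the same string from its parts. `pinned_docker_pkg` is a hypothetical helper, and the version format is simply copied from the line quoted in this comment; it is not checked against any package index.

```shell
#!/bin/sh
# Sketch only: build the apt package spec used in the pinned install line.
# pinned_docker_pkg is a hypothetical helper; the format
# <version>~ce-0~<lsb_dist>-<dist_version> is taken from the comment above.
pinned_docker_pkg() {
    version="$1"       # e.g. 17.05.0
    lsb_dist="$2"      # e.g. ubuntu
    dist_version="$3"  # e.g. xenial
    printf 'docker-ce=%s~ce-0~%s-%s\n' "$version" "$lsb_dist" "$dist_version"
}

# Example usage:
#   apt-get install -y -q "$(pinned_docker_pkg 17.05.0 ubuntu xenial)"
```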

@fabio-barile

Same for me.
We got it working by using "dockerd" instead of "docker daemon" in the file /etc/systemd/system/docker.service.d/10-machine.conf

@dminkovsky
Author

@fabio-barile what about the --storage-driver aufs arg? Mine wouldn't start unless I got rid of that, too.

@JustEra

JustEra commented Jun 29, 2017

@dminkovsky I had the same problem on an autoscaling CI setup with GitLab: I hit both the aufs problem and the dockerd problem, and had to solve it by specifying overlay as the storage driver.

@lusitania

lusitania commented Jun 29, 2017

Beyond the storage driver issue I'm also seeing verification errors for certificates created by gitlab-runner (9.3.0). @JustEra have you been running into the same issue or am I the only one?

http: TLS handshake error from ...:
 tls:
  failed to verify client's certificate: x509:
   certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "unknown")
ERROR: Error creating machine:
 Error checking the host:
  Error checking and/or regenerating the certs:
   There was an error validating certificates for host "...":
    remote error: tls: bad certificate  driver=amazonec2 name=...

@aleks-m

aleks-m commented Jun 29, 2017

This fixed the storage-driver issue for me (it just removes that parameter, for systemd ONLY). Apply on top of #4128 and re-build:

diff --git a/libmachine/provision/systemd.go b/libmachine/provision/systemd.go
index 90d02603..05d63bb5 100644
--- a/libmachine/provision/systemd.go
+++ b/libmachine/provision/systemd.go
@@ -53,7 +53,7 @@ func (p *SystemdProvisioner) GenerateDockerOptions(dockerPort int) (*DockerOptio

        engineConfigTmpl := `[Service]
 ExecStart=
-ExecStart=/usr/bin/` + arg + ` -H tcp://0.0.0.0:{{.DockerPort}} -H unix:///var/run/docker.sock --storage-driver {{.EngineOptions.StorageDriver}} --tlsverify --tlscacert {{.AuthOptions.CaCertRemotePath}} --tlscert {{.AuthOptions.ServerCertRemotePath}} --tlskey {{.AuthOptions.ServerKeyRemotePath}} {{ range .EngineOptions.Labels }}--label {{.}} {{ end }}{{ range .EngineOptions.InsecureRegistry }}--insecure-registry {{.}} {{ end }}{{ range .EngineOptions.RegistryMirror }}--registry-mirror {{.}} {{ end }}{{ range .EngineOptions.ArbitraryFlags }}--{{.}} {{ end }}
+ExecStart=/usr/bin/` + arg + ` -H tcp://0.0.0.0:{{.DockerPort}} -H unix:///var/run/docker.sock --tlsverify --tlscacert {{.AuthOptions.CaCertRemotePath}} --tlscert {{.AuthOptions.ServerCertRemotePath}} --tlskey {{.AuthOptions.ServerKeyRemotePath}} {{ range .EngineOptions.Labels }}--label {{.}} {{ end }}{{ range .EngineOptions.InsecureRegistry }}--insecure-registry {{.}} {{ end }}{{ range .EngineOptions.RegistryMirror }}--registry-mirror {{.}} {{ end }}{{ range .EngineOptions.ArbitraryFlags }}--{{.}} {{ end }}

@vincent99

For anyone who wants a specific older version, we (Rancher) maintain slightly modified get.docker.com scripts to install each one:

http://rancher.com/docs/rancher/v1.6/en/hosts/#supported-docker-versions

@narration-sd

narration-sd commented Jun 30, 2017

@fabio-barile above is entirely correct. How 'testing' lets such things get released, I can't imagine.

More information here: docker/for-linux#11 (comment)

@vincent99 ...always like the sound of you guys, and thanks.

@systemmonkey42

+1
I check back every day for a new docker-machine release... This bug is killing me :-)

For now, I add /etc/systemd/system/docker.service.d/20-machine.conf, which overrides 10-machine.conf with the correct command line. That way, further docker-machine commands that would normally break it don't. Of course, the longer it takes for this to be fixed in a release, the more work I have putting everything back!
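
A minimal sketch of such an override follows, assuming the certificate paths and port from the corrected command line quoted earlier in the thread. `write_machine_override` is a hypothetical helper. systemd applies drop-ins in lexical order, and an empty `ExecStart=` clears the value set by earlier files, so 20-machine.conf takes precedence over 10-machine.conf.

```shell
#!/bin/sh
# Sketch only: write a 20-machine.conf drop-in that replaces the ExecStart
# line generated in 10-machine.conf.
# write_machine_override is a hypothetical helper; the flags are copied
# from the corrected command earlier in the thread, minus the
# provider-specific --label.
write_machine_override() {
    dropin_dir="$1"
    cat > "$dropin_dir/20-machine.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem
EOF
}

# Example usage on a provisioned host (requires root):
#   write_machine_override /etc/systemd/system/docker.service.d
#   systemctl daemon-reload && systemctl restart docker
```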

@FrenchBen
Contributor

FrenchBen commented Jun 30, 2017

Thanks for the great breakdown of details on the issue - We're looking into it to try and figure out what went wrong.

related to docker/for-linux#11 (comment)

@seemethere
Contributor

So this is not related to the install script at get.docker.com, but rather to the version comparison not working correctly. 17.06.0-ce is the first release to officially deprecate docker daemon, which is why we are seeing failures now.

This PR (#4128) seems to remedy this issue and I'll have a PR up by late afternoon that adds tests for the other comparison functions so that we don't run into something like this again.
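
To illustrate the kind of check involved, here is a minimal shell sketch of choosing the daemon command from a version string. It is not the docker-machine code (which is written in Go). `version_ge` and `daemon_command` are hypothetical helpers, the 1.12.0 threshold is an assumption about when `dockerd` became available, and `sort -V` requires a `sort` with version-sort support such as GNU coreutils.

```shell
#!/bin/sh
# Sketch only: pick the daemon command based on the Docker version.

# version_ge A B: succeeds when version A is greater than or equal to B.
# Suffixes such as "-ce" are dropped before comparing.
version_ge() {
    a="${1%%-*}"
    b="${2%%-*}"
    # With a version-aware sort, the smaller version comes first.
    [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$b" ]
}

# daemon_command VERSION: prints the command that should start the daemon.
# The 1.12.0 cutoff is an assumption made for this illustration.
daemon_command() {
    if version_ge "$1" "1.12.0"; then
        echo "dockerd"
    else
        echo "docker daemon"
    fi
}

# Example usage:
#   daemon_command 17.06.0-ce   # prints "dockerd"
#   daemon_command 1.11.2       # prints "docker daemon"
```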

seemethere added a commit to seemethere/machine that referenced this issue on Jun 30, 2017:
Relates to issues raised in docker#4156
Signed-off-by: Eli Uriegas <seemethere101@gmail.com>

seemethere added a commit to seemethere/machine that referenced this issue on Jun 30, 2017:
Relates to issues raised in docker#4156
Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>

@narration-sd

@seemethere Sounds good, thanks. Like to hear about the test.

The diff on one of the PRs appeared a little odd to me, but think you guys will have taken care of that.

@shin-
Contributor

shin- commented Jul 7, 2017

@dminkovsky Can you create a new issue for this? It's unrelated to the dockerd/docker daemon issue, so we should treat it separately as well. And please indicate what OS you're provisioning as well :)

@dminkovsky
Author

@shin- i'm all good. docker-machine is working 100% right now for me. are you referring to the overlay2 thing?

@drujensen

My other issue regarding removing machines was addressed in pr #4187. Thx.

@shin-
Contributor

shin- commented Jul 10, 2017

@dminkovsky Sorry - yes, the one you mention here

@brandontamm

@shin- After experiencing the issue in #4168, I attempted to re-create my staging server and found a slew of issues with docker-machine create that have been reported in multiple recent tickets:

Are these all related? Should we start tracking them here? I can confirm that this issue is still happening today.

@costa

costa commented Jul 19, 2017

@shin- docker-machine v0.12.1 exhibits the same issue still

@eamontaaffe

I'm still getting the same issue with version 0.12.1.

[Screenshot: 2017-07-27, 11:32 AM]

@FrenchBen
Contributor

Please update to the latest release found on github:
https://github.com/docker/machine/releases/tag/v0.12.2

@eamontaaffe @Ajwah @costa

@oleynikd

oleynikd commented Aug 2, 2017

Thank you @dminkovsky, I was getting this error on 0.12.2 today as well! It seems the 10-machine.conf file does not get overridden during the update.

@dminkovsky
Author

dminkovsky commented Aug 2, 2017 via email

@cadavre

cadavre commented Aug 2, 2017

If using systems with kernel >4.4 I suggest using overlay2.

@dminkovsky
Author

dminkovsky commented Aug 2, 2017 via email

@datesss

datesss commented Aug 3, 2017

Also getting this error on 0.12.2 :-(

@kassanmoor

This is still open!

@AkhilGNair

I still see this issue with docker-machine 0.12.2. I moved forward by uninstalling Docker on the provisioned machine (sudo apt purge docker-ce && sudo apt autoremove) and using the correct Rancher install script for my version, as listed above.

For some reason, this still fails to start docker, but rebooting the machine then solves it.

@jhartma

jhartma commented Sep 23, 2017

Can confirm, still the same error

@kassanmoor

@jhartma I guess it is necessary to upgrade to the latest release (Linux image), and then it works.

@jhartma

jhartma commented Sep 29, 2017

@kassanmoor It seems my AMI didn't support it on AWS; I got it to work with the default one.
