Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[master] Do not remove the supervisor container before every start #544

Merged
merged 2 commits into from
Feb 1, 2017

Conversation

pcarranzav
Copy link
Contributor

[Untested, this is my first attempt]
Instead, we add a start-resin-supervisor script that checks
whether the existing container's image matches the one specified
in supervisor.conf, and just starts that container in that case.
It falls back to the docker run command that creates and starts
the supervisor container.

We also add the container removal in the update-resin-supervisor script,
after stopping the resin-supervisor service.

Closes #184
Also connects to #280 as it should make the issue appear way less often.

Signed-off-by: Pablo Carranza Velez pablo@resin.io

SUPERVISOR_CONTAINER_IMAGE_ID=$(docker inspect resin_supervisor | jq '.[0].Image')

if [ "$SUPERVISOR_IMAGE_ID" -eq "$SUPERVISOR_CONTAINER_IMAGE_ID" ]; then
docker start resin_supervisor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like a fallback to removing the container if starting it fails - it can help with auto-recovery in some cases (I think that's why we initially switched to always removing)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -133,6 +133,7 @@ fi
# Try to stop old supervisor to prevent it deleting the intermediate images while downloading the new one
echo "Stop supervisor..."
systemctl stop resin-supervisor
$DOCKER rm --force resin_supervisor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be moved to after the successful pull, so that we don't remove the container if the pull fails

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, but I also moved the supervisor.conf change so that it is only changed after a successful pull


source /usr/sbin/resin-vars

SUPERVISOR_IMAGE_ID=$(docker inspect $SUPERVISOR_IMAGE | jq '.[0].Id')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worth using the --format option, although I'm not certain what version of docker it was introduced in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, done.
(It is indeed supported in 1.10.3)

@@ -46,6 +47,7 @@ RDEPENDS_${PN} = " \
systemd \
curl \
resin-device-uuid \
jq \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this RDEPENDS added? I do not see any jq invocation in the start script.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had added it because I used it in the start-resin-supervisor script, but it's not necessary anymore. Removed.

@@ -0,0 +1,34 @@
#!/bin/bash

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be useful to set a error out flag on exit code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

SUPERVISOR_CONTAINER_IMAGE_ID=$(docker inspect --format='{{.Image}}' resin_supervisor)

runSupervisor() {
docker rm --force resin_supervisor || true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we try to start it afterwards if it cannot remove the old supervisor? Should we error out if we need to start a new supervisor but we can't remove the current container?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The container might not be there at all, that's why (and it's how the service used to work, as it had a - in the ExecStartPre)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also,if the removal failed and there is actually a container there (which would only happen with a docker bug or some other weird corruption), then the docker run command that follows will fail anyways (with the good old "name already taken" error)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha.

@@ -0,0 +1,34 @@
#!/bin/bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need bash for this script. POSIX should should be enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@agherzan
Copy link
Contributor

agherzan commented Jan 17, 2017

3 more comments:

  1. This should be split in 2 commits (one for moving the start from service to script and one for update script changes) as they are functionally two different changes
  2. Commit logs need to be tweaked as we have for all the other commits (prepend component: )
  3. Add changelog entry

@pcarranzav pcarranzav force-pushed the 184-dont-rm-supervisor branch 3 times, most recently from 5e45953 to 79877f4 Compare January 17, 2017 16:43
Copy link
Contributor

@agherzan agherzan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but needs testing

@floion
Copy link
Collaborator

floion commented Jan 19, 2017

Something related was done as some point: #281
Maybe is worth revisiting

@pcarranzav pcarranzav force-pushed the 184-dont-rm-supervisor branch 2 times, most recently from 7c74ad4 to 15ddf28 Compare January 31, 2017 19:11
@agherzan agherzan changed the title Do not remove the supervisor container before every start [master] Do not remove the supervisor container before every start Feb 1, 2017
Pablo Carranza Velez added 2 commits February 1, 2017 13:18
…in every start

We add a start-resin-supervisor script that checks
whether the existing container's image matches the one specified
in supervisor.conf, and just starts that container in that case.
It falls back to the docker run command that creates and starts
the supervisor container.

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
…update script

Also, only write /tmp/supervisor.conf after the image has been successfully pulled and
the old container has been removed.

Signed-off-by: Pablo Carranza Velez <pablo@resin.io>
Copy link
Contributor

@agherzan agherzan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased over current HEAD>

@agherzan agherzan merged commit 76862c7 into master Feb 1, 2017
@agherzan agherzan deleted the 184-dont-rm-supervisor branch February 1, 2017 12:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants