
Missing Depends_ON functionality during start process within docker-swarm #31333

Open
ozlatkov opened this issue Feb 24, 2017 · 47 comments

Labels
area/stack area/swarm kind/enhancement Enhancements are not bugs or new features but can improve usability or performance.

@ozlatkov

ozlatkov commented Feb 24, 2017

This follows up on earlier discussions about Docker swarm mode in 1.13, here:

docker/compose#4305 (comment)

and here:

#30404 (comment)

and is opened here per the recommendation to move it to the docker/docker issue tracker.

As mentioned there, I do understand the idea behind the "fault-tolerance" mechanism and that swarm takes care of restarting containers automatically.

However, as explained in the first thread, "initialization" is not quite the same concept, and there the fault-tolerance mechanism can (and actually does) cause more problems than it solves.
When everything is completely stopped and swarm starts containers that depend on each other, every container keeps getting restarted because the ones it needs are still not running, and so on, so startup loses much more time instead of getting faster.
The bigger problem is that the time needed becomes completely unpredictable, which makes it impossible to plan a maintenance window, or even to get a rough idea of how long it takes to restart/restore the infrastructure after a failure. So no scheduling at all.

In particular, with multiple containers (7 in my case) and more dependency logic than just waiting for, say, a single DB container (probably the trivial and most common case), the system can even enter a kind of "deadlock": one container gets restarted, and while it is down another one is restarted because the first is not available; meanwhile the first comes up, sees the second is not available, and so on.
On my end I waited ~20 minutes while the swarm manager kept restarting containers without ever converging on the correct order.

In fact it really depends on the app inside the container(s): restarting makes no sense if the app itself takes a while to boot, since that only adds more time to the start process. If startup instead followed the required order, the overall time would go down.

Having a mechanism for explicit "dependency" configuration, like we used to have, would avoid issues of that kind entirely, and it would actually make the boot process faster than relying on automatic restarts and waiting until the proper order is eventually reached.

Without being able to predict the behavior, schedule a maintenance window, and at least approximately plan how much time docker swarm needs to start, real production operation would be close to impossible for any microservice environment more complex than the trivial case mentioned above, i.e. one with more dependency logic in it.

@mustafaakin
Contributor

mustafaakin commented Feb 24, 2017

Agreed, restarting forever and hoping it will work sometime in the near future is not a very good approach.

@thaJeztah thaJeztah added area/stack kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. labels Feb 27, 2017
@thaJeztah
Member

/cc @dnephin

@dnephin
Member

dnephin commented Feb 27, 2017

I don't understand what's being requested here. I think the application should retry instead of failing immediately when it can't establish a connection.

Agreed, restarting forever and hoping it will work sometime in the near future is not a very good approach.

You are free to define any restart policy you want under deploy.restart-policy (https://docs.docker.com/compose/compose-file/#/restartpolicy)
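For reference, the same knobs are also available as CLI flags on docker service create; a minimal sketch (the service and image names here are placeholders):

# Sketch: bound the restart behaviour instead of restarting forever.
docker service create \
  --name myapp \
  --restart-condition on-failure \
  --restart-delay 5s \
  --restart-max-attempts 10 \
  --restart-window 120s \
  myorg/myapp:latest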

@vdemeester
Member

/cc @aaronlehmann @stevvooe @aluzzardi on the swarmkit side, would it make sense (or not) to have some sort of dependency mechanism?

@ozlatkov
Author

@dnephin ,

Thank you.

The request is to have a clear dependency config which would allow at least predictability over the startup process.
As mentioned in the threads the "automatic restart" mechanism does make sense for "fault tolerance" for sure.
However, from the initialization perspective it is not helpful at all; it can actually increase the startup time a lot, and it is completely unpredictable, which makes it a real pain to maintain in a real, complex production environment (one with more than the trivial "just a single DB" dependency).
One cannot predict any downtime because it is unknown how much time the actual "startup" process will take, even granted some tuning of the restart policy, etc.

Although an application-level fix sounds reasonable, it's still:

  1. Quite hard to implement right away, because dependencies were dropped just like that; it makes a docker upgrade much harder when it requires application-level fixes across multiple containers and apps, and is close to impossible in the short/medium run.
  2. There are cases in which an application-level fix is quite hard as well; it really depends on how the microservices are linked and spread across different containers.

Again, this is purely about the "initialization" phase, which can definitely benefit from the mechanism we used to have.

@stevvooe
Contributor

stevvooe commented Mar 2, 2017

Agreed, restarting forever and hoping it will work sometime in the near future is not a very good approach.

Actually, this is step one in building self-healing infrastructure. While one needs to mitigate this approach with visibility and failure reporting, for the most part, making each part of your application resilient to failure is the right approach.

@ozlatkov depends_on is an extremely bad design in the pursuit of fault-tolerant, distributed systems. The biggest problems are cascade failures caused by transitive blips. The propensity for deadlock scenarios and the requirement for manual intervention can also create more operational complexity.

Under naive analysis, dependency analysis is a fantastic approach. When you start looking at state-detection hysteresis for distributed systems, under both startup and failure, dependency analysis becomes very problematic. In classic dependency analysis systems, such as make, when you build a target, the likelihood of that target no longer being satisfied after resolution is almost zero. In a distributed-systems scenario, this is no longer true.

In a distributed system, ascertaining the failure of a service may undergo delay or simply be out of date. SwarmKit goes to great lengths to reduce this delay, but we are limited by theory. The root of the issue comes from killing or delaying the startup of otherwise healthy processes based on the failure of another.

When everything is completely stopped and swarm starts containers that depend on each other, every container keeps getting restarted because the ones it needs are still not running, and so on, so startup loses much more time instead of getting faster.

Don't fail your services when their upstream is down. Backoff and report the failure (503 or equivalent) upstream. This will allow your services to respond to intermittent failure and make them more resilient under transient failure. Circuit breaker and exponential backoff are important techniques. There are also tricks that can be used to reduce resolution time, such as capping exponential backoff or issuing test probes.

Employing these techniques will make your application both quicker to startup and quicker to recover under transient failure.
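For example, a wrapper using capped exponential backoff might look roughly like this (a sketch only, just to illustrate the capped-backoff pattern; URL and MAX_DELAY are placeholder environment variables, not anything defined above):

#!/bin/bash
# Sketch: capped exponential backoff around a dependency check.
delay=1
until curl -fsS --connect-timeout 2 "$URL" > /dev/null; do
  echo "upstream not ready, retrying in ${delay}s..." >&2
  sleep "$delay"
  delay=$(( delay * 2 ))
  # Cap the backoff so recovery stays quick once the upstream returns.
  if [ "$delay" -gt "${MAX_DELAY:-30}" ]; then delay="${MAX_DELAY:-30}"; fi
done
exec "$@"

The same loop can, of course, live inside the application itself, which is the preferred place for it.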

the system can even enter a kind of "deadlock": one container gets restarted, and while it is down another one is restarted because the first is not available; meanwhile the first comes up, sees the second is not available, and so on.

This is a great argument for building a fault tolerant application that doesn't fail on upstream errors. Err on keeping service up and the application will converge towards being up.

Quite hard to implement right away, because dependencies were dropped just like that; it makes a docker upgrade much harder when it requires application-level fixes across multiple containers and apps, and is close to impossible in the short/medium run.
There are cases in which an application-level fix is quite hard as well; it really depends on how the microservices are linked and spread across different containers.

If you think these are hard at the application level, imagine trying to implement them to start and fail correctly in a generic scheduler that has less visibility into the state of the system than each individual service. Even worse, the swarmkit orchestrator is going to have less information, with more delay, so even if it could make solid decisions, they won't be nearly as good as what the consuming service can do. First-party observability is a valuable thing.

In other words, the very same startup problems you are asking depends_on to solve also apply to the orchestrator. Your assumption that the swarm orchestrator can make a better assessment of the state of an upstream service is inaccurate. The orchestrator will only know whether a service is running, and might have an indication of which ports are available. Using application-specific failure reporting will allow you to be much more flexible in the face of arbitrary failure.

Note that these techniques need to be balanced with observability. When an upstream service fails, the consumer needs to be able to report that failure forward or through monitoring. Such an approach will keep the system up, observable and resilient.

I hope this helps. Let me know if you need clarification.

@patran

patran commented Mar 17, 2017

Just for the startup scenario, how about letting a service A specify a URL and not attempting to start A until that URL returns, say, a 200?

@stevvooe
Contributor

Just for the startup scenario, how about letting a service A specify a URL and not attempting to start A until that URL returns, say, a 200?

@patran I'm going to assume that you missed part of #31333 (comment), but I touched on this point. The main problem is that the behavior of "the url" as observed by the orchestrator may be different than as observed by the service. They may be on different networks or have different connectivity paths that affect the result, or there may be a delay before facts about the system become observable. Effectively, you have a first-party observer (the service) asking a third-party decision maker (the orchestrator) to make decisions about service state based on out-of-date information. The behavior of the service would be erratic, resulting in unnecessary failures and downtime.

As I said above, the much better approach is to move these checks into your application where they can make appropriate application-level decisions, integrated with your business logic. The result is a more-reliable, quicker converging application that is resilient to transient upstream failure both at startup and at steady state.

If I missed something in the above that didn't connect why "adding a URL" would cause just as many problems, please let me know where I can clarify.

@patran

patran commented Mar 20, 2017

@stevvooe , you are probably way overthinking it.

From the orchestrator perspective, if the specified URL returns, say, a 200, it means it is good to go. There is no other perspective and interpretation. The contract could be informal, such as an HTTP response code, or it could be more formal.

docker-compose.yml says: don't start service B until service A's "ok-url" (call it whatever) returns a given code, say 200.

That's a simple contract that would make docker-compose.yml / docker swarm incredibly easier to use.

@patran

patran commented Mar 20, 2017

Please note the use of the URL is optional. The person putting together docker-compose.yml can choose to use it or not. Along the same lines, the owner of Service A can choose whether or not to publish such a URL.

The URL does not have to be tied to Service A. It could be any URL. I.e., don't start service X unless the specified URL (or even URLs :) ) returns 200.

@stevvooe
Contributor

@patran I'm unsure how your proposal addresses the bulk of issues described above. These are based on limitations inherent to a distributed system, as well as deep experience in deploying them. Dismissing these issues as "over-thinking" when they are well-known and observed, in practice, is ultimately reductive.

From the orchestrator perspective, if the specified URL returns, say, a 200, it means it is good to go. There is no other perspective and interpretation.

I think if you start from this perspective, you will likely miss a number of conditions in which your assumptions are partially incorrect or entirely false. We can ask a few questions about the problem to see how it becomes hard as soon as we move the observer outside of the first-party consumer:

What if it doesn't return a 200?
How many requests should it send before considering the service to be up and to start the other one?
What if it returns a 200 then immediately goes down?
What if the request never returns?
What if 200 is returned to the orchestrator, but the consuming services can't observe that the service is up?

In many cases, we risk killing a perfectly healthy app just due to the time delay. Other times, we get an application reported as healthy, when it is not.

In general, we can classify these as "problems with observability". Be it time or locality, we will always have trouble observing the correct state of the system. One technique for addressing this general problem is to move the decision to the part of the system where the observation has impact. This is why systems like haproxy do health checks on upstream proxies directly; they can observe first-hand whether they can connect to an upstream, and that observation impacts their behavior. Once you defer this to another party (say, an orchestrator), the problem becomes an order of magnitude harder, because we now need to project these details into the new observer.

Again, pushing this behavior into your application will always yield the best result. It can observe the state of the system first hand and make the most optimal decision.

@patran

patran commented Mar 21, 2017

@stevvooe, as in my original post, I am interested in only "Just for the startup scenario...". Given a list of services A, B, C, D, etc., it would be very convenient, at the time the stack is being brought up, to be able to say, don't start so and so service until these so and so conditions are true. The condition check could be as simple as a URL call or even just a successful socket connection.

As far as the list of What ifs and How, I trust you will find good defaults. If there are 2 equally good choices, well, pick a default and/or provide options :)

Quick thought off of the top of my head: before a service is started, there are checks for resources such as CPU, mem, networks, etc, right? This extra URL check could be just that, just another check, in addition to the built-in checks, but controlled and possibly customizable by the user.

@stevvooe
Contributor

@patran You are missing the point: the same constraints that apply to the steady state scenario also apply to the startup state scenario. One must solve the same problems in either case. While I agree a URL would be convenient, it would be very inconvenient to have the system be easily configurable to exhibit cascade failure under any failure condition. Error amplification is a real thing and generally comes from tight coupling, such as that proposed here.

Given your response, I am guessing that you are completely misunderstanding the constraints present in a distributed system. I would suggest that you give such an implementation a try to better understand the inherent complexities of making decisions based on delayed state. It is an open problem and most solutions end up being "best effort", meaning their behavior is undefined in certain conditions.

If you assert that this is simple to implement, adding such a constraint to a running service at the application level should be even easier, and it heeds the advice of moving detection into the application. A simple wrapper script would achieve the result:

#!/bin/bash
set -e

# Fail (and let the restart policy retry) until the dependency answers.
curl --connect-timeout "$TIMEOUT" -f "http://<dependent>/"

# Dependency reachable: hand off to the real entrypoint.
exec $ENTRYPOINT

More complex startup logic could be added to your application, as well.
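For instance, a variant of the script above that keeps retrying until a deadline instead of failing on the first attempt (a sketch; DEADLINE is an added placeholder variable, TIMEOUT, ENTRYPOINT and <dependent> as above):

#!/bin/bash
# Sketch: retry the dependency check until a deadline, then give up.
end=$(( $(date +%s) + ${DEADLINE:-60} ))
until curl --connect-timeout "${TIMEOUT:-2}" -fs "http://<dependent>/" > /dev/null; do
  if [ "$(date +%s)" -ge "$end" ]; then
    echo "dependency did not come up before the deadline" >&2
    exit 1
  fi
  sleep 2
done
exec $ENTRYPOINT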

The other option is to health check upstream services from your application, rather than blocking startup. This will allow you both to notify your users of errors and observe errors in your monitoring system while ensuring fast recovery.

@patran

patran commented Mar 21, 2017

@stevvooe, it's cool we agree it is a convenient feature. I'd say that if it is a convenient feature then it is a desirable feature, and the end goal ought to be to support it. The rest of the debate, whether or not we agree on the level of complexity and scope, is a matter of implementation and roadmapping.

@Silex

Silex commented Mar 21, 2017

It feels a bit like we are running in circles.

I think @stevvooe made great points and showed that wanting convenience gets in the way of scalability.

To me the problem is more that the docker tools blur the lines between running containers locally or running containers in a distributed manner. These are and should be treated as different scenarios... but often people are in the "local" mindset and jump to the "distributed" one and have expectations.

Locally, it kinda makes sense to wait on a service before starting another (and we are used to it, see init systems), and it does not hurt much to have strong coupling between them... because the number of containers is low and is easily manageable (because you know the system state easily). In this scenario it's convenient that the tools support dependencies based on healthchecks, like in compose file version 2.1 (see docker/compose#374 (comment)).

In a distributed environment, it's much harder to be sure of the system state, and it can quickly become a dependency nightmare so it's better to create resilient/self-healing containers, that just do their work when they can and are generally idempotent. I know it's not exactly the same, but I'd like to reuse Sidekiq's explanation from here: https://github.com/mperham/sidekiq/wiki/Best-Practices#2-make-your-job-idempotent-and-transactional

@thaJeztah
Member

Thanks @Silex, I think that's a proper description of the differences. There are other features that make sense for local development, but (generally) not for deployment, e.g. bind-mounting source code into the container.

I'm not sure what the best solution is to facilitate both scenarios; raising expectations that cannot be fulfilled when actually deploying an application ("it worked in dev, but fails in prod") may be bad. (I can see cases where people use docker just for development environments, in which case some features may be useful).

@stevvooe
Contributor

@Silex Thank you for providing another perspective here! I think you've captured the crux of the issue: though we've bridged the UX, we can't necessarily change the properties of the universe in which we operate. Many of the properties of a system we consider absolute disappear as you "stretch time".

The description in the sidekiq wiki is very apt in this scenario. It is a great example of where pushing requirements down into the application can affect its scaling properties immensely. An introduction to distributed systems is also a good overview to understand some of the problems and tradeoffs when you start scaling an application.

Unfortunately, without a good understanding of these topics, the Dunning-Kruger effect dominates the conversation. It is all too easy to dismiss concerns when we don't understand their reality or operate from a position of intellectual dishonesty that prevents that understanding.

I'm not sure what the best solution is to facilitate both scenarios; raising expectations that cannot be fulfilled when actually deploying an application ("it worked in dev, but fails in prod") may be bad.

I think this issue has captured my main concerns with the feature, albeit we may want to document some of the scenarios in more detail (cascade failure, startup deadlock, failure amplification, etc.). The result I would like to avoid here is to rush out and introduce a feature that makes it easy to push unstable applications. Without careful analysis and design, an under-engineered depends_on could result in less reliable applications and an endless train of patches to address gaps in the theory.

There is a solution that may center around "push down" failure detection, where we push the upstream checks down into the agent process. However, I am not sure how that could be nearly as sophisticated as a bespoke solution tuned for the application (failures really are business logic, at the end of the day). Ultimately, killing a Swarm-mode task is a very clumsy response when all you really need is to report an error code in your application, which makes it very hard to find a solution that doesn't require cooperation with the application. While the "push down" solution is the most realistic, many details would have to be worked out to prevent massive scope creep and over-coupling.

We should probably document solutions with popular application frameworks, like the bash-curl approach I suggested above or Hystrix, and the techniques that should be employed to ensure fast startup and fast recovery. Some of these approaches are remarkably straightforward for the effect they can have on reliability. Education, rather than features, is going to be the key to democratizing distributed systems.

@djui

djui commented Mar 24, 2017

I agree with @stevvooe that in production we want a fairly resilient approach. For development, @thaJeztah and @patran, you might want to have a look at https://github.com/betalo-sweden/await which can help you simulate depends_on.

@ShT3ch

ShT3ch commented Apr 3, 2017

There is another side to the missing depends_on for me: stopping my app.
My case: I have a DB, a ZMQ broker, and workers. On the stop signal, each worker flushes its data to the DB via ZMQ. So the orchestrator should not stop the DB or the ZMQ broker until all workers have stopped.
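Today the closest approximation is doing the teardown in order from outside, e.g. draining the workers before removing their upstreams; a rough sketch (the service names are placeholders for a stack like the one above):

# Sketch: ordered teardown; drain the workers first, then stop their upstreams.
docker service scale --detach=false my-stack_worker=0
docker service rm my-stack_broker my-stack_db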

@teohhanhui

#28938 should alleviate some of the woes...

@laugimethods

laugimethods commented Apr 19, 2017

@stevvooe, you are right that all services should be self-resilient and should be able to (re)start at any time, but that holds in an ideal world.

I have to deploy interrelated services (Cassandra, Spark, NATS, etc.) based on Docker images that were not designed that way. So I have no choice but to implement some orchestration between those services during the initialization phase. Removing that feature from Docker Swarm doesn't really 'educate' me; it only requires more work to do the same thing in an ad hoc way.

@Silex

Silex commented Apr 20, 2017

@laugimethods: I think in this case you are supposed to wrap your entrypoint/cmd with some helpers (or make a new image and wrap there), effectively making them self-resilient.

Utilities like https://github.com/betalo-sweden/await or https://github.com/vishnubob/wait-for-it and some bash loop should be a good start.
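For example, something along these lines as an entrypoint wrapper (a sketch only; HOST and PORT are placeholders, and it assumes nc is available in the image):

#!/bin/bash
# Sketch: block until a TCP port accepts connections, then start the app.
until nc -z "$HOST" "$PORT"; do
  echo "waiting for $HOST:$PORT..." >&2
  sleep 2
done
exec "$@"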

@panga

panga commented Apr 20, 2017

We use bash hacks (wait, while, etc.). It's an ugly hack, and it doesn't always work.
The problem is that Swarm restarts the container before it's healthy!

What about implementing a simple depends_on-on-healthy approach like docker-compose v2.1?

BTW, Mesos/Marathon and Amazon ECS have this feature :)
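(For reference, the kind of bash hack in question looks roughly like this for a standalone container; "db" is a placeholder name and its image must define a HEALTHCHECK:)

# Sketch: poll a container's healthcheck status before starting the next step.
until [ "$(docker inspect --format '{{.State.Health.Status}}' db 2>/dev/null)" = "healthy" ]; do
  sleep 2
done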

@galindro

I totally agree with @laugimethods

@daggerok

daggerok commented Sep 28, 2017

Actually, this is step one in building self-healing infrastructure. While one needs to mitigate this approach with visibility and failure reporting, for the most part, making each part of your application resilient to failure is the right approach.

@stevvooe, actually it's not always, and what if it's not possible? Failing fast or not is just an option that depends on your needs; it is not a rule.

For example, two variants from a polyglot cloud architecture:

  1. A microservice built on top of spring-boot + spring-data will fail if the required rabbitmq broker or mongodb connection does not exist, because the java beans are initialized as singletons during application context bootstrap. In this case, on failure my app restarts until all the needed backing services are available (so we solved it with the on-failure restart policy condition).

  2. But for a node.js + servicebus + mongoose microservice, the previous approach does not work at all, because the app runs without failures even if the required db and message broker are not available yet, and the created connections may be useless until I recreate them manually.

Obviously, I need some standard approach, like the one provided by healthcheck and docker-compose v2.1 depends_on, to handle these needs in the same fashion no matter what kind of polyglot microservice or other backing service I need to verify... but now I have to build hacky workarounds for every different stack (like a custom retry mechanism or fail-fast) because the container health functionality no longer exists.

sad...


Regards,
Maksim

@stevvooe
Contributor

At this point, I would suggest that you put together a proposal in swarmkit for how to actually implement this functionality. There are some suggestions above that could be explored. If someone wants to take this on, they're more than welcome, and we can help figure out what to do.

@dead10ck

@stevvooe so is this Docker's official stance on this? "Fix it yourself"?

@thaJeztah
Member

thaJeztah commented Sep 29, 2017

@dead10ck please keep your comments constructive; there's no need to be offensive. By participating on this issue tracker, we expect you to follow the community guidelines, and code of conduct.

Consider this an official warning

This is an open source project; open source revolves around participation and contributing, not demanding that a feature be implemented.

Implementing this feature for swarm services is non-trivial; @stevvooe's comment above is an invitation for people to participate, a request to help design how this feature could be implemented.

If you're using Docker Compose and need this feature for local development only, stick to the 2.x file format for now; it is still supported and still has this feature.

@dead10ck

@thaJeztah I'm not demanding anything. I consider it very constructive to know if we can realistically expect this to be prioritized by the Docker team or if we'll just need to choose between continuing to rely on hacks or never being able to take advantage of the benefits of the v3 compose format. This helps me figure out how to move forward with my projects.

Also, it doesn't necessarily even need to be implemented for Swarm yet. It could be implemented for local containers only and be a noop in Swarm mode, or issue an error/warning.

Also, if you're going to be citing the code of conduct, I'd recommend making sure the Docker team is aware of it as well. @stevvooe has been blatantly offensive in this thread.

@thaJeztah
Member

I'm not demanding anything. I consider it very constructive to know if we can realistically expect this to be prioritized by the Docker team

Ok, you're asking for (some company) to prioritise / invest engineering time to implement the feature you want.

The short answer: no, to my knowledge, this is currently not prioritised

The slightly longer answer: various individuals and companies provide engineering time to help maintain these projects (the maintainers); part of maintaining a project is to work together with contributors on implementing new features and enhancements, discover (and hopefully resolve) bugs, and review code contributions and proposals. If this change is important to you, we welcome you, or your company, to make a similar investment to help realise the feature or find a solution.

Also, it doesn't necessarily even need to be implemented for Swarm yet. It could be implemented for local containers only and be a noop in Swarm mode, or issue an error/warning.

That would be a decision for the Docker Compose maintainers; IIRC, support for extension attributes (x-foobar:) was recently added to the compose-file. Adding, e.g. an x-depends_on: option could be something to consider (as a stop-gap solution); this allows the feature to be "functional", but prevents it from ending up in the specs before the design is completed (which may not be a 1-1 translation of the depends_on option).

Can you open a proposal in the https://github.com/docker/compose repository for that, so that the maintainers there can have a look?

Also, if you're going to be citing the code of conduct, I'd recommend making sure the Docker team is aware of it as well. @stevvooe has been blatantly offensive in this thread.

If I have overlooked language in his comments that you found to be offensive to you as a person, or to others, I apologise; feel free to reach out to me at sebastiaan(at)docker.com out of band (trying to keep the conversation on topic).

@daggerok

daggerok commented Sep 30, 2017

@stevvooe I'd like to, but I'm not sure if I can; I'm not experienced with Go. But I would like to try. Can you please point me to where I can start? Thanks.

PS: for now I'm testing with ugly custom bash scripts like:

# 1. init swarm, build everything, push images to local registry and deploy stack

# 2. cranky wait-for swarm services..
for service_name in my-stack_service-1 my-stack_service-2; do
  state=$(docker stack services --filter name="$service_name" --format="{{.Replicas}}" my-stack)
  while [ "$state" != "1/1" ]; do
    docker service scale --detach=true "$service_name"=1
    state=$(docker stack services --filter name="$service_name" --format="{{.Replicas}}" my-stack)
    sleep 3
  done
done

# 3. integration testing, cleanup

Or, if I know my nodejs services require the db/mq to be up before they bootstrap but they start before it, I scale them down to zero and afterwards scale them back up one by one in the correct order with docker service scale --detach=false ..., roughly as in the sketch below.
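A rough sketch of that ordered scale-up (these service names are placeholders, not my real stack):

# Sketch: start the backing services first, then the services that need them.
docker service scale --detach=false my-stack_mongodb=1
docker service scale --detach=false my-stack_rabbitmq=1
docker service scale --detach=false my-stack_node-app=1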


Regards,
Max

@djui

djui commented Oct 1, 2017

@daggerok If you are looking for a tool to await availability of starting resources, independent of docker compose or docker swarm, I can recommend https://github.com/betalo-sweden/await

I use it in docker compose but I believe it can work with docker swarm as well.

Until this feature lives as close as possible to the orchestrator (docker, docker compose, docker swarm), having a unified and consistent fallback is an OK alternative to tailored scripts, imho.

@daggerok

daggerok commented Oct 1, 2017

@djui thanks, I will look into it

@laugimethods

@daggerok You might also have a look at a framework I developed: https://hub.docker.com/r/logimethods/eureka/

@stevvooe
Contributor

stevvooe commented Oct 2, 2017

@daggerok In general, this feature would likely be a part of the worker. Pulling the problem up to the orchestrator would have it acting on third party information, meaning it is out of date or inaccurate. One would configure the service dependencies and then the correct instructions would be injected into the worker. Most of the work here is defining the necessary instructions that can be used as primitives for implementing the higher-level feature set.

There seem to be two interesting requests in this area:

  1. Don't start until service X has started, at the Start part of Controller lifecycle.
  2. Don't stop until service Y has shutdown, at the Shutdown part of the Controller lifecycle.

Note that I have left out the use case of online or steady state dependency handling, as that doesn't seem to be causing the most pain. I also think we can build it out based on the primitives we define for startup and shutdown. If there are other graph relationships we need to represent that don't fall into these, please mention them now, as we'll need to represent them to ensure we can handle all use cases. The key to stability here will be focusing on conditions that are representative and lack noise. Usability will be dependent on making it easy to identify the operating region that can reduce false positives for startup delay and shutdown conditions. This means that we need visibility into the timeout and heartbeat parameters for such primitives.

I think the first step would be attempting to inject startup conditions into the task definition: https://github.com/docker/swarmkit/blob/master/api/objects.proto#L162. The Task is owned and controlled by the orchestrator, whereas the TaskSpec takes user input. You can read a little bit about the task handling model in https://github.com/docker/swarmkit/blob/master/agent/exec/controller.go#L16. For pre-start dependencies, you'd want to hook into the Start method on each executor implementation. The dockerapi version is here: https://github.com/docker/swarmkit/blob/master/agent/exec/controller.go#L16. Note the healthcheck handling that is present: it might provide guidance on how to do this.

One key distinction here is that the check must be executed from the container or its network namespace for it to be useful. If you try it from outside the container, the connectivity available will differ and won't represent the actual service graph. This is why each executor implementation in swarmkit will need to implement its own functionality here, as this is going to be different in each.

For this implementation spike, I'd suggest focusing on the primitives required for use case 1. I think they can be generalized for both case 2 and the steady state case. Once that option is explored, introducing case 2 (which has a few more challenges) and the steady state case should start to show where the UX should head to make defining these dependencies easy.

Hopefully, this is enough to get started. Let me know if I can clarify the approach.

@daggerok

daggerok commented Oct 3, 2017

thanks @laugimethods, I will look into it

thank you @stevvooe for your details


Regards,
Maksim

@thaJeztah thaJeztah removed this from backlog in maintainers-session Nov 16, 2017
@rueberger

I would like to add my 2c as the helmsman of an early-stage startup: we currently do not have the resources to go fully distributed, but we are architecting everything to allow for that in the future. This basically means using single-node setups of all of our backend services.

docker compose itself is insufficient for this purpose as it does not support restarting failed containers, so docker swarm it is. Nothing distributed here, though; a slightly unreliable setup to work out the core application logic is perfectly acceptable. In this context depends_on is a highly desirable feature.

I am currently running into a problem, which would be entirely resolved by depends_on, where kafka never starts successfully if kafka and zookeeper are killed at the same time, and I am considering all sorts of nasty hacks to get around it.

Is it feasible to support depends_on for single-node swarms only?

@thaJeztah
Member

docker compose itself is insufficient for this purpose as it does not support restarting failed containers

While docker compose itself won't monitor containers, you can still set a restart policy to make the docker daemon restart containers that exit; but note that this is different from swarm services, where a new task (container) is created to replace the failing one.
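For example (a sketch; the container and image names are placeholders):

# Sketch: let the daemon restart a standalone container when it exits non-zero.
docker run -d --name myapp --restart on-failure:5 myorg/myapp:latest
# ...or change the policy of an already-running container:
docker update --restart unless-stopped myapp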

Is it feasible to support depends_on for single-node swarms only?

Unfortunately, implementing for a single-node swarm would be the same as for a multi-node swarm; swarm works exactly the same in both situations and, other than containers always being on the same node, and the manager running on the same node as the worker, there's not much difference (from a technical perspective).

I am currently running into a problem, which would be entirely resolved by depends_on, where kafka never starts successfully if kafka and zookeeper are killed at the same time

In docker compose's implementation, depends_on is mainly useful to influence startup order; once the stack is running, docker compose itself is out of the picture, and depends_on won't help. In contrast, swarm services are monitored throughout their entire lifetime. What should happen if (in your situation) zookeeper fails? Should services that depend on zookeeper be killed, and new instances created after zookeeper is up and running again? This could work for some use-cases, but could just as easily lead to a cascading effect in other use-cases.

So, the best starting point would be a design proposal for how this feature should work (#31333 (comment)); that design could look different from the docker compose implementation (and perhaps should not be named depends_on if startup and teardown order are to be configurable separately, for cases where only startup order is important).

@sebthom
Contributor

sebthom commented Nov 1, 2019

I created some POSIX shell scripts that can be mounted into containers. They can be used to wait for TCP ports or HTTP services during startup. https://github.com/vegardit/await.sh#swarm

@hinorashi

I leave my penguin friend here -->🐧 <-- he couldn't help with the implementation but he would make everyone happy
