Chronos 2.5.1 fails to sustain registered state, flaps (ubuntu 14.04.05 LTS) #885

Open
jl-montes opened this issue Sep 30, 2018 · 0 comments

Comments

@jl-montes

Installed a new cluster on Ubuntu 14.04.05 LTS masters and agents.
The zookeeper, mesos-master, chronos, and marathon services are co-located on the same three servers.
zookeeper, mesos-master, and marathon are running fine; only chronos is having issues.

Chronos fails to maintain its registered status: it starts up, encounters errors, and exits, and then the cycle starts again. I keep seeing a reference to a Chaos HTTP service that fails to start. What is this?
FATAL Failed to start HTTP service (mesosphere.chaos.http.HttpService)

I debugged this by shutting down chronos on 2 of the 3 master nodes and tailing the logs while watching startup.

Versions
linux kernel: 3.19.0-80-generic
mesos: 1.7.0-2.0.3
marathon: 1.7.37
chronos: 2.5.1-0.1.20171211074431.ubuntu1404
zookeeper: 3.4.5+dfsg-1

The detailed startup log is attached. Below are the log lines that contain "error", "fatal", or "fail".

[jmontes@jmserver home-playbooks]$ cat ~/Logs/chronos-startup-ubuntu01-Sep30-2018.txt | grep -i -e error -e fatal -e fail
Sep 30 14:47:52 ubuntu01 chronos[6208]: [2018-09-30 14:47:52,561] INFO Opening socket connection to server ubuntu07/10.0.1.129:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn:975)
Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,816] WARN FAILED com.google.inject.servlet.GuiceFilter-68b7d0ef: java.lang.TypeNotPresentException: Type javax.xml.bind.JAXBContext not present (org.eclipse.jetty.util.component.AbstractLifeCycle:204)
Sep 30 14:47:53 ubuntu01 chronos[6208]: #011at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,850] WARN FAILED o.e.j.s.ServletContextHandler{/,null}: java.lang.TypeNotPresentException: Type javax.xml.bind.JAXBContext not present (org.eclipse.jetty.util.component.AbstractLifeCycle:204)
Sep 30 14:47:53 ubuntu01 chronos[6208]: #011at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,852] WARN FAILED com.codahale.metrics.jetty8.InstrumentedHandler@14c71bb0: java.lang.TypeNotPresentException: Type javax.xml.bind.JAXBContext not present (org.eclipse.jetty.util.component.AbstractLifeCycle:204)
Sep 30 14:47:53 ubuntu01 chronos[6208]: #011at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,856] WARN FAILED org.eclipse.jetty.server.handler.HandlerCollection@4abe76b9: java.lang.TypeNotPresentException: Type javax.xml.bind.JAXBContext not present (org.eclipse.jetty.util.component.AbstractLifeCycle:204)
Sep 30 14:47:53 ubuntu01 chronos[6208]: #011at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,858] WARN FAILED org.eclipse.jetty.server.Server@6ade18f1: java.lang.TypeNotPresentException: Type javax.xml.bind.JAXBContext not present (org.eclipse.jetty.util.component.AbstractLifeCycle:204)
Sep 30 14:47:53 ubuntu01 chronos[6208]: #011at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,872] FATAL Failed to start HTTP service (mesosphere.chaos.http.HttpService:23)
Sep 30 14:47:53 ubuntu01 chronos[6208]: #011at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,882] ERROR Service HttpService [FAILED] has failed in the STARTING state. (com.google.common.util.concurrent.ServiceManager:775)
Sep 30 14:47:53 ubuntu01 chronos[6208]: #011at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
Sep 30 14:47:53 ubuntu01 chronos[6208]: #011at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,925] FATAL Failed to start all services. Services by state: {RUNNING=[ZookeeperService [RUNNING], MetricReporterService [RUNNING], JobScheduler [RUNNING]], FAILED=[HttpService [FAILED]]} (org.apache.mesos.chronos.scheduler.Main$:45)
Sep 30 14:47:53 ubuntu01 chronos[6208]: java.lang.IllegalStateException: Expected to be healthy after starting. The following services are not running: {FAILED=[HttpService [FAILED]]}
Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,960] ERROR Service JobScheduler [FAILED] has failed in the STOPPING state. (com.google.common.util.concurrent.ServiceManager:775)
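The filtered output above is repetitive, so here is a rough Python sketch of how it could be condensed into per-level counts and the distinct exception messages. It applies the same case-insensitive error/fatal/fail filter as the grep command. The function name `summarize`, the regular expressions, and the assumed line layout (`[timestamp] LEVEL message (logger:line)`) are my own guesses based on the lines pasted here; none of this is part of Chronos.

```python
import re
from collections import Counter

# Same filter as `grep -i -e error -e fatal -e fail`.
KEYWORDS = re.compile(r"error|fatal|fail", re.IGNORECASE)
# Assumed layout: "[2018-09-30 14:47:53,816] WARN message ...".
LEVEL_RE = re.compile(r"\[\d{4}-\d{2}-\d{2} [\d:,]+\] (INFO|WARN|ERROR|FATAL) ")
# A dotted class name ending in "Exception", then its message up to the
# trailing " (logger:line)" suffix or the end of the line.
EXCEPTION_RE = re.compile(r"((?:\w+\.)+\w*Exception): ([^()]+?)(?= \(|$)")


def summarize(lines):
    """Return (level counts, exception counts) for the filtered lines."""
    levels = Counter()
    exceptions = Counter()
    for line in lines:
        if not KEYWORDS.search(line):
            continue  # the grep filter would have dropped this line
        level = LEVEL_RE.search(line)
        if level:
            levels[level.group(1)] += 1
        exc = EXCEPTION_RE.search(line)
        if exc:
            exceptions[(exc.group(1), exc.group(2).strip())] += 1
    return levels, exceptions


if __name__ == "__main__":
    # Three lines copied from the output above.
    sample = [
        "Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,858] WARN FAILED org.eclipse.jetty.server.Server@6ade18f1: java.lang.TypeNotPresentException: Type javax.xml.bind.JAXBContext not present (org.eclipse.jetty.util.component.AbstractLifeCycle:204)",
        "Sep 30 14:47:53 ubuntu01 chronos[6208]: #011at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)",
        "Sep 30 14:47:53 ubuntu01 chronos[6208]: [2018-09-30 14:47:53,872] FATAL Failed to start HTTP service (mesosphere.chaos.http.HttpService:23)",
    ]
    levels, exceptions = summarize(sample)
    print(dict(levels))
    for (name, message), count in exceptions.items():
        print(count, name, message)
```

Run against the full attached log, this should show at a glance which exception is repeated behind the HttpService failure, without the stack-trace noise that the keyword filter lets through (the `Errors.processWithErrors` frames match only because the class name contains "Errors").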

@jl-montes jl-montes changed the title Chronos 2.5.1 fails to sustain registered state, flaps Chronos 2.5.1 fails to sustain registered state, flaps (ubuntu 14.04.05 LTS) Sep 30, 2018