taskKillGracePeriodSeconds doesn't work #4338

Closed
robfrut135 opened this issue Sep 9, 2016 · 13 comments

Comments

@robfrut135

Hi,
I'm using Mesos 1.0.1 and Marathon 1.3.0-RC6. When I set "taskKillGracePeriodSeconds", the app does not deploy on Marathon and the setting does not take effect. From what I can see, the Marathon UI does not recognise this field.

@unterstein
Contributor

Hi @robfrut135,

I retested this a couple of seconds ago and I could configure an app with taskKillGracePeriodSeconds through the UI which is recognized and passed correctly. Could you please paste your app definition?

Thanks
Johannes

@robfrut135
Author

Hi,
I followed these steps:

From UI:

[Screenshot, 2016-09-09 18:52:09]

[Screenshot, 2016-09-09 18:37:08]

[Screenshot, 2016-09-09 18:37:24]

JSON from API:

[Screenshot, 2016-09-09 18:42:33]

Here is the full JSON (with some internal data removed):

{
  "app": {
    "id": "/genesis/dev/plaid",
    "cmd": null,
    "args": [ "kst-genesis/plaid/dev" ],
    "user": null,
    "env": { "CONSUL": "consul.service.consul:8500" },
    "instances": 1,
    "cpus": 0.2,
    "mem": 512,
    "disk": 0,
    "gpus": 0,
    "executor": "",
    "constraints": [ [ "hostname", "UNIQUE" ] ],
    "uris": [],
    "fetch": [],
    "storeUrls": [],
    "backoffSeconds": 5,
    "backoffFactor": 2,
    "maxLaunchDelaySeconds": 60,
    "container": {
      "type": "DOCKER",
      "volumes": [],
      "docker": {
        "image": "genesis-plaid:0.0.2",
        "network": "BRIDGE",
        "portMappings": [
          { "containerPort": 9000, "hostPort": 0, "servicePort": 10011, "protocol": "tcp", "labels": {} }
        ],
        "privileged": false,
        "parameters": [],
        "forcePullImage": true
      }
    },
    "healthChecks": [
      {
        "path": "/v1/plaid/health",
        "protocol": "HTTP",
        "portIndex": 0,
        "gracePeriodSeconds": 30,
        "intervalSeconds": 20,
        "timeoutSeconds": 20,
        "maxConsecutiveFailures": 5,
        "ignoreHttp1xx": false
      }
    ],
    "readinessChecks": [],
    "dependencies": [],
    "upgradeStrategy": { "minimumHealthCapacity": 1, "maximumOverCapacity": 1 },
    "labels": {
      "subtopic": "plaid",
      "tags": "genesis,plaid,aggregators",
      "version": "v0.0.3",
      "topic": "genesis"
    },
    "acceptedResourceRoles": null,
    "ipAddress": null,
    "version": "2016-09-08T11:45:31.503Z",
    "residency": null,
    "secrets": {},
    "taskKillGracePeriodSeconds": null,
    "ports": [ 10011 ],
    "portDefinitions": [ { "port": 10011, "protocol": "tcp", "labels": {} } ],
    "requirePorts": false,
    "versionInfo": {
      "lastScalingAt": "2016-09-08T11:45:31.503Z",
      "lastConfigChangeAt": "2016-09-08T11:45:31.503Z"
    },
    "tasksStaged": 0,
    "tasksRunning": 1,
    "tasksHealthy": 1,
    "tasksUnhealthy": 0,
    "deployments": [],
    "tasks": [
      {
        "id": "genesis_dev_plaid.c23547d8-75b9-11e6-9376-069d0696a105",
        "slaveId": "63aa66c3-527e-46ec-b1b2-4b7f69f96c46-S0",
        "host": "xxxxxxx",
        "state": "TASK_RUNNING",
        "startedAt": "2016-09-08T11:45:38.073Z",
        "stagedAt": "2016-09-08T11:45:36.905Z",
        "ports": [ 31064 ],
        "version": "2016-09-08T11:45:31.503Z",
        "ipAddresses": [ { "ipAddress": "172.17.0.9", "protocol": "IPv4" } ],
        "appId": "/genesis/dev/plaid",
        "healthCheckResults": [
          {
            "alive": true,
            "consecutiveFailures": 0,
            "firstSuccess": "2016-09-09T06:28:37.964Z",
            "lastFailure": null,
            "lastSuccess": "2016-09-09T16:42:33.056Z",
            "lastFailureCause": null,
            "taskId": "genesis_dev_plaid.c23547d8-75b9-11e6-9376-069d0696a105"
          }
        ]
      }
    ],
    "lastTaskFailure": {
      "appId": "/genesis/dev/plaid",
      "host": "xxxxxxxxxxxxx",
      "message": "Task was killed since health check failed. Reason: 404 Not Found",
      "state": "TASK_KILLED",
      "taskId": "genesis_dev_plaid.9113d602-75b7-11e6-9376-069d0696a105",
      "timestamp": "2016-09-08T11:31:50.282Z",
      "version": "2016-09-08T11:11:08.935Z",
      "slaveId": "63aa66c3-527e-46ec-b1b2-4b7f69f96c46-S0"
    }
  }
}

I hope this information is useful.
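
For anyone checking their own app, here is a minimal Python sketch that reads the stored value out of an API response shaped like the JSON above (an object wrapped in an "app" key). The function name `configured_grace_period` and the shortened sample payload are illustrative assumptions, not part of Marathon.

```python
import json


def configured_grace_period(response_text):
    """Return taskKillGracePeriodSeconds from an app response, or None if unset.

    Accepts either the wrapped form ({"app": {...}}) shown in this thread
    or a bare app definition.
    """
    payload = json.loads(response_text)
    app = payload.get("app", payload)
    return app.get("taskKillGracePeriodSeconds")


# Shortened, hypothetical sample mirroring the response pasted above.
sample = '{"app": {"id": "/genesis/dev/plaid", "taskKillGracePeriodSeconds": null}}'
print(configured_grace_period(sample))
```

A result of `None` means the field was stored as null (or is absent), which matches the symptom reported here.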

@robfrut135
Author

Any update on this issue? Were you able to reproduce it?

@beeva-robertofrutos

I updated to the Marathon 1.3.0 release. The issue persists.

@unterstein
Contributor

Yes, I could reproduce that the taskKillGracePeriodSeconds field is not recognized in the Marathon UI.

//cc @wavesoft

@unterstein unterstein added bug and removed analyze labels Sep 20, 2016
@ryanprayogo

ryanprayogo commented Oct 19, 2016

taskKillGracePeriodSeconds didn't seem to work for me either when the app is created using the POST /v2/apps endpoint.

@unterstein How did you end up configuring your app with taskKillGracePeriodSeconds?

@tgermain
Contributor

tgermain commented Oct 26, 2016

According to this doc: https://mesosphere.github.io/marathon/docs/health-checks.html

taskKillGracePeriodSeconds is a label of an application. cc @robfrut135

@ryanprayogo

Which is what he has done, is it not, @tgermain ?

FWIW, this is my app definition:

{
  "id": "/test-app",
  "cmd": null,
  "cpus": 1,
  "mem": 256,
  "disk": 0,
  "instances": 1,
  "constraints": [
    [
      "hostname",
      "GROUP_BY"
    ]
  ],
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      ...
    }
  },
  "healthChecks": [
    {
      "path": "/health",
      "protocol": "HTTP",
      "portIndex": 0,
      "gracePeriodSeconds": 90,
      "intervalSeconds": 60,
      "timeoutSeconds": 2,
      "maxConsecutiveFailures": 3,
      "ignoreHttp1xx": false
    }
  ],
  "portDefinitions": [
    {
      "port": 10041,
      "protocol": "tcp",
      "labels": {}
    }
  ],
  "args": [
    ...
  ],
  "taskKillGracePeriodSeconds": 10,
  "backoffFactor": 2,
  "upgradeStrategy": {
    "minimumHealthCapacity": 1,
    "maximumOverCapacity": 0.5
  }
}

@tgermain
Contributor

It is, I misread the doc:

To set the necessary grace period, add the taskKillGracePeriodSeconds label to your application definition:

"taskKillGracePeriodSeconds": 10

It's an appDefinition key (as in a JSON key/value pair), not one of the labels of your appDefinition.

The official Marathon REST API doc is correct.

A complete appDefinition example would be a great addition to the doc to disambiguate the last part. I'll do that.
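
To make the distinction concrete, here is a small Python sketch that reports where the key sits in an app definition. The definition below is a trimmed, hypothetical example and `grace_period_placement` is an illustrative helper, not a Marathon API.

```python
KEY = "taskKillGracePeriodSeconds"

# Hypothetical, trimmed app definition: the key is a top-level field,
# a sibling of "labels", not an entry inside it.
app_definition = {
    "id": "/test-app",
    "cpus": 1,
    "mem": 256,
    "instances": 1,
    "labels": {"topic": "example"},
    "taskKillGracePeriodSeconds": 10,
}


def grace_period_placement(app):
    """Return 'top-level', 'label', or 'missing' depending on where the key is."""
    if KEY in app:
        return "top-level"
    if KEY in (app.get("labels") or {}):
        return "label"
    return "missing"


print(grace_period_placement(app_definition))
```

A definition that only carries the key inside "labels" would be reported as 'label', which is the misreading described above.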

@bakstad

bakstad commented Dec 14, 2016

Any progress on this issue? This is becoming a problem for us.

@jdef
Contributor

jdef commented Dec 14, 2016

/cc @orlandohohmeier sounds like a UI bug?

@wavesoft
Contributor

wavesoft commented Dec 14, 2016

So, taskKillGracePeriodSeconds is an unknown property to the UI, and it's removed when filtering takes place. Creating a new service should work without any issue, but when editing, this value will be completely ignored.

The only solution currently is to use the API for editing, but I will check if we can back-port the fix to the 1.1.5 version of the UI.
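
As a rough illustration of editing through the API rather than the UI, the Python sketch below builds a PUT request against /v2/apps/{appId} that carries only the grace period. The base URL, the helper name `build_grace_period_update`, and the assumption that your Marathon version accepts a partial definition on PUT are all assumptions to verify against your own setup; the request is only constructed here, not sent.

```python
import json
import urllib.request

# Hypothetical Marathon address; replace with your own.
MARATHON_URL = "http://marathon.example.com:8080"


def build_grace_period_update(app_id, seconds, base_url=MARATHON_URL):
    """Build (but do not send) a PUT request that sets taskKillGracePeriodSeconds."""
    url = base_url.rstrip("/") + "/v2/apps/" + app_id.strip("/")
    body = json.dumps({"taskKillGracePeriodSeconds": seconds}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_grace_period_update("/test-app", 10)
print(request.get_method(), request.full_url)
# To actually apply the change: urllib.request.urlopen(request)
```

After sending it, fetching the app again and checking that the field is no longer null is a reasonable way to confirm the change took effect.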

@meichstedt
Contributor

Note: This issue has been migrated to https://jira.mesosphere.com/browse/MARATHON-1715. For more information see https://groups.google.com/forum/#!topic/marathon-framework/khtvf-ifnp8.

@mesosphere mesosphere locked and limited conversation to collaborators Mar 27, 2017