v1.6.549

@timcharper timcharper released this 20 Oct 00:33
· 574 commits to master since this release
aabf743

Changes from 1.6.352 to 1.6.549

New Exit Codes

Marathon will indicate with an exit code why it stopped itself. See the docs page for a list of all codes and their meanings.

Native Packages

We have stopped publishing native packages for operating system versions that are past their end-of-life:

  • Ubuntu Yakkety
  • Ubuntu Wily
  • Ubuntu Vivid

Additionally, we have added support for Debian Stretch.

Limit maximum number of running deployments

A new command-line flag, --max_running_deployments, limits the number of concurrently running deployments. The default value is 100. If a user submits more updates than this flag allows, Marathon returns an HTTP 403 error with an explanatory message. We introduced this flag because a large number of running deployments can significantly degrade performance during Marathon's initialization phase after a failover. Note that once the maximum number of deployments is reached, you must use the ?force=true parameter to cancel an existing deployment.
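As an illustration of the cancellation step, here is a minimal Python sketch. It assumes Marathon's DELETE /v2/deployments/{id} endpoint, combined with the force=true query parameter mentioned above. The host, the deployment ID, and both function names are hypothetical, and the request is only constructed, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def cancel_deployment_request(base_url: str, deployment_id: str) -> Request:
    """Build (but do not send) a forced cancellation of one deployment."""
    path = "/v2/deployments/" + quote(deployment_id, safe="")
    query = urlencode({"force": "true"})
    return Request(f"{base_url.rstrip('/')}{path}?{query}", method="DELETE")


def may_be_deployment_limit(status_code: int) -> bool:
    """HTTP 403 is what the deployment limit returns; other causes are possible."""
    return status_code == 403


# Placeholder host and deployment ID, for illustration only.
req = cancel_deployment_request("http://marathon:8080", "example-deployment-id")
print(req.get_method(), req.full_url)
```

A 403 can also come from other causes, such as authorization failures, so a real client should read the error message in the response body before deciding to cancel anything.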

Zookeeper storage compaction interval

A new command-line flag, --storage_compaction_interval, sets the Zookeeper storage compaction interval in seconds. The default value is 30 seconds.

Deprecation Mechanism

Marathon has gained a new feature flag: --deprecated_features. For more information, see the docs.

Non-blocking API and Leader Proxying

Previously, when under substantial load, Marathon would time out a deployment-initiating request (such as modifying an app) with the error "futures timed out". The timeout was not very helpful because Marathon would perform the requested work regardless. This timeout has been removed. Note, however, that the client may still time out if it is configured to do so.

To handle the potential increase in concurrent connections, deployment operations and leader request proxying now use nonblocking I/O. The nonblocking I/O proxying logic may differ subtly in how responses are handled, including more aggressive rejection of malformed HTTP requests. On the off chance that this causes an issue in your cluster, the old behavior can be restored with the command-line flag --deprecated_features=sync_proxy. sync_proxy is scheduled to be removed in Marathon 1.8.0.

Improved environment variable to command line argument mapping

As part of the fix for MARATHON-8254, the logic for receiving command-line options from environment variables has been reworked. "*" is now properly propagated (previously, the glob-expanded result was passed instead), and spaces and newlines are now preserved.

There's a small change in behavior for environments in which the launcher script is sourced rather than executed: unexported environment variables will not be converted into parameters.
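To illustrate the kind of mapping involved, here is a minimal Python sketch. It is not the launcher script's actual logic. It assumes the documented convention that variables prefixed with MARATHON_ become lowercased flags, and that an empty value yields a bare boolean flag. The function name and the sample values are hypothetical.

```python
from typing import Dict, List


def env_to_args(env: Dict[str, str], prefix: str = "MARATHON_") -> List[str]:
    """Translate prefixed environment variables into command-line arguments."""
    args: List[str] = []
    for name in sorted(env):
        if not name.startswith(prefix) or name == prefix:
            continue
        flag = "--" + name[len(prefix):].lower()
        value = env[name]
        if value == "":
            # Empty value: treat as a bare boolean flag.
            args.append(flag)
        else:
            # Each value stays a single argv element, so "*" is not
            # glob-expanded and embedded spaces or newlines survive intact.
            args.extend([flag, value])
    return args


# Sample environment; the ZooKeeper address is a placeholder.
sample = {
    "MARATHON_MASTER": "zk://zk1:2181/mesos",
    "MARATHON_MESOS_ROLE": "*",
    "MARATHON_DISABLE_HA": "",
    "HOME": "/root",
}
print(env_to_args(sample))
```

Keeping every value as its own list element, rather than joining everything into one shell string, is what avoids glob expansion and word splitting in this sketch.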

Optionally allow offer suppress

Marathon can now be configured to suppress offers from Mesos by specifying the flag --suppress_offers. This can improve offer-starvation scenarios in larger clusters at the cost of reservations taking longer to destroy. This is off by default.

New Metrics

Several new metrics have been added to improve detection of load-scenarios known to degrade Marathon's performance:

  • mesosphere.marathon.api.HTTPMetricsFilter.gzippedBytesWritten
  • mesosphere.marathon.api.HTTPMetricsFilter.bytesRead
  • mesosphere.marathon.api.HTTPMetricsFilter.bytesWritten
  • mesosphere.marathon.core.deployment.impl.DeploymentManagerActor.currentDeploymentCount
  • mesosphere.marathon.core.deployment.impl.DeploymentManagerActor.deploymentCount
  • mesosphere.marathon.core.flow.impl.ReviveOffersActor.reviveCount
  • mesosphere.marathon.core.flow.impl.ReviveOffersActor.suppressCount
  • mesosphere.marathon.core.group.impl.GroupManagerImpl.dismissedDeployments
  • mesosphere.marathon.core.group.impl.GroupManagerImpl.queueSize
  • mesosphere.marathon.core.matcher.base.util.OfferOperationFactory.launchOperationCount
  • mesosphere.marathon.core.matcher.base.util.OfferOperationFactory.launchGroupOperationCount
  • mesosphere.marathon.core.matcher.base.util.OfferOperationFactory.reserveOperationCount

Deprecated Features

/v2/schemas

The route /v2/schemas has been deprecated in favor of the RAML specifications. Clients that need to validate requests locally can access the RAML specifications under the prefix /public/api. For example, to get the RAML definition for the apps resource, GET http://marathon:8080/public/api/v2/apps.raml.
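A client that previously read /v2/schemas could build the RAML location along these lines. This is a minimal Python sketch: the helper names are hypothetical, the host is the placeholder from the example above, and only the apps resource is confirmed by these notes.

```python
from urllib.request import urlopen


def raml_url(base_url: str, resource: str, api_version: str = "v2") -> str:
    """Build the URL of a RAML definition under the /public/api prefix."""
    return f"{base_url.rstrip('/')}/public/api/{api_version}/{resource}.raml"


def fetch_raml(base_url: str, resource: str) -> str:
    """Download a RAML definition as text (needs a reachable Marathon)."""
    with urlopen(raml_url(base_url, resource)) as response:
        return response.read().decode("utf-8")


# Only the URL is built here; fetch_raml is not called.
print(raml_url("http://marathon:8080", "apps"))
```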

The route /v2/schemas has the following deprecation schedule:

  • 1.6.x - /v2/schemas will continue to function as normal.
  • 1.7.x - The API will stop responding to /v2/schemas; requests to it will be met with a 404 response. The route can
    be re-enabled with the command-line argument --deprecated_features=json_schemas_resource.
  • 1.8.x - /v2/schemas is scheduled to be completely removed. If --deprecated_features=json_schemas_resource is
    still specified, Marathon will refuse to launch, with an error.

/v2/events

The default response format of /v2/events is deprecated. In the first 1.7.x release, the default will switch to the format currently returned by /v2/events?plan-format=light. The following deprecation schedule is planned for this endpoint:

  • 1.6.x - /v2/events will continue to function as normal.
  • 1.7.x - The default /v2/events format will be switched to "light". You will still have the ability to use the command-line argument --deprecated_features=api_heavy_events to re-enable the heavy event response.
  • 1.8.x - The /v2/events format will be permanently switched to "light". If --deprecated_features=api_heavy_events is still specified, Marathon will refuse to launch, with an error.

Deprecation Details

The "lightweight" plan format can already be seen by using the ?plan-format=light argument. In summary, this format drops the following fields from the deployment-related events in the event stream accessed via /v2/events:

  • plan.original - The current state of the root group
  • plan.target - The target state of the root group
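Consumers can check whether they depend on the dropped fields by stripping them from a captured event and running their handlers against the result. The Python sketch below does this. The sample payload is a simplified, hypothetical event rather than a real Marathon response, and the function name is an assumption.

```python
import copy
from typing import Any, Dict

# Fields removed from "plan" in the light format, per the list above.
DROPPED_PLAN_FIELDS = ("original", "target")


def to_light_plan_format(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the event without plan.original and plan.target."""
    light = copy.deepcopy(event)
    plan = light.get("plan")
    if isinstance(plan, dict):
        for field in DROPPED_PLAN_FIELDS:
            plan.pop(field, None)
    return light


# Simplified, hypothetical event for illustration only.
heavy_event = {
    "eventType": "deployment_info",
    "plan": {
        "id": "example-plan-id",
        "original": {"id": "/", "apps": []},
        "target": {"id": "/", "apps": []},
        "steps": [],
    },
}
light_event = to_light_plan_format(heavy_event)
print(sorted(light_event["plan"]))
```

The input event is deep-copied, so the original stays available for side-by-side comparison.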

Fixed Issues

  • MARATHON-7568 - We now redact any Zookeeper credentials from the /v2/info response endpoint.
  • Updated version of Marathon UI to 1.3.1:
    • MARATHON-8255 - Marathon UI now properly shows fetch URLs in the edit dialog.
  • MARATHON-8124 Fix issue in which reservations lacking a persistent volume would not be destroyed.
  • MARATHON-7940 Fix connection-pool overflow issues with Marathon HTTP health checks by disabling connection pooling for them.
  • MARATHON-8136 Fix issues involving headers and URI filtering with Marathon HTTP health checks.
  • MARATHON-8083 Fix issue with datadog / graphite metric reporters in which several parameters were ignored.
  • MARATHON-8110 Fix issue in which Marathon would fail to accept offers for some resources from newer versions of Mesos.
  • MARATHON-2683 Deployments for run-specs with multiple health-checks now wait for all health checks to succeed.
  • MARATHON-8148 Pod last-failure-reason is now exposed via the API, as is done for apps.
  • MARATHON-8216 Mesos HTTP health checks for non-host networking mode with containerPort=0 now work.
  • MARATHON-8064 Fix migration issue when store caching is disabled.
  • MARATHON-8159 Fix migration issue which introduced erroneous taskKillGracePeriodSeconds values.
  • MARATHON-8304 Fix rare bug in which Marathon would become unresponsive while connecting to Mesos.
  • MARATHON-7568 Zookeeper credentials are now redacted from logs and the /v2/info response.
  • MARATHON-7390 Fix issue in which Marathon would become unresponsive for a long time if Zookeeper DNS cannot be resolved at launch.
  • MARATHON-8084 Fix issue in which POST /v2/apps/{app_id}/restart would not proxy properly.
  • MARATHON-8326 Pod instances with persistent volumes can now be destroyed.
  • MARATHON-8095 Fix issue in which PATCH HTTP requests were not properly proxied.
  • Fix an issue in which resident tasks sometimes wouldn't be restarted.