calculate http error_timeout based upon capacity option #145

jpittis · 2017-05-25T23:49:25Z

This PR proposes an intuitive way to configure :error_timeout for Semian HTTP configurations. The user sets a :capacity option as a fraction between 0 and 1 (for example 0.75 for 75%), and :error_timeout is calculated from the request's :read_timeout.

Reasoning

The following diagram assumes the circuit starts open and the requested endpoint is not recovering, so the worker alternates between the open and half-open states.

t=0        t=1         t=2        t=3         t=4        t=5
 |----------|-----------|----------|-----------|----------|
open       half        open       half        open       half

free       busy        free       busy        free       busy

Whenever the circuit is in an open state, the worker is able to do work for other resources. But when the worker is in a half open state, the worker cannot do other work because it's stuck hanging until the request times out.

We're calling the fraction of time the worker spends free, out of its total free plus busy time, the worker's :capacity.
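
As a rough illustration of the diagram, here is a minimal sketch that walks the open / half-open cycle and measures how much of the elapsed time the worker was free. The method name `measured_capacity` and its keyword arguments are hypothetical and not part of Semian. It assumes every half-open probe hangs for the full request timeout.

```ruby
# Hypothetical helper, not part of Semian: walk the open / half-open cycle
# from the diagram and measure the share of elapsed time the worker was free.
def measured_capacity(error_timeout:, request_timeout:, cycles: 3)
  free = 0.0
  busy = 0.0
  cycles.times do
    free += error_timeout    # circuit open: requests fail fast, worker is free
    busy += request_timeout  # circuit half open: the probe request hangs until it times out
  end
  free / (free + busy)
end

puts measured_capacity(error_timeout: 10, request_timeout: 10) # prints 0.5
puts measured_capacity(error_timeout: 30, request_timeout: 10) # prints 0.75
```

The number of cycles does not change the result, since each cycle contributes the same free and busy time.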

The High School Math

For Semian HTTP requests we can calculate capacity based on this equation:

capacity = error_timeout / (error_timeout + request_timeout)

In words, the capacity of a given worker is the fraction of its time that is not spent hanging on a single request.
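
Since the user supplies :capacity and the request timeout is known, the equation can be rearranged to solve for error_timeout. A sketch of the algebra, assuming 0 <= capacity < 1:

```math
\begin{aligned}
\text{capacity} \cdot (\text{error\_timeout} + \text{request\_timeout}) &= \text{error\_timeout} \\
\text{capacity} \cdot \text{request\_timeout} &= \text{error\_timeout} \cdot (1 - \text{capacity}) \\
\text{error\_timeout} &= \frac{\text{capacity} \cdot \text{request\_timeout}}{1 - \text{capacity}}
\end{aligned}
```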

Examples

A :capacity of 0.5 would set the :error_timeout to whatever the request timeout is.
As :capacity approaches 1, :error_timeout approaches infinity.
With a :capacity of 0.75 and a 60 second request timeout, the :error_timeout would be 180 seconds.
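
The examples above can be checked with a short sketch. The method `error_timeout_for` is a hypothetical name used for illustration, not Semian's API or necessarily how this PR implements the calculation. It solves the capacity equation for error_timeout and, as an assumption of this sketch, rejects capacities outside the open interval (0, 1).

```ruby
# Hypothetical helper, not Semian's API: solve
#   capacity = error_timeout / (error_timeout + request_timeout)
# for error_timeout, which gives
#   error_timeout = capacity * request_timeout / (1 - capacity)
def error_timeout_for(capacity:, request_timeout:)
  # Assumption: a capacity of 1 or more has no finite solution, and a
  # capacity of 0 or less is not a meaningful setting.
  unless capacity > 0 && capacity < 1
    raise ArgumentError, "capacity must be between 0 and 1 (exclusive)"
  end

  capacity * request_timeout / (1.0 - capacity)
end

error_timeout_for(capacity: 0.5, request_timeout: 60)  # => 60.0
error_timeout_for(capacity: 0.75, request_timeout: 60) # => 180.0
```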

Isn't this capacity stuff meant to be handled by bulkheads?

This PR addresses the capacity of a lone worker and doesn't require shared state between workers.
Bulkheads require a semaphore per resource which is expensive when dealing with a large number of resources. (For example a large number of HTTP requests.)

Concerns

This idea should be verified by a number of trained experts in high school math.
Just because the idea makes sense does not mean it's worth adding to Semian.
Do we care that the default :read_timeout being 60 seconds will lead to values of :error_timeout greater than a minute when :capacity > 0.5?

jpittis force-pushed the http-capacity branch from 8b7a552 to 8c1c14f Compare May 25, 2017 23:51

calculate http error_timeout based upon capacity option

9bed593

jpittis force-pushed the http-capacity branch from 8c1c14f to 9bed593 Compare May 26, 2017 15:52

sirupsen mentioned this pull request Jul 30, 2018

circuit: introduce half_open_resource_timeout #188

Merged

miry force-pushed the master branch from 98d8601 to 57d2e0d Compare February 8, 2023 10:19
