
Getting started

Ulf Wiger edited this page Jun 16, 2013 · 3 revisions

An obvious problem with load regulation is that its importance tends to manifest itself late in the project - once you've written most of your software and are ready to subject the mostly-finished system to greater loads. Even worse, your customers may be the ones making you aware of the issue!

For technical details on how to get started, check out the Examples.

Here are some guidelines for getting started with load regulation early.

Identify your job types

As a rule, all significant jobs should be regulated. If you only regulate the high-priority work, chances are that you will see priority inversion during high load, as the high-priority work is slowed by the load regulator while the low-priority work proceeds unhindered.

Start by naming your job categories, and creating a queue for each. If you have difficulty deciding on an appropriate rate for each queue, you can choose to either:

  • Set a deliberately low rate, well below what you think you can handle. Remember that different types of jobs combine to drive up the load, and 'pile-ups' may well generate much nastier load problems than the sum of the simultaneously active work suggests. It's good to have a safety margin.

  • Set the queue to {action, approve} to begin with. This is the gutsy option, where no regulation at all takes place. Be prepared to tune it later in your project.
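With the jobs API, the two options above could be sketched like this (the queue names are made up for illustration; `{standard_rate, R}` and `{action, approve}` are documented queue options):

```erlang
%% Option 1: a deliberately conservative rate, well below capacity.
jobs:add_queue(media_request, [{standard_rate, 50}]),

%% Option 2: the gutsy option - approve everything for now, tune later.
jobs:add_queue(logging, [{action, approve}]).
```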

Regulate as early as possible

Regulation should happen as close to the source of the traffic as possible. If your load comes in e.g. via a web server, a good place to regulate is right after you've parsed the request and identified the type of work required. It's better to reject a request before you've invested system capacity in it than to accept it and then fail to deliver a reply.
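A sketch of this pattern, using `jobs:ask/1` and `jobs:done/1` from the jobs API; `parse/1`, `classify/1`, `do_work/1` and `reply_overload/1` are hypothetical helpers standing in for your own code:

```erlang
handle_request(RawRequest) ->
    Req = parse(RawRequest),        %% cheap: just enough to identify the work
    QueueName = classify(Req),      %% e.g. media_request | search | login
    case jobs:ask(QueueName) of
        {ok, JobRef} ->
            %% Approved: now invest the real capacity.
            try do_work(Req)
            after jobs:done(JobRef)
            end;
        {error, rejected} ->
            %% Rejected before any expensive work was done.
            reply_overload(Req)
    end.
```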

Regulate only once

Ideally, you want to commit to a task once, and then carry it through with minimal obstruction. That is, don't impose load regulation on core components, such as database libraries or complex calculations - such components usually lack sufficient information about context to make suitable regulation decisions. They do, however, need to behave well in terms of general real-time requirements, but - since Jobs is an Erlang-based component - following basic Erlang design principles should take care of this.

Sometimes, there are exceptions. One example that comes to mind is authentication of web requests. If a session has already been established, authentication can be near-instantaneous, but if a password needs to be verified, you should be using something like bcrypt to ensure that your stored password hashes, if stolen, can't be too easily cracked. This will be a heavy operation, and the cost of load regulation will be insignificant by comparison. In this case, I'd use a special queue for 'login', allow a very conservative rate, and treat it as a separate job.
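Such a 'login' queue might be declared along these lines (the rate and timeout are made-up figures; `{max_time, Ms}` is a documented queue option):

```erlang
%% A separate, conservatively rated queue for password verification,
%% since bcrypt checks are deliberately expensive.
jobs:add_queue(login, [{standard_rate, 5},     %% at most 5 logins/sec
                       {max_time, 10000}]).    %% reject after 10 s in queue
```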

Regulate between nodes (?)

This may or may not be necessary, depending on how much of the distributed workload you can factor into the original regulation decision. If your work starts on one node, and another node is later selected for further work, it is probably wise to allow that node to reject such requests if it has to. Remember that resources have already been invested in the job, so an internal request from a neighboring node should get higher priority than a fresh request from the outside.
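One way to sketch this with jobs: give the receiving node its own queue for internal hand-overs, typically rated more generously than fresh external traffic, and let it reject when it must. `continue_job/1` is a hypothetical helper:

```erlang
%% On the receiving node:
jobs:add_queue(internal_work, [{standard_rate, 200}]),

%% When a neighboring node hands over a partially processed job:
handle_internal(Job) ->
    case jobs:ask(internal_work) of
        {ok, Ref} ->
            try continue_job(Job) after jobs:done(Ref) end;
        {error, rejected} ->
            {error, overload}    %% caller may retry on another node
    end.
```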

When to reject

This may differ depending on your domain, but in general there are rejectable and non-rejectable requests. For example, a request to view a video clip would be rejectable, whereas a request to stop viewing it is non-rejectable (at least, it doesn't make sense to reject it, as it likely costs more to do so - and keep showing the clip - than to serve the request). Basically, a request that allocates a resource should be rejectable, whereas one that frees a resource usually isn't.

Jobs provides a few different ways to reject requests. A queue can be given either a maximum time limit, a maximum length, or both. It is also possible to define a queue as {action, reject}, which leads to blanket rejection of every request.
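The three rejection mechanisms, sketched as queue declarations (`{max_time, Ms}`, `{max_size, N}` and `{action, reject}` are documented queue options; names and figures are illustrative):

```erlang
%% Reject jobs that have waited more than 5 seconds:
jobs:add_queue(q_timed, [{standard_rate, 100}, {max_time, 5000}]),

%% Reject when more than 1000 jobs are already waiting:
jobs:add_queue(q_bounded, [{standard_rate, 100}, {max_size, 1000}]),

%% Blanket rejection of every request - useful e.g. as a kill switch:
jobs:add_queue(q_closed, [{action, reject}]).
```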