Cloud Controller Background Jobs (Delayed Jobs)

Tim Downey edited this page Dec 7, 2018 · 2 revisions

Background Jobs in the Cloud Controller

Overview

Cloud Controller uses the DelayedJob framework (along with the DelayedJob Sequel adapter) to schedule work to be run asynchronously by background workers. Jobs are essentially serialized Ruby objects, each encapsulating some task to perform, that are enqueued into the delayed_jobs table within Cloud Controller's database (ccdb). The DelayedJob workers read from the delayed_jobs table and accept a job, which is then deserialized and performed. Additionally, a job can specify the queue it belongs in, which can affect its priority and which workers choose to work on it.
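As a rough sketch of the mechanics (the class name and payload here are illustrative, not Cloud Controller's actual job classes): DelayedJob serializes the job object as YAML into the handler column of the delayed_jobs table, and a worker later rehydrates the object and calls its perform method.

```ruby
require 'yaml'

# Minimal illustration of the enqueue/work-off cycle: the job object is
# serialized (DelayedJob uses YAML) when enqueued, then deserialized and
# #perform-ed by a worker process. ExampleJob is a made-up class.
class ExampleJob
  attr_reader :resource_guid

  def initialize(resource_guid)
    @resource_guid = resource_guid
  end

  def perform
    "deleting resource #{resource_guid}"
  end
end

# "Enqueue": serialize the job into a string a worker can read back.
handler = YAML.dump(ExampleJob.new('abc-123'))

# "Work off": rehydrate the job and perform it. Psych 4 (Ruby 3.1+)
# requires unsafe_load to deserialize arbitrary Ruby classes.
job = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(handler) : YAML.load(handler)
puts job.perform
```

This is why the worker's Ruby process must have a compatible definition of the job class loaded, a point the gotchas below return to.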

Types of Workers

Local Workers

The cloud_controller_ng API jobs rely on "local" workers to handle core cf push workflows that require access to the same filesystem as the API, such as package and droplet uploads. There are two local workers by default, and they work off of a special queue whose name contains the BOSH instance index, ensuring that the local workers only pick up work enqueued by the cloud_controller_ng API job they are colocated with. The local workers primarily transfer files (e.g. app packages, droplets) from the Cloud Controller's filesystem to the remote blobstore.
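The exact queue-naming scheme is an implementation detail of cloud_controller_ng; as a hedged sketch of the idea, a per-instance queue name could be derived from the colocated job's name and its BOSH instance index, so a worker on one VM never claims uploads that live on another VM's disk:

```ruby
# Hypothetical helper showing how a per-instance local queue name might be
# built from the API job name and BOSH instance index. The real naming
# scheme lives in cloud_controller_ng's configuration; this is illustrative.
def local_queue_name(job_name, bosh_index)
  "cc-#{job_name}-#{bosh_index}"
end

# A local worker colocated with API instance 0 would subscribe only to
# this queue, so it only sees work enqueued by that same instance.
puts local_queue_name('cloud_controller_ng', 0)
# => cc-cloud_controller_ng-0
```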

Generic Workers

A deployment of Cloud Foundry will come with one or more Cloud Controller Worker VMs running instances of the cloud_controller_worker job. These workers are "generic" because they are responsible for all of the other types of background jobs. Some examples are: asynchronous resource deletion, periodic nightly cleanup jobs, asynchronous service binding, applying server-side application manifests, and more.

DelayedJob Gotchas

Jobs are Serialized Ruby Classes

The delayed_job framework serializes Ruby classes into the database when enqueuing work. This means that whatever picks up the work needs to be able to deserialize (or rehydrate) these jobs back into functioning Ruby objects. Consequently, if the workers are running newer (or older) code than what a job was enqueued with, problems can arise. There are a couple of things we can do to mitigate these problems:

  1. Don't memoize or cache complex objects in instance variables within jobs themselves -- especially not within their constructors. E.g.

```ruby
class MyJob
  def initialize(logger, http_client)
    @logger = logger
    @http_client = http_client
  end

  def perform
    1 + 1
  end
end
```

The framework has trouble deserializing these because the objects that are stored in those instance variables don't exist in the Ruby interpreter that is rehydrating the job. Instead, have the job create new instances on the fly when it is executed.

```ruby
class MyJob
  def logger
    @logger ||= Logger.new
  end

  def http_client
    @http_client ||= HttpClient.new
  end

  def perform
    1 + 1
  end
end
```
  2. When renaming a class constant that might be referenced in a serialized job, leave the old name around as an alias. See this commit for an example of how ProcessModel and App referred to the same thing at one point.
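The aliasing trick can be sketched as follows. This is a simplified model, not the actual ProcessModel/App code: assigning the old constant to the new class lets Ruby resolve the old name found in previously serialized jobs.

```ruby
# After renaming App to ProcessModel (simplified stand-in class),
# jobs serialized before the rename still reference the constant "App".
class ProcessModel
  def perform
    'ok'
  end
end

# Keep the old name around as an alias so old serialized jobs
# can still be rehydrated into a working class.
App = ProcessModel

puts App.new.perform
```

Both constants refer to the same class object, so a worker deserializing an old job that names App ends up with a ProcessModel instance.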