
The Netdisco job queue is implemented using Perl's MCE (Many-Core Engine) distribution and a table in the PostgreSQL database. It was written to run on older PostgreSQL releases (8.4+), so it does not use modern features such as publish/subscribe or JSON.

Nevertheless, it works well. For example we can fully discover, macsuck, and arpnip a network of about 300 devices (four core routers and the rest edge routers, running VRF with many duplicate identities), 1200 subnets, and 50k hosts, with DNS resolution, in about three minutes.

The queue and worker system is also engineered to support multiple backend daemons (sharing one central PostgreSQL database), each perhaps with different configurations (e.g. reachable devices).
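
At its core this needs nothing more exotic than a database handle and a shared in-memory queue. As a minimal sketch in Perl (the DSN, credentials, and variable names are illustrative only, not Netdisco's real configuration):

    # the job queue is just a table in PostgreSQL polled over DBI,
    # plus an in-memory MCE::Queue shared with the forked worker processes
    use DBI;
    use MCE::Queue;

    # illustrative connection details, not Netdisco's real configuration
    my $dbh = DBI->connect('dbi:Pg:dbname=netdisco', 'netdisco', 'secret',
        { AutoCommit => 1, RaiseError => 1 });

    # the in-memory "feeder" queue described below
    my $feeder = MCE::Queue->new();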

Priority Items

The job queue has the notion of priority jobs, which are those:

  • submitted via the web interface, or

  • SNMP SET commands such as changing port state, name, device location, and so on.

Such jobs are always picked next and will be allocated to the next free worker. The set of commands deemed high priority is, of course, configurable.
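
As a sketch of the idea (the action names and job fields below are illustrative, not Netdisco's real configuration keys or schema), deciding a job's priority amounts to a simple set-membership test plus a check for web-submitted jobs:

    # illustrative action and field names only
    my %is_high_priority = map { $_ => 1 } qw(portcontrol portname location contact);

    sub job_priority {
        my $job = shift;   # hashref; 'userip' assumed set when the job came from the web
        return 'high' if $job->{userip} or $is_high_priority{ $job->{action} };
        return 'normal';
    }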

Keen-witted readers will note that this implies jobs triggered by the web interface are not run by the web interface (they are queued and run by a backend daemon); this is not always obvious to new users or the non-technical.

Feeder Queue

In the database the admin table is used to hold jobs (of all states: queued, in progress, and completed). Forgive us for the quite awful name of this table. Jobs are picked from this table according to rules (shown below) and copied into an in-memory "feeder" queue managed by MCE. At this point they are marked as reserved in the database by the backend host and will not be picked by another backend daemon.

Jobs are then picked from MCE’s queue and assigned to a worker. When priority items are added to MCE’s queue, they will jump to the front. In this way, although priority items cannot pre-empt a running job, they will at least get the next available worker.

The MCE feeder queue is kept the same size as the total number of workers, which ensures that workers always have a job ready to pick up without each one making a round trip to the database.
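
A rough sketch of the feeder behaviour, assuming MCE::Queue and illustrative variable names: normal jobs join the back of the queue, priority jobs are inserted at the front, and the manager only tops the queue up while it has fewer items than there are workers.

    use MCE::Queue;

    my $num_workers = 4;              # illustrative; the real default is twice the CPU count
    my $feeder      = MCE::Queue->new();

    sub feed_job {
        my ($job, $priority) = @_;
        if ($priority eq 'high') {
            $feeder->insert(0, $job); # priority items jump to the front of the queue
        }
        else {
            $feeder->enqueue($job);   # everything else joins the back
        }
    }

    # the manager only books more jobs out of the database when the feeder has room,
    # keeping the queue roughly the same size as the worker pool
    sub feeder_has_room {
        return $feeder->pending() < $num_workers;
    }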

Selecting from the Database

The SQL to select from the database implements prioritisation as best we can manage, given the legacy database schema and a reliance on hints and heuristics. Surprisingly, it works well. This is in part due to the expire job, which prunes entries from the database older than two weeks (by default) to keep searches performant, even on large networks.

The TastyJobs database query is responsible for picking jobs, up to the number of available workers. Some of these jobs may be duplicates, or already running by the time the backend attempts to reserve them; race conditions abound in this process.

Results are pruned to those jobs available to this backend (see skip hints), then re-sorted by apparent priority (see above), number of failures to connect (to prefer well-behaved devices), and devices advertising LLDP identifiers (to help deduplication of jobs), before being randomised and picked.
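
The shape of that selection might look something like the following. Apart from the admin and device_skip table names, every column, action name, and ordering term here is a stand-in for illustration (and the real TastyJobs query also weighs LLDP neighbour information), so this is not the actual SQL:

    # rough shape of the selection only; column names are assumptions
    my $sql = q{
        SELECT j.*
          FROM admin j
          LEFT JOIN device_skip s ON s.backend = ? AND s.device = j.device
         WHERE j.status = 'queued'
         ORDER BY
           CASE WHEN j.action IN ('portcontrol','portname') THEN 0 ELSE 1 END,
           COALESCE(s.deferrals, 0) ASC,
           random()
         LIMIT ?
    };

    my $jobs = $dbh->selectall_arrayref($sql, { Slice => {} },
        $backend_hostname, $available_workers);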

Skip Hints

A separate table, device_skip, tracks connection failures to devices and also imports any of the *_no configuration settings which apply on this backend node. This allows the SQL to prioritise well-behaved (healthy, responsive) devices and to skip over those configured to be unreachable.

Devices being skipped due to connection failures will be retried once when the backend daemon is restarted, or otherwise once a week has passed since they were last marked to be skipped.
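
One way to express that weekly retry, as a sketch only (the deferrals and last_defer column names are assumptions based on the description above):

    # clear the failure count for skips that are more than a week old,
    # making those devices eligible for selection again
    $dbh->do(q{
        UPDATE device_skip
           SET deferrals = 0
         WHERE backend = ?
           AND deferrals > 0
           AND last_defer < (now() - interval '7 days')
    }, undef, $backend_hostname);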

Action on Restart

When the backend daemon restarts it does two things:

  1. decrements skip count for devices with connection failures (to allow retry)

  2. restarts any jobs that were booked out or already running when this daemon stopped

The daemon’s manager process then enters a loop of looking for jobs in the database and booking them out, once per second.
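
A sketch of both restart steps and of the manager's polling loop; the status format, column names, and helper routine are illustrative, not the real code:

    sub on_restart {
        # 1. give devices with connection failures another chance
        $dbh->do(q{UPDATE device_skip SET deferrals = deferrals - 1
                    WHERE backend = ? AND deferrals > 0},
            undef, $backend_hostname);

        # 2. put back any jobs this backend had booked out when it stopped
        $dbh->do(q{UPDATE admin SET status = 'queued'
                    WHERE status = ('queued-' || ?)},
            undef, $backend_hostname);
    }

    # the manager process then polls the database for new work once per second
    while (1) {
        fill_feeder_queue();   # hypothetical helper: run the selection and book jobs out
        sleep 1;
    }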

Scheduling Jobs

The backend daemon will start a scheduler to automatically queue discover, arpnip, macsuck, nbtstat, and expire jobs periodically. This is all automatic and requires no configuration, but can be overridden to change the schedule, change the jobs, or disable the scheduler altogether.

The API is deliberately open: you can develop backend workers to do anything, and give them a job schedule just like discover and the other built-in actions.

Most jobs executed at the CLI using netdisco-do are run immediately, so make no use of the job queue or backend daemon. The exceptions are *walk jobs, which are always queued, and any job where the user specifies the --enqueue option.
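
For example (the device address is just a placeholder), the first command below runs a discover immediately within the netdisco-do process, while the second only queues it for a backend daemon to pick up:

~/bin/netdisco-do discover -d 192.0.2.1

~/bin/netdisco-do discover -d 192.0.2.1 --enqueue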

Implementation

The backend daemon has three core processes, plus as many worker processes as are configured (default: twice the number of CPU cores, since SNMP work is network I/O bound rather than CPU bound). The core processes are:

  • a scheduler which inserts jobs to the database according to configuration;

  • a manager which picks jobs from SQL and puts them into the feeder queue;

  • a "chief worker" which makes sure workers are started and restarted.

Note that this is using process forking, not threads. Also note that worker processes only run one job, and then die and restart, as some of our upstream libraries have terrible memory leaks or mess about with global variables (looking at you, net-snmp). This brutal restarting of workers turns out not to be impactful, and has the added bonus of making logs easier to read due to worker process IDs changing for each job.
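
Schematically, the one-job-per-process policy looks like the following (an illustration of the pattern only, not Netdisco's actual worker code):

    # a child is forked per job and a fresh one is spawned when it exits
    sub run_one_job_in_child {
        my $job = shift;
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            do_the_job($job);   # hypothetical stand-in for the real worker code
            exit 0;             # the child exits, taking any leaked memory with it
        }

        waitpid($pid, 0);       # the parent reaps the child, then hands out the next job
    }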

It is also worth noting that the same database table (admin) is used for jobs through their whole lifetime: queued, picked by a backend, running, and completed (whether successfully or not). We use a hint in the status field to show that a job is picked, by postfixing the status with the backend hostname. We fill the started field once the job is actually running.
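
In other words, the lifecycle is driven by a couple of simple updates; the job column, status format, and timestamp shown here are assumptions for illustration, not the exact schema:

    # booking a job out for this backend...
    $dbh->do(q{UPDATE admin SET status = ('queued-' || ?)
                WHERE job = ? AND status = 'queued'},
        undef, $backend_hostname, $job_id);

    # ...and later marking it as actually running
    $dbh->do(q{UPDATE admin SET started = now() WHERE job = ?},
        undef, $job_id);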

Testing and Developing

You can run the backend daemon in the foreground, show all SQL calls, and limit it to a single worker process, like so:

ND2_SINGLE_WORKER=1 DBIC_TRACE=1 ~/bin/netdisco-backend-fg

Configuration Options

Almost anything you might wish to tune or customise in this whole process has a setting available in your deployment.yml, such as:

  • number of parallel tasks (workers)

  • set of job types considered "high priority"

  • interval between polling database for jobs

  • time to wait between killing and respawning workers

  • number of failed connections after which a device is skipped

  • time after which skipped devices are retried

  • schedule for automatically queuing jobs (like cron)

  • whether to run the scheduler on this host

  • name of the backend host

  • set of devices accessible to this backend host

  • time after which a job is aborted (global default or per job type)

  • time after which entries in the database queue are purged

  • time after which jobs in the queue are assumed stale and can be duplicated