
tX-Enqueue-Job

This is part of the tX translationConverter platform; jobs are initiated by a request to the tX (Translation Converter Service) at tx.org.

See here for a diagram of the overall flow of the tx (translationConverter) platform.

That is more up-to-date than the write-up of the previous platform here (which was too dependent on expensive AWS lambda functions).

tX modifications (from door43-enqueue-job)

Modified September 2018 by RJH to handle a different payload; it also listens on a different port so that the two enqueue services can run simultaneously.

See the Makefile for a list of environment variables which are looked for.

Requires:
    Linux
    Python 3.6 (or later)

To set up:
    python3 -m venv myVenv
    source myVenv/bin/activate
    make dependencies

Set environment variables:
    (see tXenqueue/Makefile for a list of expected and optional environment variables)
    e.g., export QUEUE_PREFIX="dev-"

To try Python code in Flask:
    make runFlask
    (then data can be posted to http://127.0.0.1:5000/,
        though enqueuing will fail if there is no redis instance running)

To run (using Flask, gunicorn, and nginx, plus redis) in three docker containers:
    make composeEnqueueRedis
    (then send json payload data to http://127.0.0.1:8090/)

However, if Door43-Enqueue-Job is already running locally (with a local redis instance),
to run (using Flask, gunicorn, and nginx, sharing that redis) in two docker containers:
    make composeEnqueue
    (then send json payload data to http://127.0.0.1:8090/)

To build a docker image:
    (requires environment variable DOCKER_USERNAME to be set)
    make image

To push the image to docker hub:
    docker login -u <dockerUsername>
    make pushImage

Basically, this small program collects the json payload from the tX (Translation Converter Service), which connects to the / URL.

This enqueue process checks various fields for simple validation of the payload, then puts the job onto an (rq) queue (stored in redis) to be processed.
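
A minimal sketch of that flow, assuming illustrative names (the real implementation is in tXenqueue/tx_enqueue_main.py; the queue name and validated fields shown here are placeholders):

    # Hypothetical sketch of the enqueue endpoint -- field and queue names are
    # illustrative, not the actual tx_enqueue_main.py implementation.
    import os

    import redis
    from flask import Flask, jsonify, request
    from rq import Queue

    app = Flask(__name__)
    redis_conn = redis.Redis(host=os.getenv('REDIS_HOSTNAME', 'redis'))
    # QUEUE_PREFIX (e.g. "dev-") selects the dev or production queue
    queue_name = os.getenv('QUEUE_PREFIX', '') + 'tx_job_handler'  # assumed name

    @app.route('/', methods=['POST'])
    def enqueue_job():
        payload = request.get_json(silent=True)  # None if body isn't valid json
        if payload is None:
            return jsonify({'error': 'No json payload'}), 400
        # Simple validation: reject payloads missing expected fields
        for field in ('job_id', 'resource_type'):  # illustrative field names
            if field not in payload:
                return jsonify({'error': f'Missing {field} field'}), 400
        # Hand the job to the worker, which must expose a webhook.job function
        Queue(queue_name, connection=redis_conn).enqueue('webhook.job', payload)
        return jsonify({'success': True, 'status': 'queued'}), 200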

The Python code is run in Flask, which is served by Green Unicorn (gunicorn), with nginx facing the outside world.

Testing

Use make composeEnqueueRedis or make composeEnqueue as above. The tx_job_handler also needs to be running. Use a command like:
    curl -v http://127.0.0.1:8090/ -d @<path-to>/payload.json --header "Content-Type: application/json" --header "X-Gitea-Event: push"
to queue a job; if successful, you should receive a JSON response.
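
The same request can be scripted in Python (a sketch; the payload fields below are placeholders, not the documented schema -- see tx_enqueue_main.py for what is actually validated):

    # Post a test job payload to the local enqueue service.
    import requests

    payload = {
        'job_id': 'test-job-1',     # illustrative field
        'resource_type': 'obs',     # illustrative field
    }
    response = requests.post(
        'http://127.0.0.1:8090/',
        json=payload,  # sets the Content-Type: application/json header
        headers={'X-Gitea-Event': 'push'},
    )
    print(response.status_code, response.json())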

Deployment

Travis-CI is hooked into GitHub to automatically test commits to both the develop and master branches and, on success, to build containers (tagged with those branch names) that are pushed to DockerHub.

To fetch the container use something like:
    docker pull --all-tags unfoldingword/tx_enqueue_job
or
    docker pull unfoldingword/tx_enqueue_job:develop

To view downloaded images and their tags:
    docker images

To test the container use:
    docker run --env QUEUE_PREFIX="dev-" --env FLASK_ENV="development" --env REDIS_HOSTNAME=<redis_hostname> --net="host" --name dev-tx_enqueue_job --rm unfoldingword/tx_enqueue_job:develop

or alternatively, also including the optional Graphite URL:
    docker run --env QUEUE_PREFIX="dev-" --env FLASK_ENV="development" --env REDIS_HOSTNAME=<redis_hostname> --env GRAPHITE_HOSTNAME=<graphite_hostname> --net="host" --name dev-tx_enqueue_job --rm unfoldingword/tx_enqueue_job:develop

NOTE: --rm automatically removes the container from the docker daemon when it exits
            (it doesn't delete the pulled image from disk)

To run the container in production, use (with the desired values):
    docker run --env GRAPHITE_HOSTNAME=<graphite_hostname> --env REDIS_HOSTNAME=<redis_hostname> --net="host" --name tx_enqueue_job --detach --rm unfoldingword/tx_enqueue_job:master

Running containers can be viewed with (or append --all to see all containers):
    docker ps

You can connect to a shell inside the container with commands like:
	# Gives a shell on the running container -- Note: no bash shell available
	docker exec -it `docker inspect --format="{{.Id}}" tx_enqueue_job` sh
	docker exec -it `docker inspect --format="{{.Id}}" dev-tx_enqueue_job` sh

The container can be stopped with a command like:
    docker stop dev-tx_enqueue_job
or using the container id shown by docker ps:
    docker stop <container_id>

The production container will be deployed to the unfoldingWord AWS EC2 instance, where Watchtower will automatically check for, pull, and run updated containers.

Further processing

The next part of the tx workflow can be found in the tx-job-handler repo. The job handler contains webhook.py (see below), which is given jobs that have been removed from the queue and processes them -- adding them to a failed queue if they raise an exception or time out. Note that the queue name here in tx_enqueue_main.py must match the one in the job handler rq_settings.py.
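
As an illustration of that pairing (the queue name shown is assumed; check both repos for the real one):

    # Illustrative only -- the queue name must be identical on both sides.
    QUEUE_NAME = 'tx_job_handler'  # assumed name

    # tx_enqueue_main.py (this repo) pushes jobs onto the queue:
    #     Queue(QUEUE_NAME, connection=redis_conn).enqueue('webhook.job', payload)

    # rq_settings.py (tx-job-handler repo) tells the rq worker where to listen:
    QUEUES = [QUEUE_NAME]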

The following is the initial (forked) README

Webhook job queue

The webhook job queue is designed to receive notifications from services and store the JSON as a dictionary in a python-rq redis queue. It is designed to work with a Webhook Relay that validates and relays webhooks from known services such as DockerHub, Docker registries, GitHub, and GitLab. However, this is not required and the receiver may listen directly to these services.

A worker must be spawned separately to read from the queue and perform tasks in response to the event. The worker must have a function named webhook.job.

Running

The job queue requires docker-compose and, in its simplest form, can be invoked with docker-compose up. By default, it will bind to localhost:8080 but allow clients from all IP addresses. This may appear odd, but on MacOS and Windows, traffic to the containers will appear as though it's coming from the gateway of the network created by Docker's linux virtualization.

In a production environment without the networking restrictions imposed by MacOS/Windows, you might elect to provide different defaults through the shell environment, e.g.

ALLOWED_IPS=A.B.C.D LISTEN_IP=0.0.0.0 docker-compose up

where A.B.C.D is an IP address (or CIDR range) from which your webhooks will be sent.

A worker must be spawned to perform tasks by removing the notification data from the redis queue. The redis keystore is configured to listen only to clients on the localhost.

Example worker

To run jobs using the webhooks as input:

  1. Create a file named webhook.py
  2. Define a function within named job that takes a dict as its lone argument
  3. Install python-rq
    • e.g. pip install rq
  4. Run rq worker from within that directory
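
A minimal webhook.py along those lines might look like (illustrative only):

    # webhook.py -- example worker module; the rq worker imports this and
    # calls webhook.job for each payload it pulls off the queue.
    def job(payload: dict) -> None:
        # Replace this with real processing of the notification data
        print(f"Processing webhook payload with keys: {sorted(payload)}")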

See the CVMFS-to-Docker converter for a real world example.