
Operations & deployment info

Amy Glen edited this page Sep 26, 2023 · 9 revisions

Scope

This is a draft operations page for the ARAX system. It is not complete. We are working on filling it with instructions and procedures.

SOPs

How to check if any services in the docker container have crashed

  1. SSH into the ARAX server: ssh <user>@<arax server name>
  2. Get into the Docker container: sudo docker exec -ti rtx1 bash
  3. Look at which services are running: service --status-all. This should return a list similar to the following, where [ + ] marks a running service and [ - ] a stopped one:
 [ + ]  RTX_Complete
 [ + ]  RTX_OpenAPI_beta
 [ + ]  RTX_OpenAPI_devED
 [ + ]  RTX_OpenAPI_devLM
 [ - ]  RTX_OpenAPI_dili
 [ + ]  RTX_OpenAPI_kg2
 [ - ]  RTX_OpenAPI_legacy
 [ - ]  RTX_OpenAPI_mvp
 [ - ]  RTX_OpenAPI_production
 [ + ]  RTX_OpenAPI_test
 [ + ]  apache-htcacheclean
 [ + ]  apache2
 [ - ]  apparmor
 [ - ]  bootmisc.sh
 [ - ]  checkfs.sh
 [ - ]  checkroot-bootclean.sh
 [ - ]  checkroot.sh
 [ - ]  cron
 [ - ]  dbus
 [ - ]  hostname.sh
 [ ? ]  hwclock.sh
 [ - ]  killprocs
 [ - ]  mountall-bootclean.sh
 [ - ]  mountall.sh
 [ - ]  mountdevsubfs.sh
 [ - ]  mountkernfs.sh
 [ - ]  mountnfs-bootclean.sh
 [ - ]  mountnfs.sh
 [ + ]  mysql
 [ + ]  neo4j
 [ ? ]  networking
 [ - ]  nginx
 [ ? ]  ondemand
 [ - ]  procps
 [ - ]  rc.local
 [ - ]  rsync
 [ - ]  sendsigs
 [ - ]  umountfs
 [ - ]  umountnfs.sh
 [ - ]  umountroot
 [ - ]  unattended-upgrades
 [ - ]  urandom
 [ - ]  x11-common
  4. The services that need to be running for production are apache2, mysql, apache-htcacheclean, RTX_Complete, and RTX_OpenAPI_production.
  5. In this case RTX_OpenAPI_production is not running. To start it, run service RTX_OpenAPI_production start. This should print the following if all goes well:
 * Starting system RTX_OpenAPI_production daemon                         [ OK ]
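
The check in steps 3 to 5 can be scripted. The following is a minimal sketch, not part of the deployed tooling: the function name missing_services and the variable REQUIRED are hypothetical, and the list of required services is copied from step 4 above.

```shell
#!/bin/sh
# Required production services, copied from step 4 of the SOP.
REQUIRED="apache2 mysql apache-htcacheclean RTX_Complete RTX_OpenAPI_production"

# Reads `service --status-all` output on stdin and prints every required
# service that is not marked "[ + ]" (stopped, unknown, or absent).
# The name missing_services is hypothetical.
missing_services() {
    status=$(cat)
    for svc in $REQUIRED; do
        # Match a whole line of the form " [ + ]  <name>".
        if ! printf '%s\n' "$status" | grep -Eq "^ *\[ \+ \] +${svc}\$"; then
            printf '%s\n' "$svc"
        fi
    done
}

# Usage inside the rtx1 container:
#   service --status-all 2>&1 | missing_services
```

With the sample listing above as input, this would print RTX_OpenAPI_production, the one required service not marked as running.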

But what if the whole container has gone down?

  1. Check the list of containers: sudo docker ps -a

  2. (a) If the container rtx1 is running but is not responding, restart it with sudo docker restart rtx1

    (b) Otherwise, if it is stopped, start it with sudo docker start rtx1

  3. Get into the Docker container: sudo docker exec -ti rtx1 bash

  4. Start all of the commonly used services:

service apache2 start
service apache-htcacheclean start
service mysql start
service RTX_Complete start
service RTX_OpenAPI_production start
service RTX_OpenAPI_kg2 start
service RTX_OpenAPI_beta start
service RTX_OpenAPI_kg2beta start
service RTX_OpenAPI_test start
service RTX_OpenAPI_devED start
service RTX_OpenAPI_devLM start
  5. Wait a few seconds and double-check that the site is running at arax.ncats.io
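
The decision in step 2 and the service list in step 4 can be sketched as two small shell functions. This is an illustration under assumptions, not an existing script: container_action, start_all, SERVICES, and the overridable DOCKER variable are hypothetical names. container_action runs on the host, and it only tells you which command the SOP calls for once you already know the site is not responding. start_all runs inside the container.

```shell
#!/bin/sh
# DOCKER can be overridden for a dry run; "sudo docker" matches the SOP.
DOCKER="${DOCKER:-sudo docker}"

# Host side: map the container state to the SOP action for an unresponsive
# site. Prints "restart" (running but hung), "start" (stopped), or "unknown".
container_action() {
    state=$($DOCKER inspect -f '{{.State.Status}}' "$1" 2>/dev/null)
    case "$state" in
        running)        echo restart ;;
        exited|created) echo start ;;
        *)              echo unknown ;;
    esac
}

# Commonly used services, in the order listed in step 4.
SERVICES="apache2 apache-htcacheclean mysql RTX_Complete RTX_OpenAPI_production RTX_OpenAPI_kg2 RTX_OpenAPI_beta RTX_OpenAPI_kg2beta RTX_OpenAPI_test RTX_OpenAPI_devED RTX_OpenAPI_devLM"

# Container side: start every service, report failures, keep going.
start_all() {
    failed=0
    for svc in $SERVICES; do
        service "$svc" start || { echo "could not start $svc" >&2; failed=1; }
    done
    return "$failed"
}

# Usage on the host:        container_action rtx1
# Usage in the container:   start_all
```

Continuing past a failed service is a deliberate choice here, so that one broken development endpoint does not keep production from coming up.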

What to do when NCATS restarts the arax.ncats.io instance

  1. establish a remote terminal session on the instance: ssh username@arax.ncats.io. You need to know your Linux username on arax.ncats.io, which may differ from the one you use on your home institution's systems or on a dev system. The remaining steps assume you are running commands in the bash shell of the host OS on arax.ncats.io.
  2. start the rtx1 Docker container: sudo docker start rtx1
  3. start mysql inside the container: sudo docker exec rtx1 service mysql start
  4. start the "autocomplete" service inside the container: sudo docker exec -it rtx1 service RTX_Complete start
  5. start the RTX-KG2 API service inside the container: sudo docker exec rtx1 service RTX_OpenAPI_kg2 start
  6. (for any other KG2 API endpoints, e.g., kg2NewFmt, run the same command with that endpoint name substituted for kg2)
  7. start the production ARAX API inside the container: sudo docker exec rtx1 service RTX_OpenAPI_production start
  8. (for any other ARAX API endpoints like "beta" or "devED", do the same as above but substituting the other endpoint name instead of "production")
  • devED
  • test
  • beta
  • devLM
  • NewFmt
  9. start apache2 inside the container: sudo docker exec rtx1 service apache2 start
  10. point your browser at https://arax.ncats.io and run a test query. Also test out the autocompleter.
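
The host-side sequence above can be sketched as a script that preserves the order of the steps and stops at the first failure. The names bring_up, wait_for_site, ORDERED, DOCKER, and POLL_SECONDS are hypothetical, and the list covers only the kg2 and production endpoints, so any other endpoints you need would have to be added to ORDERED before apache2. The HTTP check is a rough stand-in for the final step and does not replace running a test query or trying the autocompleter.

```shell
#!/bin/sh
# DOCKER can be overridden for a dry run; "sudo docker" matches the steps above.
DOCKER="${DOCKER:-sudo docker}"

# Order follows the steps: mysql, autocomplete, KG2 API, ARAX API, apache2.
ORDERED="mysql RTX_Complete RTX_OpenAPI_kg2 RTX_OpenAPI_production apache2"

# Start the container, then each service in order; stop at the first failure.
bring_up() {
    $DOCKER start rtx1 || return 1
    for svc in $ORDERED; do
        $DOCKER exec rtx1 service "$svc" start || return 1
    done
}

# Poll the site until it answers with HTTP 200 or the attempts run out.
# Arguments: URL (default https://arax.ncats.io), number of attempts (default 10).
wait_for_site() {
    url="${1:-https://arax.ncats.io}"
    tries="${2:-10}"
    while [ "$tries" -gt 0 ]; do
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 20 "$url")
        [ "$code" = "200" ] && return 0
        tries=$((tries - 1))
        sleep "${POLL_SECONDS:-5}"
    done
    return 1
}

# Usage on the host:
#   bring_up && wait_for_site
```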

How to fix arax.ncats.io when it's hanging

Log into the arax.ncats.io instance:

ssh username@arax.ncats.io

Enter the rtx1 Docker container:

sudo docker exec -ti rtx1 bash

Kill all Python processes. Because every RTX service runs on Python, this stops all of them, so they must be restarted afterwards:

killall python3

Then to restart, run:

service RTX_OpenAPI_production start
service RTX_OpenAPI_devED start
service RTX_OpenAPI_kg2 start
service RTX_Complete start
service RTX_OpenAPI_test start
service RTX_OpenAPI_beta start
service RTX_OpenAPI_devLM start
service RTX_OpenAPI_kg2NewFmt start
service RTX_OpenAPI_NewFmt start

Note that the last two services are only relevant during the interim period where we are transitioning between TRAPI versions, and thus have separate ARAX and RTX-KG2 endpoints for the previous TRAPI version (1.1) and the new TRAPI version (1.2).
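
Since the set of endpoints changes over time, one option is to record which RTX services were running before the kill and restart only those. The sketch below assumes that service --status-all still reports the hung services as running; the function names running_rtx_services and restart_running_rtx are hypothetical.

```shell
#!/bin/sh
# Reads `service --status-all` output on stdin and prints the names of the
# RTX_* services marked "[ + ]".
running_rtx_services() {
    awk '$1 == "[" && $2 == "+" && $3 == "]" && $4 ~ /^RTX_/ { print $4 }'
}

# Snapshot the running RTX services, kill python3, then start the same set.
restart_running_rtx() {
    before=$(service --status-all 2>&1 | running_rtx_services)
    killall python3
    for svc in $before; do
        service "$svc" start
    done
}

# Usage inside the rtx1 container:
#   restart_running_rtx
```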

How to deploy code changes on arax.ncats.io

Deploying changes to the endpoint /foo on arax.ncats.io (e.g., /kg2beta) which is running branch currentbranch (e.g., master) involves (approximately) the following steps:

ssh arax.ncats.io
sudo docker exec -it rtx1 bash
su - rt
cd /mnt/data/orangeboard/foo/RTX
git status

Check that the only modifications to tracked files are in openapi.yaml, and then run:

git pull origin currentbranch
exit
service RTX_OpenAPI_foo restart
tail -f /tmp/RTX_OpenAPI_foo.elog
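
The manual check of the git status output can be automated with a small guard. This is a sketch under assumptions: the function name only_openapi_modified is hypothetical, and it relies on the short format printed by git status --porcelain, where each line carries two status columns, a space, and then the path.

```shell
#!/bin/sh
# Reads `git status --porcelain --untracked-files=no` output on stdin.
# Succeeds only if every changed tracked file is named openapi.yaml.
only_openapi_modified() {
    while IFS= read -r line; do
        path=${line#???}    # drop the two status columns and the space
        case "$path" in
            openapi.yaml|*/openapi.yaml) ;;
            *) echo "unexpected change: $path" >&2; return 1 ;;
        esac
    done
    return 0
}

# Usage as user rt, in /mnt/data/orangeboard/foo/RTX:
#   git status --porcelain --untracked-files=no | only_openapi_modified \
#       && git pull origin currentbranch
```

A clean working tree produces no input lines, so the guard also succeeds in that case.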

If you need to switch the branch that the endpoint /foo is on, say from currentbranch to otherbranch, the above steps would instead look something like this:

git pull origin currentbranch
git checkout otherbranch
git pull origin otherbranch
exit
service RTX_OpenAPI_foo restart
tail -f /tmp/RTX_OpenAPI_foo.elog

Rolling out a new KG2 version

This process essentially consists of building a new KG2c and other downstream databases off of this new KG2 version, organizing the necessary build artifacts on arax.ncats.io, uploading them to ITRB's SFTP server, and making any necessary code changes to ensure ARAX is compatible with the new KG2 version.

See the GitHub issue template for the steps to roll out a new KG2 version. You can create a new issue from this template at: https://github.com/RTXteam/RTX/issues/new?template=kg2rollout.md.

Nginx

On arax.ncats.io, we use Nginx as the TLS endpoint: it terminates TLS and proxies the requests as unencrypted HTTP to port 8080 on the host OS. We currently set worker_connections to 10000.
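
To confirm the configured value, the setting can be read back from the Nginx configuration. This is a minimal sketch: the function name worker_connections_value is hypothetical, and the path /etc/nginx/nginx.conf in the usage comment is an assumption (the usual Debian/Ubuntu default), not something this page specifies.

```shell
#!/bin/sh
# Reads an Nginx configuration on stdin and prints the first
# worker_connections value, without the trailing semicolon.
worker_connections_value() {
    awk '$1 == "worker_connections" { gsub(/;/, "", $2); print $2; exit }'
}

# Usage on the host (path is an assumption):
#   worker_connections_value < /etc/nginx/nginx.conf
```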