set up ARAX self-hosted instance in us-east-1 #2129

Open
saramsey opened this issue Sep 7, 2023 · 39 comments

@saramsey
Member

saramsey commented Sep 7, 2023

This is motivated by issue #2097.

@saramsey saramsey self-assigned this Sep 7, 2023
@saramsey
Member Author

saramsey commented Sep 7, 2023

I have provisioned an m6a.4xlarge instance in us-east-1 under the OSU COE AWS account. It has 1023 GiB of EBS storage allocated to the root volume and uses the AWS default Ubuntu 22.04 server AMI.

@saramsey
Member Author

saramsey commented Sep 7, 2023

I've edited /etc/ssh/sshd_config (see Slack for explanation; it's not good security practice to disclose security policy changes in a public issue repository).

@saramsey
Member Author

saramsey commented Sep 7, 2023

So far I have done:

sudo apt-get update
sudo apt-get install -y emacs
sudo cp /etc/ssh/sshd_config /etc/ssh/sshd_config.ori
sudo emacs -nw /etc/ssh/sshd_config

(see Slack #deployment channel for the diff of changes for sshd_config).

@saramsey
Member Author

saramsey commented Sep 7, 2023

OK, I've installed ssh public keys for @edeutsch, @sundareswarpullela, @amykglen, @kvnthomas98, @isbluis, and me into the instance. Anyone else I should add?

@saramsey
Member Author

saramsey commented Sep 7, 2023

also just done:

sudo apt-get install -y nginx certbot python3-certbot-nginx
sudo apt-get upgrade -y

@saramsey
Member Author

saramsey commented Sep 7, 2023

Now installing Docker, following instructions from docker.com:

sudo apt-get update

(in the last step above, choosing to "keep the local /etc/ssh/sshd_config that is currently installed")

sudo apt-get install \
    ca-certificates \
    curl \
    gnupg \
    lsb-release

sudo mkdir -p /etc/apt/keyrings

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

sudo chmod a+r /etc/apt/keyrings/docker.gpg

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get install -y docker-ce \
                        docker-ce-cli \
                        containerd.io \
                        docker-buildx-plugin \
                        docker-compose-plugin
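
The `echo ... | sudo tee` step above only assembles one line of text for /etc/apt/sources.list.d/docker.list. A minimal sketch of that assembly, with the architecture and release codename passed in as arguments so the result can be inspected without touching the system. The function name `docker_apt_line` is hypothetical and not part of Docker's instructions; the output matches the original up to whitespace.

```shell
# Hypothetical helper: print the apt source line that the echo/tee step writes.
# $1 = architecture (what `dpkg --print-architecture` prints, e.g. amd64)
# $2 = release codename (what `lsb_release -cs` prints, e.g. jammy for Ubuntu 22.04)
docker_apt_line() {
    printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu %s stable\n' "$1" "$2"
}

# Preview the line for an amd64 Ubuntu 22.04 host.
docker_apt_line amd64 jammy
```

Once the real file is written, `sudo apt-get update` has to run again so that apt sees the new repository before `docker-ce` can be installed.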

@saramsey
Member Author

saramsey commented Sep 7, 2023

Checking docker installation:

ubuntu@ip-172-31-41-172:~$ sudo docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
719385e32844: Pull complete
Digest: sha256:dcba6daec718f547568c562956fa47e1b03673dd010fe6ee58ca806767031d1c
Status: Downloaded newer image for hello-world:latest

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

@saramsey
Member Author

saramsey commented Sep 7, 2023

I generated an ECDSA key pair on arax.ncats.io and installed the public key in /home/ubuntu/.ssh/authorized_keys on arax2.rtx.ai. This is needed for the next step (see below).

@saramsey
Member Author

saramsey commented Sep 7, 2023

I copied the nginx config file from arax.ncats.io to arax2.rtx.ai by running the following command as user stephenr on arax.ncats.io:

scp -oStrictHostKeyChecking=no /etc/nginx/sites-enabled/default ubuntu@arax2.rtx.ai:arax2.rtx.ai

@saramsey
Member Author

saramsey commented Sep 7, 2023

On arax2.rtx.ai, I edited /home/ubuntu/arax2.rtx.ai as follows:

ubuntu@ip-172-31-41-172:~$ diff arax2.rtx.ai.ori arax2.rtx.ai
8c8
<     server_name arax.ncats.io;
---
>     server_name arax2.rtx.ai;
14d13
<     ssl on;
16c15
<     server_name arax.ncats.io;
---
>     server_name arax2.rtx.ai;
19,20c18,19
<     ssl_certificate             /etc/letsencrypt/live/arax.ncats.io/fullchain.pem;
<     ssl_certificate_key         /etc/letsencrypt/live/arax.ncats.io/privkey.pem;
---
>     ssl_certificate             /etc/letsencrypt/live/arax2.rtx.ai/fullchain.pem;
>     ssl_certificate_key         /etc/letsencrypt/live/arax2.rtx.ai/privkey.pem;

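The same edits can be scripted rather than made by hand. A minimal sketch with `sed`, run on a throwaway copy; the sample config below is a made-up fragment for illustration, not the real file from arax.ncats.io. Dropping `ssl on;` fits with nginx having deprecated that directive in favor of the `ssl` parameter on `listen`.

```shell
# Sketch: reproduce the edits shown in the diff with sed, on a throwaway copy.
workdir=$(mktemp -d)
cat > "$workdir/arax2.rtx.ai.ori" <<'EOF'
server {
    server_name arax.ncats.io;
    ssl on;
    ssl_certificate             /etc/letsencrypt/live/arax.ncats.io/fullchain.pem;
    ssl_certificate_key         /etc/letsencrypt/live/arax.ncats.io/privkey.pem;
}
EOF

# 1) rename the host everywhere; 2) delete the "ssl on;" directive.
sed -e 's/arax\.ncats\.io/arax2.rtx.ai/g' \
    -e '/^[[:space:]]*ssl on;[[:space:]]*$/d' \
    "$workdir/arax2.rtx.ai.ori" > "$workdir/arax2.rtx.ai"

# Show what changed; diff exits 1 when the files differ, which is expected here.
diff "$workdir/arax2.rtx.ai.ori" "$workdir/arax2.rtx.ai" || true
```
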
@saramsey
Member Author

saramsey commented Sep 7, 2023

Verifying that nginx is working via HTTP (and reachable from outside AWS):

10-197-142-163:~ sramsey$ curl http://arax2.rtx.ai
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

@saramsey
Member Author

saramsey commented Sep 7, 2023

Installing the nginx config file on arax2.rtx.ai:

sudo systemctl stop nginx
sudo certbot certonly --nginx

I entered ramseyst@oregonstate.edu for the email address, and y to the first question, then n to the second question, then entered arax2.rtx.ai for the domain name. Then verified that the cert files are in place and have correct permissions:

ubuntu@ip-172-31-41-172:~$ sudo ls -alh /etc/letsencrypt/archive/arax2.rtx.ai
total 28K
drwxr-xr-x 2 root root 4.0K Sep  7 20:36 .
drwx------ 3 root root 4.0K Sep  7 20:36 ..
-rw-r--r-- 1 root root 1.8K Sep  7 20:36 cert1.pem
-rw-r--r-- 1 root root 3.7K Sep  7 20:36 chain1.pem
-rw-r--r-- 1 root root 5.4K Sep  7 20:36 fullchain1.pem
-rw------- 1 root root 1.7K Sep  7 20:36 privkey1.pem
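
The listing shows the private key readable only by root (mode 600) and the other files world-readable (644). A minimal sketch of checking that programmatically, assuming GNU `stat` as shipped on Ubuntu; the helper name `has_mode` is hypothetical, and the demo runs on a temporary file standing in for privkey1.pem.

```shell
# Hypothetical helper: succeed only if a file's octal mode matches what is expected.
# Uses GNU stat (-c), as shipped on Ubuntu.
has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}

# Demo on a throwaway file standing in for privkey1.pem.
demo_key=$(mktemp)
chmod 600 "$demo_key"
if has_mode "$demo_key" 600; then
    echo "private key mode OK"
else
    echo "private key is too permissive" >&2
fi
```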

Setting up the dhparams.pem file for TLS:

screen
sudo openssl dhparam -out /etc/nginx/dhparams.pem 2048

(then detach from the screen session using C-a d).

And now setting up the nginx config:

sudo cp /home/ubuntu/arax2.rtx.ai /etc/nginx/sites-available
sudo ln -s /etc/nginx/sites-available/
sudo rm /etc/nginx/sites-available/default
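
For reference, the usual Debian/Ubuntu convention is that nginx serves whatever is linked into /etc/nginx/sites-enabled, so a site is enabled by symlinking its file from sites-available, and the stock site is disabled by removing its symlink. A minimal sketch of that convention in a scratch directory (an illustration only, not a record of the exact commands run on arax2.rtx.ai):

```shell
# Sketch of the usual sites-available / sites-enabled layout, in a scratch directory
# (the real directories live under /etc/nginx and need sudo).
nginx_root=$(mktemp -d)
mkdir -p "$nginx_root/sites-available" "$nginx_root/sites-enabled"
echo "# stock site" > "$nginx_root/sites-available/default"
ln -s "$nginx_root/sites-available/default" "$nginx_root/sites-enabled/default"
echo "# arax2 site" > "$nginx_root/sites-available/arax2.rtx.ai"

# Enable the new site by linking it into sites-enabled ...
ln -s "$nginx_root/sites-available/arax2.rtx.ai" "$nginx_root/sites-enabled/arax2.rtx.ai"
# ... and disable the stock site by removing only its symlink.
rm "$nginx_root/sites-enabled/default"

ls -l "$nginx_root/sites-enabled"
```

On a real host, `sudo nginx -t` checks the resulting configuration before nginx is started again.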

@edeutsch
Collaborator

edeutsch commented Sep 7, 2023

FWIW I install Docker on Ubuntu with:

sudo apt --yes install docker.io

@saramsey
Member Author

saramsey commented Sep 7, 2023

> FWIW I install Docker on Ubuntu with:
>
> sudo apt --yes install docker.io

That seems... a lot simpler than what I did. I think you've told me this before and I just forgot. I am old. I will make a note of this!

@saramsey
Member Author

saramsey commented Sep 7, 2023

OK, nginx is up and running on arax2.rtx.ai with SSL enabled:

sramsey-laptop:~ sramsey$ curl -i https://arax2.rtx.ai
HTTP/1.1 502 Bad Gateway
Server: nginx
Date: Thu, 07 Sep 2023 21:18:06 GMT
Content-Type: text/html
Content-Length: 150
Connection: keep-alive

<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html>

@saramsey
Member Author

saramsey commented Sep 7, 2023

From here on out, I am loosely following the steps in this shell script: https://github.com/RTXteam/RTX/blob/master/DockerBuild/test-instance-scripts/build-test-arax-from-fresh-instance.sh#L43

But not exactly; here are some key differences:

  • that script is out of date; for example, we now download databases from arax-databases.rtx.ai instead of arax.ncats.io
  • python3.10 is standard on Ubuntu 22.04 so we are not using python3.7 anymore
  • naming the container rtx1 for consistency with arax.ncats.io, rather than the container name arax as used in the shell script.
  • we are specifying that container port 80 maps to host port 8080, just as it is on arax.ncats.io

On arax2.rtx.ai, cloning the RTX code repo:

cd /home/ubuntu && git clone https://github.com/RTXteam/RTX.git

@saramsey
Member Author

saramsey commented Sep 7, 2023

Making databases directory on arax2.rtx.ai and setting correct ownership:

sudo mkdir -p /mnt/data/orangeboard/databases
sudo chown ubuntu:ubuntu /mnt/data/orangeboard/databases

@saramsey
Member Author

saramsey commented Sep 7, 2023

On arax2.rtx.ai, as user ubuntu, making an ECDSA key pair:

ssh-keygen -t ecdsa

and just hit return at each prompt to accept the defaults. Then copy the file /home/ubuntu/.ssh/id_ecdsa.pub off of the system (to my local MacBook Pro) so I can upload it to arax-databases.rtx.ai and araxconfig.rtx.ai and install it into the appropriate authorized_keys files, which would be:

On araxconfig.rtx.ai:

/home/araxconfig/.ssh/authorized_keys

On arax-databases.rtx.ai:

/home/rtxconfig/.ssh/authorized_keys
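
Installing the key amounts to appending the one-line .pub file to the target account's authorized_keys. A minimal sketch of doing that idempotently, with permissions that satisfy sshd's default StrictModes check; the helper name `install_pubkey` and the demo key string are hypothetical, and the demo runs in a scratch directory rather than a real home directory.

```shell
# Hypothetical helper: append a public key to an authorized_keys file only if it
# is not already there, and tighten permissions on the directory and file.
# $1 = path to the .pub file, $2 = the target account's .ssh directory
install_pubkey() {
    mkdir -p "$2" && chmod 700 "$2"
    touch "$2/authorized_keys" && chmod 600 "$2/authorized_keys"
    # -x: match the whole line, -F: treat the key as a fixed string, not a regex
    grep -qxF -f "$1" "$2/authorized_keys" || cat "$1" >> "$2/authorized_keys"
}

# Demo with a fake key in a scratch directory.
demo_home=$(mktemp -d)
echo "ecdsa-sha2-nistp256 AAAAexamplekey ubuntu@arax2" > "$demo_home/id_ecdsa.pub"
install_pubkey "$demo_home/id_ecdsa.pub" "$demo_home/.ssh"
install_pubkey "$demo_home/id_ecdsa.pub" "$demo_home/.ssh"   # second call adds nothing
wc -l < "$demo_home/.ssh/authorized_keys"
```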

@saramsey
Member Author

saramsey commented Sep 7, 2023

Making sure that passwordless ssh from user ubuntu on arax2.rtx.ai to araxconfig@araxconfig.rtx.ai and rtxconfig@arax-databases.rtx.ai is properly working:

ubuntu@ip-172-31-41-172:~$ ssh -q -oStrictHostKeyChecking=no rtxconfig@arax-databases.rtx.ai exit
ubuntu@ip-172-31-41-172:~$ ssh -q -oStrictHostKeyChecking=no araxconfig@araxconfig.rtx.ai exit
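
A silent return to the prompt is the success signal here. A minimal sketch of the same check in scripted form, using ssh's `BatchMode=yes` option so that a missing key produces a failure instead of a password prompt. The helper `check_logins` and the `SSH_CMD` override are assumptions of this sketch, not something used on the instance.

```shell
# Hypothetical helper: report whether key-based login works for each target.
# BatchMode=yes makes ssh fail instead of prompting for a password, so a broken
# key setup shows up as a non-zero exit status rather than a hanging prompt.
# SSH_CMD is an assumption of this sketch: it lets the ssh binary be swapped out.
SSH_CMD=${SSH_CMD:-ssh}
check_logins() {
    status=0
    for target in "$@"; do
        if "$SSH_CMD" -q -o BatchMode=yes -o ConnectTimeout=10 "$target" exit; then
            echo "ok:   $target"
        else
            echo "FAIL: $target"
            status=1
        fi
    done
    return "$status"
}
```

It would be called as `check_logins rtxconfig@arax-databases.rtx.ai araxconfig@araxconfig.rtx.ai`.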

@saramsey
Member Author

saramsey commented Sep 7, 2023

Copy the config_secrets.json file to arax2.rtx.ai inside the /home/ubuntu/RTX/code dir:

scp araxconfig@araxconfig.rtx.ai:config_secrets.json RTX/code

@saramsey
Member Author

saramsey commented Sep 7, 2023

create an override to point to the local RTX-KG2 API:

echo "http://localhost:5008/api/rtxkg2/v1.4" > RTX/code/kg2_url_override.txt

@edeutsch
Collaborator

edeutsch commented Sep 7, 2023

> create an override to point to the local RTX-KG2 API:
>
> echo "http://localhost:5008/api/rtxkg2/v1.2" > RTX/code/kg2_url_override.txt

We are at v1.4 now..

@saramsey
Member Author

saramsey commented Sep 7, 2023

> > create an override to point to the local RTX-KG2 API:
> >
> > echo "http://localhost:5008/api/rtxkg2/v1.2" > RTX/code/kg2_url_override.txt
>
> We are at v1.4 now..

Thank you, good catch
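
The version suffix is easy to get wrong, as the exchange above shows. A minimal sketch of writing the override and reading the version back out of it, in a scratch directory standing in for RTX/code. How ARAX itself consumes the file is not shown in this thread, so this only checks the file's contents.

```shell
# Sketch: write the override and read it back, in a scratch copy of the code directory.
# code_dir stands in for RTX/code; the URL is the corrected one from this thread.
code_dir=$(mktemp -d)
kg2_url="http://localhost:5008/api/rtxkg2/v1.4"
echo "$kg2_url" > "$code_dir/kg2_url_override.txt"

# Strip everything up to the last "/" to isolate the API version.
version=${kg2_url##*/}
echo "override points at KG2 API $version"
cat "$code_dir/kg2_url_override.txt"
```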

@saramsey
Member Author

saramsey commented Sep 7, 2023

Need to locally modify RTX/code/ARAX/ARAXQuery/ARAX_database_manager.py so that it can download files via rsync (see also #2098):

[Screenshot: local change to ARAX_database_manager.py, 2023-09-07 2:43:25 PM]

@saramsey
Member Author

saramsey commented Sep 7, 2023

Download ARAX database files, in a screen session:

screen
sudo apt-get install -y python3.10-venv gcc python3-dev
python3 -m venv venv
venv/bin/pip3 install -r RTX/requirements.txt
venv/bin/python3 RTX/code/ARAX/ARAXQuery/ARAX_database_manager.py --mnt --skip-if-exists

then detach from screen using C-a d

@saramsey
Member Author

saramsey commented Sep 7, 2023

Verifying that certbot is set to auto-renew the SSL cert:

ubuntu@ip-172-31-41-172:~$ sudo cat /etc/cron.d/certbot
# /etc/cron.d/certbot: crontab entries for the certbot package
#
# Upstream recommends attempting renewal twice a day
#
# Eventually, this will be an opportunity to validate certificates
# haven't been revoked, etc.  Renewal will only occur if expiration
# is within 30 days.
#
# Important Note!  This cronjob will NOT be executed if you are
# running systemd as your init system.  If you are running systemd,
# the cronjob.timer function takes precedence over this cronjob.  For
# more details, see the systemd.timer manpage, or use systemctl show
# certbot.timer.
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

0 */12 * * * root test -x /usr/bin/certbot -a \! -d /run/systemd/system && perl -e 'sleep int(rand(43200))' && certbot -q renew
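
The cron file's own comment says the job is skipped under systemd, and the cron line enforces that with its `! -d /run/systemd/system` test. Ubuntu 22.04 runs systemd, so renewal is presumably driven by certbot.timer rather than by this cron entry. A minimal sketch that applies the same test to report which mechanism is in charge (the function name is hypothetical):

```shell
# Sketch: report which mechanism is responsible for renewal on this host.
# The cron entry tests for /run/systemd/system and skips itself when it exists.
renewal_mechanism() {
    if [ -d /run/systemd/system ]; then
        echo "systemd timer (certbot.timer)"
    else
        echo "cron (/etc/cron.d/certbot)"
    fi
}
renewal_mechanism
```

On the instance, `systemctl list-timers certbot.timer` shows the next scheduled run, and `sudo certbot renew --dry-run` exercises the renewal path without replacing the certificate.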

@saramsey
Member Author

saramsey commented Sep 8, 2023

copy the dockerfile to the CWD so we can modify it in-place

cp RTX/DockerBuild/Merged-Dockerfile .

@saramsey
Member Author

saramsey commented Sep 8, 2023

The following are optional; they are just for convenience in testing and debugging inside the container:

cat <<EOF >> ./Merged-Dockerfile
RUN apt-get install -y netcat
RUN apt-get install -y emacs
EOF
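
A heredoc is delivered on standard input, so the command it is attached to has to read stdin: `cat` does, while `echo` ignores stdin and would append only a blank line. A minimal sketch on a scratch file; the `FROM` line is stand-in content, not the real Merged-Dockerfile.

```shell
# Sketch: append lines to a file with a heredoc, on a scratch copy.
dockerfile=$(mktemp)
echo "FROM ubuntu:20.04" > "$dockerfile"   # stand-in content, not the real Merged-Dockerfile

# cat copies its stdin (the heredoc body) to stdout, which >> appends to the file.
# Quoting the delimiter ('EOF') stops the shell from expanding $ or backticks in the body.
cat <<'EOF' >> "$dockerfile"
RUN apt-get install -y netcat
RUN apt-get install -y emacs
EOF

tail -n 2 "$dockerfile"
```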

@saramsey
Member Author

saramsey commented Sep 8, 2023

Create the container:

sudo docker create --name rtx1 --tty --publish "8080:80" \
    --mount type=bind,source="/mnt/data/orangeboard/databases",\
target="/mnt/data/orangeboard/databases" arax:1.0

Start the container:

sudo docker start rtx1
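
The `--publish "8080:80"` value reads as HOST_PORT:CONTAINER_PORT, which is what makes nginx on the host able to reach the service at port 8080. A minimal sketch that spells the mapping out; the helper name is hypothetical and it handles only the simple two-part form, not the variant with a leading bind address.

```shell
# Hypothetical helper: split a docker --publish value into its two halves.
# The format is HOST_PORT:CONTAINER_PORT, so "8080:80" means host 8080 -> container 80.
describe_publish() {
    host_port=${1%%:*}
    container_port=${1##*:}
    echo "host port $host_port -> container port $container_port"
}
describe_publish "8080:80"
```

Once the container is running, `sudo docker port rtx1` lists the live mappings and `sudo docker ps --filter name=rtx1` confirms its state.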

@saramsey
Member Author

Instance arax2.rtx.ai is shut down for the time being, until I have time to work on it again.

@saramsey
Member Author

We're pausing work on this issue while we work on #2114

saramsey added a commit that referenced this issue Oct 5, 2023
@saramsey
Member Author

saramsey commented Oct 5, 2023

OK, the commit a3ba706 contains the code fixes to get the Merged-Dockerfile build of a combined ARAX and RTX-KG2 service working in Ubuntu 20.04 (inside the container) and using our new config_secrets.json mechanism.

@saramsey
Member Author

saramsey commented Oct 5, 2023

ARAX and RTX-KG2 are now running inside the arax Docker container on the m6a.4xlarge EC2 instance arax2.rtx.ai in the us-east-1 AWS region. As on arax.ncats.io, the container binds host-side TCP port 8080 to container-side TCP port 80. I tried to change the minimum amount of code necessary in order to get ARAX and RTX-KG2 working inside a Docker container with an Ubuntu 20.04 base image (host OS Ubuntu 22.04 AMI).

@edeutsch
Collaborator

edeutsch commented Oct 5, 2023

we should make sure that ITRB CI is happy with the new changes

@saramsey
Member Author

saramsey commented Oct 5, 2023

Uh oh. They don't use Merged-Dockerfile, do they? I thought I searched the code base for that.

@edeutsch
Collaborator

edeutsch commented Oct 5, 2023

ah, I don't know. maybe they don't use Merged-Dockerfile and they use the other ones? Hopefully fine, but some careful checking would be good because I am about to push to TEST again and then request we go to PROD. So any breaking changes might be widely distributed soon.

@saramsey
Member Author

saramsey commented Oct 5, 2023

I just checked, and it looks like for ARAX deployment, ITRB uses RTX/DockerBuild/Dockerfile:

docker.build(env.IMAGE_NAME, "--build-arg BUILD_BRANCH=$ENV_BUILD_BRANCH --no-cache ./DockerBuild/")

and for KG2 deployment, ITRB uses RTX/DockerBuild/KG2-Dockerfile:

docker.build(env.IMAGE_NAME, "--build-arg BUILD_BRANCH=$ENV_BUILD_BRANCH --no-cache -f ./DockerBuild/KG2-Dockerfile ./DockerBuild/")

so I think we're safe. The Dockerfile that I updated is Merged-Dockerfile which is completely separate.

I did add a new config file kg2-config.js to the RTX/DockerBuild directory but I'd be kind of surprised if the sudden appearance of a new Javascript config file in that directory broke the build.
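
The reasoning rests on how `docker build` picks its Dockerfile: the `-f`/`--file` value if one is given, otherwise the file named Dockerfile in the build context. A minimal sketch that applies that rule to the two invocations quoted above. The helper is hypothetical, assumes the context directory is the last argument and `-f` is space-separated from its value, and uses `BUILD_BRANCH=master` as a placeholder.

```shell
# Hypothetical helper: given the arguments of a `docker build`, print which
# Dockerfile it reads: the -f/--file value if present, else <context>/Dockerfile.
# Assumes a well-formed argument list whose last element is the context directory.
dockerfile_for_build() {
    file=""
    context=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -f|--file) file=$2; shift ;;
        esac
        context=$1
        shift
    done
    if [ -n "$file" ]; then echo "$file"; else echo "${context%/}/Dockerfile"; fi
}

# The ARAX build: no -f, so the default Dockerfile in the context is used.
dockerfile_for_build --build-arg BUILD_BRANCH=master --no-cache ./DockerBuild/
# The KG2 build: -f names the file explicitly.
dockerfile_for_build --build-arg BUILD_BRANCH=master --no-cache -f ./DockerBuild/KG2-Dockerfile ./DockerBuild/
```

Neither result is Merged-Dockerfile, which matches the conclusion above.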

@saramsey
Member Author

saramsey commented Oct 5, 2023

@edeutsch do you want to check out arax2.rtx.ai to see if it is acceptable to move forward with? I just checked, and your deutsch@scooter SSH public key is installed under user ubuntu. I could use some help setting up the following devareas inside the container:

  • beta
  • kg2beta
  • test
  • kg2test
  • devED
  • devLM

@saramsey
Member Author

saramsey commented Oct 5, 2023

That last commit 8372db4 was to fix a line of code that was trying to check if the script can log into the database host, but was referencing the old database host system (arax.ncats.io) instead of the new database host (arax-databases.rtx.ai).


2 participants