Deploy on Amazon Web Services

Anders Larsson edited this page Jul 11, 2018 · 22 revisions

This tutorial provides an overview of the steps involved in setting up a CRE on AWS using the command line.

Note: Please follow "Starting a PhenoMeNal CRE on a public or private cloud provider" first for the general prerequisites of a deployment on a public or private cloud provider.

Amazon specific prerequisites

Configuration

All of the commands in this documentation are meant to be run in the config directory created by the command below.

Start by creating a configuration directory:

kn --preset phenomenal init aws my-vre-config-dir
cd my-vre-config-dir

Inside this configuration directory, edit the file config.tfvars and set the following:

Cluster

  • cluster_prefix: every resource in your tenancy will be named with this prefix

  • aws_access_key_id: your access key id

  • aws_secret_access_key: your secret access key

  • aws_region: the region where your cluster will be bootstrapped (e.g. eu-west-1)

  • availability_zone: an availability zone for your cluster (e.g. eu-west-1a)
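
Before deploying, it can help to confirm that none of the cluster settings above were left out. The following is a minimal sketch, not part of KubeNow: the helper name missing_tfvars is hypothetical, and it assumes that config.tfvars uses plain `key = "value"` assignments, one per line. The demo runs against a throwaway file with placeholder values rather than real credentials.

```shell
#!/bin/sh
# missing_tfvars FILE KEY...
# Hypothetical helper: prints every KEY that has no uncommented
# `KEY = ...` assignment in FILE (assumes one assignment per line).
missing_tfvars() {
  file=$1
  shift
  for key in "$@"; do
    # Match the key at the start of a line, allowing leading whitespace.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
      printf '%s\n' "$key"
    fi
  done
}

# Demo on a temporary file with placeholder values.
example=$(mktemp)
cat > "$example" <<'EOF'
cluster_prefix = "my-vre"
aws_access_key_id = "REPLACE_ME"
aws_region = "eu-west-1"
EOF

# Prints the two keys that the demo file does not set:
# aws_secret_access_key and availability_zone.
missing_tfvars "$example" cluster_prefix aws_access_key_id \
  aws_secret_access_key aws_region availability_zone
rm -f "$example"
```

To check a real configuration, the same function could be called with config.tfvars as the first argument from inside the configuration directory.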

Master configuration

  • master_instance_type: an instance flavor for the master
  • master_disk_size: disk size in GB; the default of 20 is enough for normal deployments
  • master_as_edge: whether the master acts as a gateway for accessing services

Node configuration

  • node_count: number of Kubernetes nodes to be created (no floating IP is needed for these nodes)
  • node_instance_type: an instance flavor name for the Kubernetes nodes
  • node_disk_size: disk size in GB; the default of 20 is enough for normal deployments

Gluster configuration - See: KubeNow Gluster documentation.

  • glusternode_count: number of gluster nodes to be created (1 to 3, depending on the preferred replication factor)
  • glusternode_instance_type: an instance flavor for the gluster nodes
  • glusternode_disk_size: disk size in GB; the default of 20 is enough for normal deployments
  • glusternode_extra_disk_size: size of the fileserver disk in GB (depends on the size of your dataset)

Edge configuration (optional) - See: KubeNow Edge documentation.

  • edge_count: number of edge nodes to be created
  • edge_instance_type: an instance flavor for the edge nodes
  • edge_disk_size: disk size in GB; the default of 20 is enough for normal deployments

Cloudflare (optional) - See: KubeNow Cloudflare documentation.

  • use_cloudflare: whether you want to use Cloudflare as DNS provider
  • cloudflare_email: the email address that you used to register your Cloudflare account
  • cloudflare_token: an authentication token that you can generate from the Cloudflare web interface
  • cloudflare_domain: a zone that you created in your Cloudflare account. This typically matches your domain name (e.g. somedomain.com)
  • cloudflare_subdomain: a subdomain for this deployment

Cloudflare proxy (optional) - See: KubeNow Cloudflare proxy documentation.

  • cloudflare_proxied: whether to proxy or not (e.g. true)
  • cloudflare_record_texts: names of the services to be proxied

In the provision sub-section of config.tfvars you can edit the following parameters (see also the KubeNow Provisioning documentation):

Services

  • password_all_services: password for all your services (e.g. Galaxy, Jupyter etc.)
  • username_all_services: username for all your services (e.g. Galaxy, Jupyter etc.)

Galaxy

  • galaxy_include: whether the service should be deployed in the cluster (true/false)
  • galaxy_admin_email: email address of the local Galaxy admin (typically you)

Jupyter

  • jupyter_include: whether the service should be deployed in the cluster (true/false)

Luigi

  • luigi_include: whether the service should be deployed in the cluster (true/false)

Kubernetes dashboard

  • dashboard_include: whether the service should be deployed in the cluster (true/false)

Logging and monitoring services - See: Logging and monitoring wiki

  • logmon_include: whether the service should be deployed in the cluster (true/false)

Pachyderm + Minio (optional) - See: Pachyderm tutorial with MTBLS data

  • pachyderm_release_name: a release name for the Pachyderm service
  • pachyderm_etcd_pvc_size: storage dedicated to etcd (in GB)
  • minio_release_name: a release name for the Minio service
  • minio_pvc_size: storage dedicated to the Minio service (in GB)
  • minio_accesskey: access key for the S3 endpoint
  • minio_secretkey: secret key for the S3 endpoint
  • minio_replicas: number of replicas of the Minio service

Once you are done with your settings, you are ready to deploy the cluster:

kn apply

When the deployment has finished, you should be able to reach the services at:

Galaxy         = http://galaxy.<your-prefix>.<yourdomain>
Jupyter        = http://notebook.<your-prefix>.<yourdomain>
Luigi          = http://luigi.<your-prefix>.<yourdomain>
Kube-dashboard = http://dashboard.<your-prefix>.<yourdomain>
Pachyderm      = ssh into the master node and use pachctl. Pachyderm tutorial: https://github.com/phnmnl/MTBLS233-Pachyderm
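
The URL pattern above can be expanded for a concrete deployment with a few lines of shell. This is only an illustrative sketch: the function names service_url and check_service are hypothetical, and the values my-vre and example.com are placeholders for your own prefix and domain. The optional check_service function assumes curl is installed and is defined here but not called.

```shell
#!/bin/sh
# service_url SERVICE PREFIX DOMAIN
# Builds a URL of the form http://<service>.<your-prefix>.<yourdomain>.
service_url() {
  printf 'http://%s.%s.%s\n' "$1" "$2" "$3"
}

# check_service URL
# Optional: prints the HTTP status code returned by URL (requires curl).
check_service() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}\n' "$1"
}

# Placeholder values; replace with your own prefix and domain.
prefix="my-vre"
domain="example.com"

# Print the URL of each web-facing service listed above.
for service in galaxy notebook luigi dashboard; do
  service_url "$service" "$prefix" "$domain"
done
```

Once the cluster is up, passing one of the printed URLs to check_service gives a quick indication of whether that service responds.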

To SSH into the master node:

kn ssh

To destroy the deployment:

kn destroy

PhenoMeNal help and support

For feedback and help
