Galaxy CloudMan

Geoff Nelson edited this page Jul 3, 2019 · 79 revisions

Deployment

Creating a new cluster using Cloud Launch v0.2:

  1. Log in to the new Cloud Launch UI: https://launch.usegalaxy.org/login (you'll have to register for an account if you haven't already)
  2. Navigate to: https://launch.usegalaxy.org/catalog/appliance/galaxy-cloud
  3. Which version of this appliance would you like to launch?: 17.05
  4. On which cloud would you like to launch your appliance?: Amazon US East 1 - N. Virginia
  5. What type of credentials do you want to use?: Temporary Credentials
    • Input your Refinery-Platform AWS Account Access Key & Secret Access Key
  6. Click Next >
  7. Under "Provide a name for your deployment", specify a new name (this will be used to name the EC2 instances in the cluster)
  8. Under "What Password would you like to use for this CloudMan instance?", provide a password for accessing the CloudMan web interface and the cluster via SSH
  9. How Large should the Storage Volume be?: 600
  10. Click the toggle to show Advanced CloudMan options
  11. Change the default, "SLURM cluster with Galaxy", to "Do not set cluster type now"
  12. Shared Cluster String: cm-28d680029604c47e24c3a123ca3164aa/shared/2019-01-15--17-54
  13. What type of virtual hardware would you like to use?: c4.xlarge
  14. Click the toggle to show Advanced cloud launch options
  15. Which keypair would you like to use for this Virtual Machine?: Choose your desired keypair
  16. In which placement zone would you like to launch this appliance?: Select the availability zone of the Refinery EC2 instance
  17. Click the checkbox for EBS Optimized
  18. Click Launch
  19. On the next screen, wait until the Status column shows RUNNING, then click the URL in the Access address column to access your cluster
  20. In the new window or tab, log in with the username ubuntu and the password you chose in step 8
  21. Add your cluster name (from step 7) to the newly created Galaxy EBS volume via Volumes on the AWS EC2 Dashboard (this makes the volume much easier to identify later)
  22. It will take a few minutes for all the applications to be set up; once that is done, a pop-up message will inform you that the cluster is ready for use
  23. Configure Auto-Scaling:
    • In the CloudMan console you should see "Autoscaling is off. Turn on?". Click the "on" link in "Turn on?"
    • Minimum number of worker nodes to maintain: 0
    • Maximum number of worker nodes to maintain: 2
    • Type of Nodes(s): Custom Instance Type
    • Enter a desired instance type: m4.2xlarge
    • Click Turn autoscaling on
  24. Navigate to the CloudMan admin console for your cluster (<cluster IP>/cloud/admin) and, under System controls, click "Switch master to not run jobs". This stops the master node from being an execution host, so that a worker node is spun up automatically when a job is received.
  25. Update the Route 53 CNAME record of the Galaxy instance ({dev,test,prod}-galaxy.aws.stemcellcommons.org) with the hostname of the CloudMan master node (via the AWS Route 53 Dashboard)
  26. Update the Refinery instance to communicate with this cluster.
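
Step 25 can also be scripted instead of being done by hand in the Route 53 Dashboard. The following is a minimal sketch and not part of the original procedure: it only builds the change batch that the Route 53 API accepts for an UPSERT of a CNAME record. The function name build_cname_upsert, the 300-second TTL, and the example master node hostname are assumptions made for illustration. Submitting the batch needs boto3, AWS credentials, and your hosted zone ID, so that call is shown only as a comment.

```python
def build_cname_upsert(record_name, target_hostname, ttl=300):
    """Return a Route 53 ChangeBatch dict pointing record_name at target_hostname."""
    return {
        "Comment": "Point Galaxy CNAME at the CloudMan master node",
        "Changes": [
            {
                # UPSERT creates the record if missing, otherwise overwrites it.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,  # assumed value; use whatever your zone expects
                    "ResourceRecords": [{"Value": target_hostname}],
                },
            }
        ],
    }


# Hypothetical values: substitute the real record name and master node hostname.
batch = build_cname_upsert(
    "dev-galaxy.aws.stemcellcommons.org",
    "ec2-203-0-113-10.compute-1.amazonaws.com",
)

# To apply the change (assumes boto3 is installed and credentials are configured):
#   import boto3
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="<your hosted zone ID>", ChangeBatch=batch)
```

Keeping the batch construction separate from the API call makes it easy to inspect the record before anything is changed in DNS.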

Launching an existing cluster using Cloud Launch v0.2:

Notes

  • Do not launch a second instance of a cluster that is already running; this may lead to unpredictable results.
  • If you are cloning an existing shared cluster, the storage blocks on the Galaxy FS volume are restored from a snapshot and must be initialized (pulled down from Amazon S3 and written to the volume) before they can be accessed at full speed. This can cause a significant I/O performance drop when you first use the cluster. To initialize the volume up front, use the dd utility to read every block on it: sudo dd if=/dev/xvdf of=/dev/null bs=1M where /dev/xvdf is the Galaxy FS volume. This process can take a few hours to complete.
  • Do not restart clusters using the "Reboot the cluster" option in the Admin console, as this results in strange behavior. Shut down the cluster and then bring it up again using Cloud Launch instead.
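
The dd command in the note above simply reads every block of the volume once and discards the data. For illustration only, the same read-through can be sketched in Python. This is an equivalent under stated assumptions, not a replacement for dd: the function name read_all_blocks is hypothetical, /dev/xvdf is the device named in the note, and reading a block device requires root privileges.

```python
def read_all_blocks(device_path, block_size=1024 * 1024):
    """Read device_path sequentially in block_size chunks, discarding the data.

    Returns the total number of bytes read.
    """
    total = 0
    # buffering=0 opens the file for raw, unbuffered reads.
    with open(device_path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(block_size)
            if not chunk:  # an empty read means end of device/file
                break
            total += len(chunk)
    return total


# Example (run as root on the cluster master; the device name may differ):
#   read_all_blocks("/dev/xvdf")
```

The 1 MiB default mirrors the bs=1M argument of the dd command; the return value gives a simple way to confirm that the whole volume was read.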

Termination

In the CloudLaunch V2 interface:

  1. Click on the trash can icon next to your instance
  2. Click on Delete Permanently and Archive

The EC2 instances will be terminated but data will remain to allow relaunching (unless you select the option to delete the cluster).

Upgrading a cluster:

  • When introducing new Galaxy tools or workflows, or editing existing ones, a cluster upgrade is warranted and a new shared cluster string will have to be created.
  • Please refer to the CloudMan Upgrade notes.

Troubleshooting:

  • It may be necessary to use an instance type for the master node with at least 4 CPUs for toolshed installations to work properly.

Reference

  1. Cloning a shared cluster
  2. Start your cluster
  3. Scott's 17.05 cluster Notes