Cloud Cluster Guide

FAQ | Troubleshooting | Glossary

Overview

This guide covers setting up a Slurm cluster in the cloud. A cloud deployment involves several decisions and considerations; this guide walks through them and the recommended solutions.

There are two deployment methods for cloud cluster management:

GCP Marketplace

This deployment method uses GCP Marketplace to set up clusters without leaving your browser. It is simpler than the Terraform method but less flexible, which makes it a good way to explore what slurm-gcp is.

See the Marketplace Guide for setup instructions and more information.

Terraform

This deployment method uses Terraform to deploy and manage cluster infrastructure. While this method can be more complex, it is a robust option. slurm-gcp provides Terraform modules that enable you to create a Slurm cluster with ease.

See the slurm_cluster module for details.

If you are unfamiliar with Terraform, please check out the documentation and starter guide to get familiar with it.
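For orientation, the core Terraform workflow can be sketched as the four commands below. This is a minimal sketch, not a procedure taken from slurm-gcp: `PROJECT_DIR` and `VARS_FILE` are hypothetical placeholders for your own Terraform project and variables file, and the `run` helper is an illustrative wrapper that only prints each command unless `DRY_RUN=0` is set, because `apply` creates billable cloud resources.

```shell
#!/bin/sh
# Sketch of the standard Terraform workflow (init, validate, plan, apply).
# PROJECT_DIR and VARS_FILE are hypothetical placeholders, not slurm-gcp paths.
# VARS_FILE is resolved relative to PROJECT_DIR because of -chdir.
PROJECT_DIR="${PROJECT_DIR:-./my-slurm-cluster}"
VARS_FILE="${VARS_FILE:-example.tfvars}"

# Print the command; execute it only when DRY_RUN=0 (the default is a dry run).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@" || return 1
  fi
}

run terraform -chdir="$PROJECT_DIR" init &&                        # download providers and modules
run terraform -chdir="$PROJECT_DIR" validate &&                    # check configuration syntax
run terraform -chdir="$PROJECT_DIR" plan -var-file="$VARS_FILE" && # preview the changes
run terraform -chdir="$PROJECT_DIR" apply -var-file="$VARS_FILE"   # create the resources
```

Running the script as-is only echoes the commands. Set `DRY_RUN=0` once the plan output looks right; `terraform destroy` with the same variables file tears the resources down again.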

Quickstart Examples

See the test cluster example for an extensible and robust example. It can be configured to handle creation of all supporting resources (e.g. network, service accounts) or leave that to you. Slurm can be configured with partitions and nodesets as desired.

NOTE: It is recommended to use the slurm_cluster module in your own Terraform project. Copying and modifying one of the provided examples may be a useful starting point.

Alternatively, see HPC Blueprints for HPC Toolkit examples.