Using GCE to benchmark Scylla
This document walks through setting up a benchmark for a Scylla cluster hosted on Google Compute Engine (GCE).
Creating a cluster to benchmark Scylla starts with a single instance, which we spin up through the gcloud CLI tool (or, alternatively, through GCE's web interface). In the current gcloud project and zone, we'll create an n1-standard-4 instance with a local NVMe disk, which is a reliable way to ensure good performance, and a small CentOS boot disk:
gcloud compute instances create `whoami`-scylla-test-1 --machine-type=n1-standard-4 --boot-disk-auto-delete --boot-disk-size=20GB --boot-disk-type "pd-ssd" --local-ssd interface=NVME --image-project=centos-cloud --image-family=centos-7
Note that instances created with local disks cannot be stopped and then started again.
Next, we'll log into the instance with gcloud compute ssh `whoami`-scylla-test-1 in order to install the relevant Scylla packages. To do so, we'll visit the download page to obtain a download link and install the repo for the desired version. Then we can run:
sudo yum install scylla-server scylla-tools scylla-jmx
If we wish to install a custom-built scylla-server rpm package, we can run sudo yum --nogpgcheck localinstall scylla-server-666.development-xxx.el7.centos.x86_64.rpm and leverage the unstable repo to provide the required dependencies.
Now we'll go through scylla_setup, but skip the RAID and XFS setup as well as the I/O configuration; we'll perform those steps later, on each instance.
Next, we edit the seeds in /etc/scylla/scylla.yaml to contain this node's private IP address:
sudo sed -i -- "/seeds/s/127.0.0.1/$(hostname -i)/g" /etc/scylla/scylla.yaml
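To see what this substitution does, here's a minimal sketch run against a sample seeds line; the made-up address 10.240.0.2 stands in for whatever $(hostname -i) would return on the node:

```shell
# A sample of the seeds line as it ships in scylla.yaml (default seed is 127.0.0.1).
printf '          - seeds: "127.0.0.1"\n' > /tmp/scylla-seeds-demo.yaml

# The same substitution as above, with a hard-coded IP in place of $(hostname -i):
# on every line containing "seeds", replace 127.0.0.1 with the node's address.
sed -i -- "/seeds/s/127.0.0.1/10.240.0.2/g" /tmp/scylla-seeds-demo.yaml

cat /tmp/scylla-seeds-demo.yaml   # the seed is now 10.240.0.2
```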
So far we've performed the setup common to all nodes in our cluster, which means we can make a snapshot and speed up the setup of new instances or clusters:
gcloud compute disks snapshot `whoami`-scylla-test-1 --snapshot-names=`whoami`-centos-boot-disk --description="CentOS boot disk for a test Scylla cluster"
Note that when creating a new cluster from this snapshot, the seeds must be changed in all instances.
Now we'll add two more instances to form our three-node cluster:
for i in {2..3}; do gcloud compute disks create `whoami`-scylla-test-$i --source-snapshot `whoami`-centos-boot-disk; done
for i in {2..3}; do gcloud compute instances create `whoami`-scylla-test-$i --machine-type=n1-standard-4 --boot-disk-auto-delete --disk name=`whoami`-scylla-test-$i,boot=yes --local-ssd interface=NVME; done
For all of them, we can now finish the remaining configuration. First, we'll fill in the correct IP address in the scylla.yaml file:
for i in {1..3}; do gcloud compute ssh `whoami`-scylla-test-$i --command='sudo sed -i -- "s/localhost/$(hostname -i)/g" /etc/scylla/scylla.yaml'; done
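For reference, the settings this substitution targets are the address fields, which default to localhost in the stock scylla.yaml. Assuming that default, and using the made-up private IP 10.240.0.2 in place of $(hostname -i), the affected lines change as follows:

```yaml
# Before (scylla.yaml defaults):
listen_address: localhost
rpc_address: localhost

# After the sed command runs on a node whose private IP is 10.240.0.2:
listen_address: 10.240.0.2
rpc_address: 10.240.0.2
```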
Then, we'll prepare the NVMe drive to host Scylla's data directory and generate the I/O configuration:
for i in {1..3}; do gcloud compute ssh `whoami`-scylla-test-$i --command='sudo scylla_raid_setup --disks /dev/nvme0n1 --raiddev /dev/md0 --update-fstab --root /var/lib/scylla --volume-role all'; done
for i in {1..3}; do gcloud compute ssh `whoami`-scylla-test-$i --command='sudo scylla_io_setup'; done
We are now ready to bind these instances together in a cluster:
for i in {1..3}; do gcloud compute ssh `whoami`-scylla-test-$i --command='sudo systemctl restart scylla-server'; done
The status of the cluster can be observed through nodetool:
gcloud compute ssh `whoami`-scylla-test-1 --command='nodetool status'
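While the nodes are joining, it's worth polling nodetool status until all three report UN (Up/Normal). A small sketch of that check, exercised here against canned output (addresses and figures are made up) since it only depends on the status column:

```shell
# Count nodes reporting UN ("Up, Normal") in `nodetool status` output.
# In practice, feed it the output of:
#   gcloud compute ssh `whoami`-scylla-test-1 --command='nodetool status'
count_un() {
  grep -c '^UN'
}

# Canned sample resembling nodetool's output format:
sample='Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load    Tokens  Owns  Rack
UN  10.240.0.2  1.2 MB  256     ?     rack1
UJ  10.240.0.3  800 KB  256     ?     rack1'

printf '%s\n' "$sample" | count_un   # prints 1: only one node is Up/Normal yet
```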
We now wish to set up monitoring for our cluster. We can create a small machine for that purpose:
gcloud compute instances create `whoami`-scylla-test-monitoring --machine-type=n1-standard-1 --boot-disk-auto-delete --boot-disk-size=10GB --boot-disk-type "pd-ssd" --image-project=centos-cloud --image-family=centos-7
To set it up, we'll follow the guide in the Scylla Grafana monitoring GitHub repository. In a nutshell, we'll execute the following commands inside the instance:
sudo yum install -y git docker python-pip
sudo pip install --upgrade pip
sudo pip install pyyaml
git clone https://github.com/scylladb/scylla-grafana-monitoring.git
cd scylla-grafana-monitoring
sudo systemctl restart docker
./genconfig.py -ns -d myconf $(gcloud compute instances list --filter="name~`whoami`-scylla-test-\d" | awk '{print $4}' | tail -n 3)
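The awk '{print $4}' | tail -n 3 pipeline above assumes gcloud's default table output, in which the internal IP lands in the fourth whitespace-separated column. A sketch against a simulated listing (all values made up) shows what genconfig.py ends up receiving:

```shell
# Simulated `gcloud compute instances list` table; in this layout,
# column 4 is INTERNAL_IP.
listing='NAME              ZONE        MACHINE_TYPE   INTERNAL_IP  EXTERNAL_IP   STATUS
me-scylla-test-1  us-east1-b  n1-standard-4  10.240.0.2   203.0.113.10  RUNNING
me-scylla-test-2  us-east1-b  n1-standard-4  10.240.0.3   203.0.113.11  RUNNING
me-scylla-test-3  us-east1-b  n1-standard-4  10.240.0.4   203.0.113.12  RUNNING'

# Same extraction as in the genconfig.py invocation: take column 4,
# then keep the last 3 lines to drop the header row.
printf '%s\n' "$listing" | awk '{print $4}' | tail -n 3
# prints:
# 10.240.0.2
# 10.240.0.3
# 10.240.0.4
```

If gcloud's column layout changes (an extra column shifts awk's field numbering), the extracted values will silently be wrong, so it's worth eyeballing the raw listing first.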
./start-all.sh -s myconf/scylla_servers.yml -n myconf/node_exporter_servers.yml
Note that if the docker service fails to start, it may be necessary to reboot the instance.
We can access the Grafana dashboard at port 3000 of the node's external IP address, which can be obtained through gcloud by running gcloud compute instances list --filter="name=`whoami`-scylla-test-monitoring" | awk '{print $5}' | tail -n 1.
Setting up the loader is the easiest part of the process. We just need to spin up a new machine, which we may want to make a bit more powerful:
gcloud compute disks create `whoami`-scylla-test-loader-1 --source-snapshot `whoami`-centos-boot-disk
gcloud compute instances create `whoami`-scylla-test-loader-1 --machine-type=n1-standard-8 --boot-disk-auto-delete --disk name=`whoami`-scylla-test-loader-1,boot=yes
Finally, we're ready to benchmark the cluster using cassandra-stress. All we need is a valid cassandra-stress command, pointed at the private IP address of one of the cluster's nodes. For example:
export node=$(gcloud compute instances list --filter="name~`whoami`-scylla-test-1" | awk '{print $4}' | tail -n 1)
gcloud compute ssh `whoami`-scylla-test-loader-1 --command="cassandra-stress write no-warmup cl=QUORUM n=30000000 -pop seq=1..30000000 -schema 'replication(factor=3)' -col 'size=FIXED(60) n=FIXED(5)' -mode cql3 native -rate threads=80 -node $node"
gcloud compute ssh `whoami`-scylla-test-loader-1 --command="cassandra-stress read no-warmup cl=QUORUM duration=15m -pop 'dist=gauss(1..30000000,10)' -col 'size=FIXED(60) n=FIXED(5)' -mode cql3 native -rate threads=200 -node $node"
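As a rough sanity check on what these parameters imply for data volume, a back-of-the-envelope calculation (raw column bytes only, ignoring on-disk overhead and compression):

```shell
# The write command above inserts 30,000,000 partitions,
# each with 5 columns of 60 fixed bytes.
rows=30000000
cols=5
col_bytes=60
raw_bytes=$((rows * cols * col_bytes))
echo "$raw_bytes bytes of raw column data"        # 9000000000 bytes
echo "$((raw_bytes / (1024 * 1024 * 1024))) GiB (approx, before replication)"
```

With replication(factor=3) on a three-node cluster, each row is stored on all three nodes, so the cluster-wide footprint is roughly triple that figure.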
More loaders can be added if needed, following the same procedure.
The following command deletes all of the instances:
gcloud compute instances delete `whoami`-scylla-test-1 `whoami`-scylla-test-2 `whoami`-scylla-test-3 `whoami`-scylla-test-loader-1 `whoami`-scylla-test-monitoring
To delete the snapshot, we execute:
gcloud compute snapshots delete `whoami`-centos-boot-disk