Guidance for Game Server Hosting on Amazon EKS with Agones and Open Match

Introduction

This guidance provides code and instructions to create a multi-cluster Kubernetes environment that hosts a matchmaking and game server solution for a session-based multiplayer game, integrating Open Match, Agones, and Amazon Elastic Kubernetes Service (Amazon EKS).

This README aims to provide a succinct installation guide for the integration. For more technical information about the deployment process, customization, and troubleshooting, please see the Details page (DETAILS.md).

High Level Architecture


Repository organization

.
├── README.md   # This file
├── DETAILS.md  # Extra documentation
├── integration # Golang code and Dockerfiles for Open Match - Agones integration
│   ├── clients 
│   │   ├── allocation-client
│   │   ├── ncat
│   │   └── stk
│   ├── director 
│   ├── matchfunction
│   └── ncat-server
├── manifests   # Kubernetes YAML manifests
│   └── fleets
│       ├── ncat
│       └── stk
├── scripts     # Shell scripts
└── terraform   # Terraform code
    ├── cluster
    ├── extra-cluster
    └── intra-cluster
        └── helm_values

Cluster bootstrapping

The Terraform scripts create the clusters using Amazon EKS Blueprints for Terraform. Agones and Open Match are deployed when the clusters are bootstrapped.

The CA certificates and key files required for TLS communications are generated by cert-manager, which is enabled in the Terraform definition as an EKS Blueprints add-on.

EKS Blueprints also enables metrics and logging for the EKS clusters. Metrics are exported to CloudWatch to provide observability into the clusters.

Terraform also deploys the resources outside the clusters that the integration needs, such as inter-region VPC peering, AWS Global Accelerator, and Amazon Elastic Container Registry (ECR) repositories.
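
Once the walkthrough below has been deployed, these out-of-cluster resources can be verified with the AWS CLI if desired. A couple of example checks, using the region variables defined later in this guide (note that the Global Accelerator API is always served from us-west-2):

# List the ECR repositories in the first region
aws ecr describe-repositories --region ${REGION1} --query "repositories[].repositoryName" --output table
# List the Global Accelerator DNS names in the account
aws globalaccelerator list-accelerators --region us-west-2 --query "Accelerators[].DnsName" --output text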

Open Match - Agones integration

Golang code provides the integration between Open Match and Agones, creating

  • a matchmaking function (MMF) that matches players based on the latency from the client to the server endpoints
  • a Director that handles game server allocation, connecting the Agones Allocator to the game clients.

Dockerfiles and Kubernetes manifests are provided to build the container images and deploy them to the clusters.

Prerequisites

This guidance assumes the user already has access to an AWS account and has the AWS Command Line Interface (AWS CLI) installed and configured to access the account with their credentials. While the commands and scripts here were tested on the bash and zsh shells, they can be run with some modifications in other shells, like Windows PowerShell or fish.

To deploy the infrastructure and run the examples, we also need Terraform, kubectl, Docker, and Go installed locally, since the steps below rely on them.
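
A quick way to confirm the tools are available before starting:

terraform version
kubectl version --client
docker --version
go version
aws --version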

Create the clusters and deploy the required components

For our example, we will create two EKS clusters in different regions. The first one runs our Open Match matchmaking service and the central Agones Allocator, along with a group of game servers for its region. The second cluster runs another Agones instance, responsible for the game servers in that region. We will run Terraform in three steps, following the structure of the terraform folder:

  1. terraform/cluster: Creates the EKS clusters and VPCs in both regions.
  2. terraform/intra-cluster: Deploys several components inside both clusters, using Helm charts and Kubernetes manifests.
  3. terraform/extra-cluster: Creates additional AWS resources outside the clusters, like ECR repositories, VPC peering, and the Global Accelerator infrastructure.

Prepare Terraform environment variables

Define the names of our clusters and the two regions that will run them. We can customize the cluster names, regions, and VPC CIDRs through the variables passed to the Terraform stack. In our examples, we use agones-gameservers-1 with 10.1.0.0/16 in region us-east-1, and agones-gameservers-2 with 10.2.0.0/16 in region us-east-2. Note that the CIDRs of the VPCs must not overlap, since we will use VPC peering to connect them.

CLUSTER1=agones-gameservers-1
REGION1=us-east-1
CIDR1="10.1.0.0/16"
CLUSTER2=agones-gameservers-2
REGION2=us-east-2
CIDR2="10.2.0.0/16"
VERSION="1.28"

For simplicity, we will be using local Terraform state files. In production workloads, we recommend storing the state files remotely, for example using the S3 Terraform backend.
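
A minimal sketch of one way to do that with the S3 backend, assuming we add an empty backend "s3" {} block to each Terraform root module and substitute our own bucket and table names (the names below are hypothetical):

# Create the state bucket and a DynamoDB table for state locking
aws s3api create-bucket --bucket my-agones-tf-state --region us-east-1
aws dynamodb create-table --table-name my-agones-tf-lock \
 --attribute-definitions AttributeName=LockID,AttributeType=S \
 --key-schema AttributeName=LockID,KeyType=HASH \
 --billing-mode PAY_PER_REQUEST --region us-east-1
# Point the cluster module at the remote backend during init
terraform -chdir=terraform/cluster init \
 -backend-config="bucket=my-agones-tf-state" \
 -backend-config="key=cluster/terraform.tfstate" \
 -backend-config="region=us-east-1" \
 -backend-config="dynamodb_table=my-agones-tf-lock"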

terraform/cluster

Run the following commands to create EKS clusters, with the names and regions configured in the previous steps.

# Initialize Terraform
terraform -chdir=terraform/cluster init &&
# Create both clusters
terraform -chdir=terraform/cluster apply -auto-approve \
 -var="cluster_1_name=${CLUSTER1}" \
 -var="cluster_1_region=${REGION1}" \
 -var="cluster_1_cidr=${CIDR1}" \
 -var="cluster_2_name=${CLUSTER2}" \
 -var="cluster_2_region=${REGION2}" \
 -var="cluster_2_cidr=${CIDR2}" \
 -var="cluster_version=${VERSION}"

terraform/intra-cluster

The commands below will deploy our resources inside the clusters created in the last step. We use the output values from terraform/cluster as input to the terraform/intra-cluster module.

# Initialize Terraform
terraform -chdir=terraform/intra-cluster init  &&
# Deploy to the first cluster
terraform -chdir=terraform/intra-cluster workspace select -or-create=true ${REGION1} &&
terraform -chdir=terraform/intra-cluster apply -auto-approve \
 -var="cluster_name=${CLUSTER1}" \
 -var="cluster_region=${REGION1}" \
 -var="cluster_endpoint=$(terraform -chdir=terraform/cluster output -raw cluster_1_endpoint)" \
 -var="cluster_certificate_authority_data=$(terraform -chdir=terraform/cluster output -raw cluster_1_certificate_authority_data)" \
 -var="cluster_token=$(terraform -chdir=terraform/cluster output -raw cluster_1_token)" \
 -var="cluster_version=${VERSION}" \
 -var="oidc_provider_arn=$(terraform -chdir=terraform/cluster output -raw oidc_provider_1_arn)" \
 -var="namespaces=[\"agones-openmatch\", \"agones-system\", \"gameservers\", \"open-match\"]" \
 -var="configure_agones=true" \
 -var="configure_open_match=true" &&
# Deploy to the second cluster
terraform -chdir=terraform/intra-cluster workspace select -or-create=true ${REGION2} &&
terraform -chdir=terraform/intra-cluster apply -auto-approve \
 -var="cluster_name=${CLUSTER2}" \
 -var="cluster_region=${REGION2}" \
 -var="cluster_endpoint=$(terraform -chdir=terraform/cluster output -raw cluster_2_endpoint)" \
 -var="cluster_certificate_authority_data=$(terraform -chdir=terraform/cluster output -raw cluster_2_certificate_authority_data)" \
 -var="cluster_token=$(terraform -chdir=terraform/cluster output -raw cluster_2_token)" \
 -var="cluster_version=${VERSION}" \
 -var="oidc_provider_arn=$(terraform -chdir=terraform/cluster output -raw oidc_provider_2_arn)" \
 -var="namespaces=[\"agones-system\", \"gameservers\"]" \
 -var="configure_agones=true" \
 -var="configure_open_match=false" 

terraform/extra-cluster

Here we deploy the components external to our clusters and configure the resources that require both clusters to exist, such as VPC peering and Agones multi-cluster allocation.

# Initialize Terraform
terraform -chdir=terraform/extra-cluster init &&
# Get the values needed by Terraform
VPC1=$(terraform -chdir=terraform/cluster output -raw vpc_1_id) &&
SUBNETS1=$(terraform -chdir=terraform/cluster output gameservers_1_subnets) &&
ROUTE1=$(terraform -chdir=terraform/cluster output -raw private_route_table_1_id) &&
ENDPOINT1=$(terraform -chdir=terraform/cluster output -raw cluster_1_endpoint) &&
AUTH1=$(terraform -chdir=terraform/cluster output -raw cluster_1_certificate_authority_data) &&
TOKEN1=$(terraform -chdir=terraform/cluster output -raw cluster_1_token) &&
VPC2=$(terraform -chdir=terraform/cluster output -raw vpc_2_id) &&
SUBNETS2=$(terraform -chdir=terraform/cluster output gameservers_2_subnets) &&
ROUTE2=$(terraform -chdir=terraform/cluster output -raw private_route_table_2_id) &&
ENDPOINT2=$(terraform -chdir=terraform/cluster output -raw cluster_2_endpoint) &&
AUTH2=$(terraform -chdir=terraform/cluster output -raw cluster_2_certificate_authority_data) &&
TOKEN2=$(terraform -chdir=terraform/cluster output -raw cluster_2_token) &&
# Create resources  
terraform -chdir=terraform/extra-cluster apply -auto-approve \
 -var="cluster_1_name=${CLUSTER1}" \
 -var="requester_cidr=${CIDR1}" \
 -var="requester_vpc_id=${VPC1}" \
 -var="requester_route=${ROUTE1}" \
 -var="cluster_1_gameservers_subnets=${SUBNETS1}" \
 -var="cluster_1_endpoint=${ENDPOINT1}" \
 -var="cluster_1_certificate_authority_data=${AUTH1}" \
 -var="cluster_1_token=${TOKEN1}" \
 -var="cluster_2_name=${CLUSTER2}" \
 -var="accepter_cidr=${CIDR2}" \
 -var="accepter_vpc_id=${VPC2}" \
 -var="accepter_route=${ROUTE2}" \
 -var="cluster_2_gameservers_subnets=${SUBNETS2}" \
 -var="cluster_2_endpoint=${ENDPOINT2}" \
 -var="cluster_2_certificate_authority_data=${AUTH2}" \
 -var="cluster_2_token=${TOKEN2}" \
 -var="cluster_1_region=${REGION1}" \
 -var="ecr_region=${REGION1}" \
 -var="cluster_2_region=${REGION2}"

After several minutes, Terraform should finish with a message similar to this:

Apply complete! Resources: XX added, YY changed, ZZ destroyed.

Outputs:
global_accelerator_address = "abcdefgh123456789.awsglobalaccelerator.com"

Please save the global_accelerator_address value, as we will use it later to connect to our game servers. If we need to retrieve it again, we can run terraform -chdir=terraform/extra-cluster output.
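
For example, to capture just that value in a shell variable:

GA_ADDRESS=$(terraform -chdir=terraform/extra-cluster output -raw global_accelerator_address) &&
echo ${GA_ADDRESS}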

Build and deploy the game server fleets

We added two game servers to test the Agones and Open Match deployments:

  • ncat-server: a lightweight client-server chatroom we developed using Ncat together with a Golang client to illustrate the Open Match integration.
  • SuperTuxKart: a 3D open-source kart racing game developed in C/C++. Since we didn't change the client's code to integrate Open Match functionality, we use a Golang wrapper with the code from the ncat example.

We will use the ncat-server deployment to test the Open Match matchmaking.

Note: Verify that Docker is running before the next steps.

Use the command below to build the image, push it to the ECR repository, and deploy 4 fleets of ncat game servers on each cluster.

sh scripts/deploy-ncat-fleets.sh ${CLUSTER1} ${REGION1} ${CLUSTER2} ${REGION2}
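
Optionally, confirm that the fleets were created in both clusters (again assuming the cluster contexts are present in kubeconfig):

kubectl --context $(kubectl config get-contexts -o=name | grep ${CLUSTER1}) get fleets -n gameservers &&
kubectl --context $(kubectl config get-contexts -o=name | grep ${CLUSTER2}) get fleets -n gameservers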

Integrate Open Match with Agones

This repository contains code and documentation for the customized versions of the Open Match director and matchfunction in the folders ./integration/director/ and ./integration/matchfunction/, as well as the client tools we used in the folder ./integration/clients/.

Deploy the Matchmaking Function and Director on the first cluster

  1. Switch the kubectl context to ${CLUSTER1}
kubectl config use-context $(kubectl config get-contexts -o=name | grep ${CLUSTER1})
  2. Build and deploy the Open Match matchmaking function
sh scripts/deploy-matchfunction.sh ${CLUSTER1} ${REGION1}
  3. Build and deploy the Open Match Director
sh scripts/deploy-director.sh ${CLUSTER1} ${REGION1} ${REGION2}
  4. Verify that the mmf and director pods are running
kubectl get pods -n agones-openmatch

Test the ncat server

Here we test the flow of the Open Match - Agones integration, using the ncat fleet deployment and the contents of the folder integration/clients/ncat. We will need to open several terminal windows to run this test.

Note about the client <-> frontend communication: in our example, we connect the client directly to the Open Match Frontend, so the TLS certificates of the frontend must be available to the client. In a complete game architecture, a Game Frontend would sit between the client and Open Match and handle this communication, among other tasks such as authentication, leaderboards, and lobbies.

  1. Go to the integration/clients/ncat folder
cd integration/clients/ncat
  2. Get the TLS certificates and key of the Frontend
kubectl get secret open-match-tls-server -n open-match -o jsonpath="{.data.public\.cert}" | base64 -d > public.cert
kubectl get secret open-match-tls-server -n open-match -o jsonpath="{.data.private\.key}" | base64 -d > private.key
kubectl get secret open-match-tls-rootca -n open-match -o jsonpath="{.data.public\.cert}" | base64 -d > publicCA.cert
  3. Run the player client, using the value of global_accelerator_address from the Terraform deployment. Remember to adjust the regions if needed:
REGION1=us-east-1
REGION2=us-east-2
export GOPROXY=direct
go run main.go -frontend <global_accelerator_address>:50504 -region1 $REGION1 -latencyRegion1 10 -region2 $REGION2 -latencyRegion2 30

YYYY/MM/DD hh:mm:ss Connecting to Open Match Frontend
YYYY/MM/DD hh:mm:ss Ticket ID: cdfu6mqgqm6kj18qr880
YYYY/MM/DD hh:mm:ss Waiting for ticket assignment

  4. In three other terminal windows, run the commands from steps 1 and 3 above. When the fourth client starts, we should see output similar to the sample below, showing the connection to the Frontend server, the game server assigned to the client, and the connection to the game server:

YYYY/MM/DD hh:mm:ss Connecting to Open Match Frontend
YYYY/MM/DD hh:mm:ss Ticket ID: cdfu6mqgqm6kj18qr880
YYYY/MM/DD hh:mm:ss Waiting for ticket assignment
YYYY/MM/DD hh:mm:ss Ticket assignment: connection:"xxxxxxxxxxxxxxxxx.awsglobalaccelerator.com:yyyyy"
YYYY/MM/DD hh:mm:ss Disconnecting from Open Match Frontend
YYYY/MM/DD hh:mm:ss Connecting to ncat server
201.17.120.226 is connected as . already connected: nobody.
201.17.120.226 is connected as . already connected: 201.17.120.226 as .
201.17.120.226 is connected as . already connected: 201.17.120.226 as , 201.17.120.226 as .
201.17.120.226 is connected as . already connected: 201.17.120.226 as , 201.17.120.226 as , 201.17.120.226 as .

  5. In another terminal window, verify the game servers. Leave the command running with the -w flag to watch for state changes. We should see one server in the Allocated state, and the others in the Ready state.
kubectl get gs -n gameservers -w
  6. In the terminal windows running the clients, type anything and press Enter. We should see the messages replicated to the other client windows.

  7. Press CTRL-C in all the client windows. This should close the clients. When the last one closes, switch to the window with the kubectl get gs -w command. It should show that the allocated server is shutting down (since all the players disconnected) and a new game server being provisioned.

  8. We can repeat the process with different values for the -latencyRegion1 and -latencyRegion2 flags when starting the client, to verify how they affect game server allocation (see the example invocation after this list). Remember to stop the clients with CTRL-C and adjust the kubectl context to the cluster with the lowest latency, using the command

kubectl config use-context $(kubectl config get-contexts -o=name | grep ${CLUSTER1})

or

kubectl config use-context $(kubectl config get-contexts -o=name | grep ${CLUSTER2})

and run

kubectl get gs -n gameservers -w

again, before starting the clients with new values.
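
For instance, a run that favors the second region could reuse the same flags with the latency values swapped; the frontend placeholder is still the Global Accelerator address saved earlier:

go run main.go -frontend <global_accelerator_address>:50504 -region1 $REGION1 -latencyRegion1 80 -region2 $REGION2 -latencyRegion2 5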

Test with SuperTuxKart

We can use the fleets in the fleets/stk/ folder and the client in integration/clients/stk/ to test the SuperTuxKart integration with Open Match and Agones, similar to the ncat example above. Please refer to the README.md in the stk folder for more instructions.

Clean Up Resources

  • Destroy the extra-cluster components

    terraform -chdir=terraform/extra-cluster destroy -auto-approve \
     -var="requester_cidr=${CIDR1}" \
     -var="requester_vpc_id=${VPC1}" \
     -var="requester_route=${ROUTE1}" \
     -var="cluster_1_name=${CLUSTER1}" \
     -var="cluster_1_gameservers_subnets=${SUBNETS1}" \
     -var="cluster_1_endpoint=${ENDPOINT1}" \
     -var="cluster_1_certificate_authority_data=${AUTH1}" \
     -var="cluster_1_token=${TOKEN1}" \
     -var="cluster_2_name=${CLUSTER2}" \
     -var="accepter_cidr=${CIDR2}" \
     -var="accepter_vpc_id=${VPC2}" \
     -var="accepter_route=${ROUTE2}" \
     -var="cluster_2_gameservers_subnets=${SUBNETS2}" \
     -var="cluster_2_endpoint=${ENDPOINT2}" \
     -var="cluster_2_certificate_authority_data=${AUTH2}" \
     -var="cluster_2_token=${TOKEN2}" \
     -var="cluster_1_region=${REGION1}" \
     -var="ecr_region=${REGION1}" \
     -var="cluster_2_region=${REGION2}"
  • Delete the Load Balancers and Security Groups

    aws elbv2 delete-load-balancer --region ${REGION1} --load-balancer-arn $(aws elbv2 describe-load-balancers --region ${REGION1} --query "LoadBalancers[?contains(LoadBalancerName,'${CLUSTER1}-om-fe')].LoadBalancerArn"  --output text)
    aws elbv2 delete-load-balancer --region ${REGION1} --load-balancer-arn $(aws elbv2 describe-load-balancers --region ${REGION1} --query "LoadBalancers[?contains(LoadBalancerName,'${CLUSTER1}-allocator')].LoadBalancerArn"  --output text)
    aws elbv2 delete-load-balancer --region ${REGION1} --load-balancer-arn $(aws elbv2 describe-load-balancers --region ${REGION1} --query "LoadBalancers[?contains(LoadBalancerName,'${CLUSTER1}-ping-http')].LoadBalancerArn"  --output text)
    aws elbv2 delete-load-balancer --region ${REGION1} --load-balancer-arn $(aws elbv2 describe-load-balancers --region ${REGION1} --query "LoadBalancers[?contains(LoadBalancerName,'${CLUSTER1}-ping-udp')].LoadBalancerArn"  --output text)
    aws elbv2 delete-load-balancer --region ${REGION2} --load-balancer-arn $(aws elbv2 describe-load-balancers --region ${REGION2} --query "LoadBalancers[?contains(LoadBalancerName,'${CLUSTER2}-allocator')].LoadBalancerArn"  --output text)
    aws elbv2 delete-load-balancer --region ${REGION2} --load-balancer-arn $(aws elbv2 describe-load-balancers --region ${REGION2} --query "LoadBalancers[?contains(LoadBalancerName,'${CLUSTER2}-ping-http')].LoadBalancerArn"  --output text)
    aws elbv2 delete-load-balancer --region ${REGION2} --load-balancer-arn $(aws elbv2 describe-load-balancers --region ${REGION2} --query "LoadBalancers[?contains(LoadBalancerName,'${CLUSTER2}-ping-udp')].LoadBalancerArn"  --output text)
  • Discard or destroy the intra-cluster components. If we are removing all the components of the solution, it is quicker to simply discard the Terraform state of the intra-cluster folder, since we will destroy the clusters in the next step, and that will automatically remove the intra-cluster components.

    terraform -chdir=terraform/intra-cluster workspace select ${REGION1}
    terraform -chdir=terraform/intra-cluster state list | cut -f 1 -d '[' | xargs -L 0 terraform -chdir=terraform/intra-cluster state rm
    terraform -chdir=terraform/intra-cluster workspace select ${REGION2}
    terraform -chdir=terraform/intra-cluster state list | cut -f 1 -d '[' | xargs -L 0 terraform -chdir=terraform/intra-cluster state rm

    If we prefer to destroy the components at this stage (for example, to keep the clusters created by terraform/cluster and test terraform/intra-cluster with other values and configurations), use the code below instead.

    # Destroy the resources inside the first cluster
    terraform -chdir=terraform/intra-cluster workspace select ${REGION1}
    terraform -chdir=terraform/intra-cluster destroy -auto-approve \
    -var="cluster_name=${CLUSTER1}" \
    -var="cluster_region=${REGION1}" \
    -var="cluster_endpoint=$(terraform -chdir=terraform/cluster output -raw cluster_1_endpoint)" \
    -var="cluster_certificate_authority_data=$(terraform -chdir=terraform/cluster output -raw cluster_1_certificate_authority_data)" \
    -var="cluster_token=$(terraform -chdir=terraform/cluster output -raw cluster_1_token)" \
    -var="cluster_version=${VERSION}" \
    -var="oidc_provider_arn=$(terraform -chdir=terraform/cluster output -raw oidc_provider_1_arn)" \
    -var="namespaces=[\"agones-openmatch\", \"agones-system\", \"gameservers\", \"open-match\"]" \
    -var="configure_agones=true" \
    -var="configure_open_match=true"
    
    # Destroy the resources inside the second cluster
    terraform -chdir=terraform/intra-cluster workspace select ${REGION2}
    terraform -chdir=terraform/intra-cluster destroy -auto-approve \
    -var="cluster_name=${CLUSTER2}" \
    -var="cluster_region=${REGION2}" \
    -var="cluster_endpoint=$(terraform -chdir=terraform/cluster output -raw cluster_2_endpoint)" \
    -var="cluster_certificate_authority_data=$(terraform -chdir=terraform/cluster output -raw cluster_2_certificate_authority_data)" \
    -var="cluster_token=$(terraform -chdir=terraform/cluster output -raw cluster_2_token)" \
    -var="cluster_version=${VERSION}" \
    -var="oidc_provider_arn=$(terraform -chdir=terraform/cluster output -raw oidc_provider_2_arn)" \
    -var="namespaces=[\"agones-system\", \"gameservers\"]" \
    -var="configure_agones=true" \
    -var="configure_open_match=false" 
  • Destroy the clusters

    # Destroy both clusters
    terraform -chdir=terraform/cluster destroy -auto-approve \
    -var="cluster_1_name=${CLUSTER1}" \
    -var="cluster_1_region=${REGION1}" \
    -var="cluster_1_cidr=${CIDR1}" \
    -var="cluster_2_name=${CLUSTER2}" \
    -var="cluster_2_region=${REGION2}" \
    -var="cluster_2_cidr=${CIDR2}" \
    -var="cluster_version=${VERSION}"

    Note: if the terraform destroy command fails to destroy the subnets or the VPCs, run the commands below to delete the remaining security groups

    for sg in $(aws ec2 describe-security-groups --region ${REGION1} --filters "Name=vpc-id,Values=$(aws ec2  describe-vpcs --region ${REGION1} --filters "Name=tag:Name,Values='${CLUSTER1}'" --query Vpcs[].VpcId --output text)" --query SecurityGroups[].GroupId --output text); do aws ec2 delete-security-group --region ${REGION1} --group-id $sg ; done
    for sg in $(aws ec2 describe-security-groups --region ${REGION2} --filters "Name=vpc-id,Values=$(aws ec2  describe-vpcs --region ${REGION2} --filters "Name=tag:Name,Values='${CLUSTER2}'" --query Vpcs[].VpcId --output text)" --query SecurityGroups[].GroupId --output text); do aws ec2 delete-security-group --region ${REGION2} --group-id $sg ; done

    and run the terraform destroy command again.

  • Remove the clusters from kubectl config

    kubectl config delete-context $(kubectl config get-contexts -o=name | grep ${CLUSTER1})
    kubectl config delete-context $(kubectl config get-contexts -o=name | grep ${CLUSTER2})
  • Remove the local certificate files

    rm -f *.crt *.key integration/clients/stk/*.cert integration/clients/stk/*.key integration/clients/ncat/*.cert integration/clients/ncat/*.key

Security recommendations

This page provides suggestions for actions that should be taken to make the solution more secure, according to AWS best practices.
