Costi Muraru edited this page Feb 5, 2019 · 4 revisions

Ops CLI

We use multiple tools to manage our infrastructure at Adobe. The purpose of ops-cli is to gather the common cluster configurations in a single place and, based on these, interact with those tools. In this way, we avoid duplication and can quickly spin up new clusters (either production or development ones). All we need to do is customize the cluster configuration file (example here).

For the past couple of years we've been using ops-cli to manage infrastructure (AWS/Azure) across multiple Adobe projects. ops-cli (https://github.com/adobe/ops-cli) is a Python wrapper for Terraform, Ansible and SSH for cloud automation. It also integrates with the AWS CLI to provide inventory, ssh, sync, tunnel and the ability to run Ansible playbooks on top of EC2 instances. It can add a layer of templating (using Jinja2) on top of Terraform files, which removes duplicated code when spinning up infrastructure across multiple environments (stage/sandbox/prod) and across teams. It is useful for both AWS and Kubernetes deployments, given that Terraform supports Amazon Elastic Kubernetes Service (EKS).

Below are a few of the use cases we rely on it for.

Use cases inside Adobe

1. Create AWS/Azure infrastructure with ops-cli and Terraform

Terraform (https://www.terraform.io) is an open source tool from HashiCorp that allows us to create and maintain infrastructure as code. It's a powerful tool, which we use to spin up infrastructure (VPCs, servers, subnets, load balancers, queues etc.) in AWS and Azure. One drawback that we faced from the very beginning, though, was the lack of templating. We quickly found ourselves unable to re-use common pieces of infrastructure code across multiple environments and projects. This is where ops comes into play. ops-cli allows us to templatize the Terraform modules using Jinja2 (http://jinja.pocoo.org/docs/2.10). It also allows us to define a cluster specification (in a YAML file), which points to common Terraform modules. In this way, we can create multiple clusters while re-using the same Terraform modules.
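To make the idea concrete, here is a minimal sketch of rendering one shared Terraform module template for several clusters. It is not ops-cli's implementation: it uses Python's standard-library `string.Template` as a stand-in for Jinja2 and plain dictionaries as a stand-in for the YAML cluster files, so that it runs without extra dependencies. The module source path, variable names and CIDR values are all hypothetical.

```python
from string import Template

# Hypothetical shared module template. ops-cli uses Jinja2 for this step;
# string.Template is used here only to keep the sketch dependency-free.
VPC_MODULE = Template('''\
module "vpc_${cluster}" {
  source = "${module_source}"
  region = "${region}"
  cidr   = "${cidr}"
}
''')

# Hypothetical cluster specifications. In ops-cli these would live in one
# YAML file per cluster; the keys below are illustrative only.
CLUSTERS = {
    "stage": {
        "module_source": "../modules/vpc",
        "region": "us-east-1",
        "cidr": "10.10.0.0/16",
    },
    "prod": {
        "module_source": "../modules/vpc",
        "region": "us-west-2",
        "cidr": "10.20.0.0/16",
    },
}


def render_cluster(name, spec):
    """Render the shared template with the values of a single cluster."""
    return VPC_MODULE.substitute(cluster=name, **spec)


if __name__ == "__main__":
    for cluster_name, cluster_spec in CLUSTERS.items():
        print(render_cluster(cluster_name, cluster_spec))
```

The template is written once, and each cluster contributes only the values that differ, which is the duplication the cluster specification is meant to remove.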

2. Manage servers (AWS / Azure)

ops-cli can be used to manage a fleet of servers (which don't necessarily have to be created via Terraform). It can take an inventory (by making calls to AWS/Azure to retrieve the list of servers) and list these instances in a readable format. It also allows us to:

  • connect to a given server (using SSH)
  • copy files to/from servers
  • open a tunnel to a server (in case we want to access a service that is running on that specific server, on localhost)
  • run an ansible playbook on the servers

It is especially useful when these servers don't have a public IP. If there is a bastion in front of them (as the single point of contact), ops-cli can leverage it in order to SSH to those instances, which have only a private IP.
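The bastion and tunnel cases above can be sketched with plain OpenSSH options: `-J` (ProxyJump) to hop through the bastion, `-L` to forward a local port, and `-N` to skip running a remote command. This is an illustration of the underlying mechanism, not how ops-cli builds its commands internally; the function names, the default user and the addresses are hypothetical.

```python
import shlex


def ssh_via_bastion(target_ip, bastion_host, user="ec2-user"):
    """Build an ssh command that reaches a private IP through a bastion."""
    return ["ssh", "-J", f"{user}@{bastion_host}", f"{user}@{target_ip}"]


def tunnel_via_bastion(target_ip, bastion_host, local_port, remote_port,
                       user="ec2-user"):
    """Build an ssh command that exposes a remote service on localhost."""
    return [
        "ssh", "-N",                          # no remote command, forward only
        "-J", f"{user}@{bastion_host}",       # jump through the bastion
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{user}@{target_ip}",
    ]


if __name__ == "__main__":
    # Hypothetical hosts, printed rather than executed.
    print(shlex.join(ssh_via_bastion("10.0.1.15", "bastion.example.com")))
    print(shlex.join(
        tunnel_via_bastion("10.0.1.15", "bastion.example.com", 8080, 8080)))
```

Passing the resulting list to `subprocess.run` would open the connection; the sketch only prints the commands so it can be run safely anywhere.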

3. Application deployments

You can use ops to perform rolling deployments of (web) apps. Using an Ansible playbook, we were able to take 5% of the nodes at a time and, for each batch, remove the nodes from the load balancer, upgrade the service and restart it. This allowed us to upgrade our app on all nodes and restart them with a rolling window. Right now we're moving towards using Spinnaker for app deployment.
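The batching behind such a rolling window can be sketched as follows. In the setup described above the batching is handled by the Ansible playbook itself; this standalone Python version only illustrates the arithmetic, and the function name and node names are hypothetical.

```python
import math


def rolling_batches(nodes, percent=5):
    """Split nodes into consecutive batches of roughly `percent` of the fleet.

    Each batch holds at least one node, so small fleets still make progress.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    size = max(1, math.ceil(len(nodes) * percent / 100))
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]


if __name__ == "__main__":
    fleet = [f"node-{i:02d}" for i in range(40)]
    for batch in rolling_batches(fleet):
        # For each batch: remove from load balancer, upgrade, restart.
        print("deploying to", batch)
```

With 40 nodes and the default 5%, each batch contains two nodes, so at most two nodes are out of the load balancer at any time.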

4. Create Kubernetes clusters with ops-cli and Terraform

In order to quickly spin up Kubernetes clusters (in a repeatable and automated fashion), we turned again to ops-cli. Terraform supports deploying a Kubernetes cluster in AWS via Amazon Elastic Kubernetes Service (EKS). We use ops-cli to template this AWS EKS Terraform module so that we can re-use it to deploy multiple Kubernetes clusters (across different regions/environments).

Once the Kubernetes cluster is up and running, we want to install some common packages before deploying our own apps. These include: cluster-autoscaler, logging (e.g. fluentd), metrics (e.g. Prometheus), tracing (e.g. New Relic), continuous deployment (e.g. Spinnaker) and so forth. Luckily, these are all already available, packaged as Helm charts (https://github.com/helm/charts/tree/master/stable).

If only we could again use Terraform to deploy Helm charts inside our newly created Kubernetes cluster. It turns out that we can! There's a Terraform Helm provider available (https://github.com/terraform-providers/terraform-provider-helm), which does just that: it installs Helm charts via Terraform. Again, ops-cli came in handy for minimizing code duplication when deploying these common Helm packages via Terraform.

We invite you to browse our fully working example, which deploys a Kubernetes cluster in AWS using ops-cli + Terraform + Helm: https://github.com/adobe/ops-cli/tree/master/examples/aws-kubernetes