Developing the operator

This page provides information on how to develop changes for the operator code. Please read all sections before diving in.

Needed environment and tools

The operator is developed in Go, so you need a current Go toolchain (for Linux most distributions provide packages; otherwise get it from the Go homepage). Please use version 1.22.1, as this is the version used by the CI pipelines. You also need an editor/IDE, ideally with Go support. We recommend either VS Code with the official Go extension or GoLand.

Additional tools you will need:

  • Make is used for running builds and tests
  • A local kubernetes cluster for testing: if you have Docker Desktop you can use its included kubernetes; otherwise we recommend k3d
  • Optional: kubebuilder: Kubebuilder is the framework we use for the operator. You will normally not need to use the kubebuilder CLI directly unless you are adding new CRDs or making major changes to the project structure

How does the operator work

The opensearch operator follows the normal kubernetes operator model. It is controlled by several Custom Resources (CR), defined in Kubernetes using a Custom Resource Definition (CRD), that act as the API/interface for the operator. This approach integrates the operator into the normal declarative model and thinking of the Kubernetes API, making it easy to use and automate.

The main resource offered by the operator is called OpenSearchCluster and is a spec for an opensearch cluster to be deployed in kubernetes. Additional resources are OpensearchRole, OpensearchUser and OpensearchUserRoleBinding, which expose a declarative API for managing users and roles in OpenSearch.

For each custom resource the operator runs a controller. This controller connects to the kubernetes API and watches for changes in custom objects belonging to its custom resource (for example a newly created OpenSearchCluster object). For each object a reconcile loop is started and the main Reconcile method is called. This main method acts as an orchestrator and calls a number of reconcilers. Each reconciler is responsible for one aspect of managing an opensearch cluster (for example one deals with deploying the dashboards instance and another with managing the securityconfig). The reconcile run is repeated regularly (called requeuing) so that the reconcilers can react to any changes to the cluster. If the custom object itself is changed, a new reconcile run is triggered immediately.
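To make this orchestration pattern more concrete, the following is a heavily simplified, illustrative sketch of how such a Reconcile method can delegate to sub-reconcilers and requeue. It is not the operator's actual code; the interface, field and reconciler names are placeholders.

```go
// Illustrative sketch only, not the operator's actual implementation.
// It shows the orchestrator pattern described above using controller-runtime.
package controllers

import (
	"context"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// subReconciler is a hypothetical interface; each implementation covers one
// aspect of the cluster (dashboards, securityconfig, ...).
type subReconciler interface {
	Reconcile(ctx context.Context) error
}

// OpenSearchClusterReconciler stands in for the top-level cluster controller.
type OpenSearchClusterReconciler struct {
	client.Client
	subReconcilers []subReconciler
}

func (r *OpenSearchClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// In the real code the OpenSearchCluster object named by req would be
	// fetched here and handed to the sub-reconcilers; details are omitted.

	// The orchestrator simply runs each sub-reconciler in order.
	for _, sub := range r.subReconcilers {
		if err := sub.Reconcile(ctx); err != nil {
			// Returning an error makes controller-runtime retry with backoff.
			return ctrl.Result{}, err
		}
	}

	// Requeue so the loop runs again regularly even without changes to the CR;
	// edits to the CR itself trigger an immediate new reconcile run.
	// The interval here is just an example value.
	return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
}
```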

Code structure

The basic structure of the code is as follows:

  • charts: Contains the helm chart to deploy the operator to kubernetes
  • docs: Contains userguide and developer docs
  • opensearch-operator: Contains the operator sourcecode
    • api: Contains the structs that define the Custom Resources the operator is offering
    • config: Kubernetes YAMLs (e.g. CRDs) generated by kubebuilder based on the code and configuration
    • controllers: Top level controllers for the operator. There is one controller per CRD. The controllers act as orchestrators and delegate the actual work to the reconcilers
    • examples: Example cluster specs
    • opensearch-gateway: Code for an opensearch client the operator uses to interact with the opensearch API
    • pkg: The bulk of the code
      • builders: Code to construct the kubernetes objects that make up the actual opensearch clusters the operator is managing
      • helpers: Helper code used by the other packages
      • reconcilers: Each reconciler deals with one specific aspect of a cluster
      • tls: Code specific for certificate management
    • Dockerfile: Multi-Architecture Dockerfile for the operator, use with docker buildx
      • The docker base image used is gcr.io/distroless/static:nonroot from distroless, which does not contain a package manager
    • main.go: The entrypoint for the operator code. Initializes the runtime and starts the actual controllers (see the sketch after this list)
    • Makefile: The makefile that contains commands/targets helping with developing the operator
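The entrypoint follows the usual controller-runtime pattern. The following is only a rough, hedged sketch; the real main.go contains more setup (flags, metrics, leader election) and registers the actual CRD schemes and controllers.

```go
// Hedged sketch of a typical controller-runtime entrypoint, not a copy of the
// operator's main.go.
package main

import (
	"os"

	"k8s.io/apimachinery/pkg/runtime"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

func main() {
	// The CRD types from the api package would be registered in this scheme.
	scheme := runtime.NewScheme()

	ctrl.SetLogger(zap.New())

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{Scheme: scheme})
	if err != nil {
		os.Exit(1)
	}

	// Each controller registers itself with the manager, which then watches
	// the relevant resources and dispatches reconcile requests to it, e.g.:
	//   (&controllers.OpenSearchClusterReconciler{...}).SetupWithManager(mgr)

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}
```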

Getting started

Before starting to implement, please discuss with the project what you want to implement. If there is already an issue for the change, please add a comment so that everybody knows you are working on it. If you abandon the work or cannot complete it at the moment, please also comment accordingly so that others can take over the issue. If you want to implement a feature that does not have an open issue, please create one beforehand, discuss your idea and get feedback from the Maintainers to make sure your change has a good chance of being accepted.

For big or complex new features please create a design document first (add a file to the docs/designs folder and submit a PR for it). This makes sure all parties agree on the basic architecture and approach. The designs also serve as preserved documentation of why features were implemented a certain way.

To start implementing you first need to determine where your change needs to happen. A good entrypoint is the reconciler that deals with the aspect/topic your change belongs to.

Extending/Changing the CRDs

Some features require extending the interface of the operator, which means extending the CRD. To do that, edit the structs that define the CRD (in the api folder). The following rules apply:

  • All changes must be backwards compatible: you can never remove a field, and changing an existing field is only possible in narrow circumstances
  • If possible, keep new fields optional and fall back to a sensible default in the code if needed (see the sketch below)
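As a hedged illustration of these rules, adding a new optional field to one of the structs in the api folder might look like the following sketch. The struct and field names are invented for this example and do not exist in the real API.

```go
// Illustrative only: struct and field names are made up for this example;
// the real CRD structs live in the api package.
package api

// ClusterSpec stands in for one of the operator's existing spec structs.
type ClusterSpec struct {
	// Existing fields stay untouched: removing or retyping them would break
	// backwards compatibility for clusters that already use them.

	// NewFeature is a new, optional field. The +optional marker and the
	// omitempty JSON tag keep existing cluster specs valid; the code should
	// fall back to a sensible default when the field is not set.
	// +optional
	NewFeature *NewFeatureSpec `json:"newFeature,omitempty"`
}

// NewFeatureSpec is a hypothetical sub-spec for the new field.
type NewFeatureSpec struct {
	// +optional
	Enabled bool `json:"enabled,omitempty"`
}
```

After changing the structs, regenerate the manifests with make manifests so the CRD YAMLs reflect the new field (see the PR rules below).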

Writing tests

Unit tests are an integral part of our development process. They give us the confidence that a feature has been implemented correctly and also function as regression tests to make sure we do not inadvertently break existing functionality or reintroduce bugs.

Every change you make must be backed by a unit test, even if it is only a very simple test that mainly acts as a regression test. If you fix a bug, create a unit test that checks this specific bug.

In Go, tests sit alongside the normal code in separate files suffixed with _test.go. Our policy is to have a test file for each implementation file (e.g. the configuration reconciler in configuration.go has a corresponding test file configuration_test.go).

For writing tests we use the ginkgo and gomega libraries to make structuring tests and checking assertions easier. Additionally we use mockery to automatically generate mocks (interfaces for which mocks should be generated must be configured in .mockery.yaml).

We use a mixture of unit tests (testing functions in isolation) and integration tests (testing a part of the system and its interaction). For the integration tests we use envtest to provide a kubernetes control plane API. Note that this does not provide a fully functional kubernetes cluster, only the API (so for example if you create a statefulset, no pods will actually be created). Envtest makes it easier to test the interaction between components and kubernetes without having to mock the entire kubernetes API. Ideally each big feature or reconciler should have one integration test to check overall functionality and a number of unit tests for specifics and logic edge cases.
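To give an idea of the shape such tests take, here is a heavily simplified, illustrative ginkgo/gomega sketch. The suite name, describe text and the buildConfiguration helper are placeholders; look at the existing *_test.go files for real examples.

```go
// Illustrative sketch only; names are placeholders. Real tests live next to
// the code they cover in *_test.go files.
package reconcilers_test

import (
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

// TestReconcilers bootstraps the ginkgo suite so it runs via `go test`.
func TestReconcilers(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "Reconcilers Suite")
}

var _ = Describe("configuration reconciler", func() {
	When("a new cluster object is created", func() {
		It("should produce the expected configuration", func() {
			// In a real test this would call the reconciler (or, for an
			// integration test, create objects via the envtest client) and
			// assert on the result.
			result := buildConfiguration() // hypothetical helper
			Expect(result).NotTo(BeEmpty())
		})
	})
})

// buildConfiguration is a stand-in for whatever function the test exercises.
func buildConfiguration() map[string]string {
	return map[string]string{"cluster.name": "example"}
}
```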

To run the test suite use make test from a terminal. It can take a minute or more due to the mix of unit and integration tests.

Please check out the existing tests to get an idea how to structure and write them.

Running the operator locally

To test your changes you can launch the operator locally. You need a running kubernetes cluster with the current kubectl context pointed to it.

  • Navigate into the opensearch-operator directory
  • Run make build manifests to build the controller binary and the manifests
  • Run make install to create the CRD in the kubernetes cluster
  • Start the Operator by running make run
  • In a separate terminal apply an OpenSearchCluster YAML (you can use one of the examples as a starting point, for example kubectl apply -f examples/opensearch-cluster.yaml)
  • When you are done you can delete your cluster again by running kubectl delete -f examples/opensearch-cluster.yaml

By default the operator produces logs in a JSON format. For easier reading during debugging you can switch the logging framework into a special development mode that switches off JSON and produces more details (stacktraces on warnings). Simply set the environment variable OPERATOR_DEV_LOGGING=true before running. E.g. to run locally with make: OPERATOR_DEV_LOGGING=true make run. If you want to enable this mode for a deployed operator use the manager.extraEnv helm chart values option to set the environment variable.
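This kind of switch is typically implemented via the development mode of controller-runtime's zap logging wrapper, which produces exactly the behaviour described above (console output instead of JSON, stacktraces on warnings). The following is only an illustrative sketch; the operator's actual wiring of OPERATOR_DEV_LOGGING in main.go may differ in detail.

```go
// Illustrative only: how controller-runtime's zap wrapper exposes a
// development mode. The operator's actual handling of OPERATOR_DEV_LOGGING
// may differ.
package main

import (
	"os"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

func main() {
	devLogging := os.Getenv("OPERATOR_DEV_LOGGING") == "true"
	ctrl.SetLogger(zap.New(zap.UseDevMode(devLogging)))
}
```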

Note that for some features the operator expects to be able to communicate directly with opensearch. This is not possible when the operator is running outside of kubernetes. In these cases you will need to deploy the operator to test it. Follow these steps:

  • Run make docker-build to build the docker image
  • If needed, import the image into your cluster (for k3d run k3d image import controller:latest)
  • Deploy the operator with helm by running helm install opensearch-operator ../charts/opensearch-operator --set manager.image.repository=controller --set manager.image.tag=latest --set manager.image.pullPolicy=IfNotPresent
  • Apply your OpenSearchCluster YAML

To deploy a new version simply rebuild and reimport the docker image and restart the controller (for example by deleting the running pod).

Submitting a PR

Once you are ready to share your work, please fork the repository into your GitHub account, create and push a feature branch, then open a PR.

The PR description must contain the following:

  • A short description of what this PR does
  • Links to any issues (bugs or feature requests) that the PR deals with (write it as Fixes #XYZ so that GitHub automatically links and closes the issue once the PR is merged)
  • For new features, an explanation of how the feature was implemented
  • Any special circumstances for testing the change

All PRs must conform to the following rules:

  • All commits must be signed to acknowledge the DCO (see CONTRIBUTING.md); use git commit -s to sign them. Note: edits via the GitHub web UI will not be signed. Please also do not confuse this with GPG-signing commits

  • All code must be formatted with gofmt (many IDEs do this automatically on each save)

  • There must not be any linter warnings (check locally by running make lint)

  • There must be a unit test for the new/changed functionality, and all unit tests must be successful

  • If you make changes to the CRD the CRD YAMLs must be updated (via make manifests) and also copied into the helm chart:

    cp opensearch-operator/config/crd/bases/opensearch.opster.io_*.yaml charts/opensearch-operator/files/
  • Changes to the CRD must be documented in the CRD reference

  • Any customer-visible features must be documented in the userguide

  • The code must not contain any TODOs or commented-out code snippets

After you have submitted the PR one of the Maintainers will perform a code review and provide feedback. Please make sure to respond quickly to any questions or requested changes. Once the maintainer is satisfied, they will approve and merge the PR.

If you want early feedback on your code, e.g. to validate that your approach is a good one, you can open a Draft PR. Please state in the description what you are attempting to do and what you want feedback on.