
cloudtrace-exporter

A custom exporter that collects traces from the Open Telekom Cloud CloudTrace service and loads them as a graph into a Neo4j database.

Cloud Trace Service (CTS) is an effective monitoring tool that allows users to analyze their cloud resources using traces. A tracker is automatically generated when the service is started and monitors access to all of the user’s cloud resources through the generated traces. The monitoring logs can be saved long-term and cost-effectively in the Object Storage Service (OBS). CTS can also be used in conjunction with Simple Message Notification (SMN), allowing the user to receive a message when certain events occur.

This custom exporter takes a different route. It utilizes Knative Eventing to create a custom source (cts_exporter) that collects traces from CTS and forwards them, as CloudEvents, to an agnostic sink defined by an environment variable called K_SINK, as required by the Knative Eventing specification for interconnecting microservices. In addition to cts_exporter, a custom sink (neo4j_sink) is provided that listens for those CloudEvents and loads them into a Neo4j database as graphs. You can bind cts_exporter to any other sink that conforms to the Knative specification. The repo includes an example that uses gcr.io/knative-releases/knative.dev/eventing/cmd/event_display as a target sink; that is a demo Knative Eventing Service that simply logs the events to os.Stdout.
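
As an illustration of that source-to-sink wiring, a SinkBinding could bind the exporter workload to the event_display demo service roughly as in the sketch below; the resource names here (cloudtrace-exporter, event-display) are placeholders, and the authoritative manifests are the ones shipped under deploy/manifests.

apiVersion: sources.knative.dev/v1
kind: SinkBinding
metadata:
  name: cloudtrace-exporter-binding
spec:
  subject:
    apiVersion: apps/v1
    kind: Deployment
    name: cloudtrace-exporter          # the cts_exporter workload (name assumed)
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: event-display              # the demo sink; swap in any conforming sink

Knative resolves the sink's address and injects it into the bound Deployment as K_SINK, so the exporter itself stays sink-agnostic.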

graph.png

Neo4j is a highly acclaimed graph database management system developed by Neo4j, Inc. Unlike traditional relational databases that store data in tables, Neo4j is designed around the concept of storing and managing data as nodes and relationships. This structure is particularly well-suited for handling complex and interconnected data, making it easier to model, store, and query relationships directly.

Graph databases like Neo4j are based on graph theory and use graph structures with nodes, edges, and properties to represent and store data. In this context:

  • Nodes represent entities (such as subjects, actions, resources, tenants and regions in the context of the CloudTrace domain).
  • Relationships provide directed, named connections between nodes. These relationships can also have properties that provide more context about the connection (such as who performed an action, on which resource the action was performed, which tenant the resource belongs to, and in which region that tenant is located).
  • Properties are key-value pairs attached to nodes and relationships, allowing for the storage of additional information about those elements (such as unique identifiers for nodes, tenant and domain identifiers, subject names, etc.).

The graph generated for every CloudTrace record can be summarized by the following domain model:

graph-mock.png

An ACTION (login, logout, start an ECS instance etc) is PERFORMED_BY a SUBJECT (user, agent etc) and is APPLIED_ON a RESOURCE (ECS instance, CCE cluster etc) resulting WITH_STATUS either NORMAL, WARNING or INCIDENT depending on the outcome of this ACTION. The RESOURCE is MEMBER_OF a TENANT which is LOCATED_AT a specific REGION. The central element of this domain model is the ACTION.

Terms in BOLD signify a Node and those in ITALICS signify a Relationship.

Neo4j is widely used in various applications that require efficient analysis and querying of complex networks of data. Examples include social networks, recommendation engines, fraud detection, network and IT operations, and more. It offers a powerful query language called Cypher, specifically designed for working with graph data, enabling users to intuitively and efficiently retrieve and manipulate data within a graph structure.
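
To make the domain model tangible, the following is a minimal, hypothetical sketch of how a single trace could be merged into such a graph using the official Neo4j Go driver. It is not the actual neo4j_sink implementation; the labels, relationship types, properties and parameter values are assumptions derived from the description above.

package main

import (
    "context"
    "log"

    "github.com/neo4j/neo4j-go-driver/v5/neo4j"
)

func main() {
    ctx := context.Background()

    // Basic Auth connection, mirroring the NEO4J_URI / NEO4J_USER / NEO4J_PASSWORD settings.
    driver, err := neo4j.NewDriverWithContext("neo4j://localhost:7687",
        neo4j.BasicAuth("neo4j", "password", ""))
    if err != nil {
        log.Fatal(err)
    }
    defer driver.Close(ctx)

    session := driver.NewSession(ctx, neo4j.SessionConfig{})
    defer session.Close(ctx)

    // One MERGE per node and relationship of the domain model; the real schema may differ.
    cypher := `
        MERGE (s:SUBJECT  {name: $subject})
        MERGE (r:RESOURCE {id: $resourceId})
        MERGE (t:TENANT   {id: $tenantId})
        MERGE (g:REGION   {name: $region})
        MERGE (st:STATUS  {value: $status})
        CREATE (a:ACTION  {name: $action, recordedAt: datetime($time)})
        MERGE (a)-[:PERFORMED_BY]->(s)
        MERGE (a)-[:APPLIED_ON]->(r)
        MERGE (a)-[:WITH_STATUS]->(st)
        MERGE (r)-[:MEMBER_OF]->(t)
        MERGE (t)-[:LOCATED_AT]->(g)`

    _, err = session.ExecuteWrite(ctx, func(tx neo4j.ManagedTransaction) (any, error) {
        result, err := tx.Run(ctx, cypher, map[string]any{
            "subject":    "op_user",
            "resourceId": "ecs-0001",
            "tenantId":   "eu-de_demo",
            "region":     "eu-de",
            "status":     "normal",
            "action":     "createServer",
            "time":       "2024-01-01T12:00:00Z",
        })
        if err != nil {
            return nil, err
        }
        return result.Consume(ctx)
    })
    if err != nil {
        log.Fatal(err)
    }
}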

Usage

Use the clouds.tpl as a template and fill in a clouds.yaml that contains all the relevant authentication information for connecting to your Open Telekom Cloud tenant. cts_exporter requires the presence of this file.

clouds:
  otc:
    profile: otc
    auth:
      username: '<USER_NAME>'
      password: '<PASSWORD>'
      ak: '<ACCESS_KEY>'
      sk: '<SECRET_KEY>'
      project_name: 'eu-de_<PROJECT_NAME>'
      user_domain_name: 'OTC0000000000xxxxxxxxxx'
      auth_url: 'https://iam.eu-de.otc.t-systems.com:443/v3'
    interface: 'public'
    identity_api_version: 3

Caution

clouds.yaml is already added to .gitignore, so there is no danger of leaking its sensitive contents in public!

Additionally, you need to set the following environment variables for cts_exporter:

  • OS_CLOUD the cloud profile you want to use from your clouds.yaml file
  • OS_DEBUG whether you want to switch to debug mode, defaults to false
  • CTS_TRACKER the CTS tracker you want to hook on, defaults to system
  • CTS_FROM an integer value, in minutes, that defines how far in the past to look for traces and the interval between two consecutive queries, defaults to 5
  • CTS_X_PNP whether you want to push the collected traces to a sink, defaults to true

Important

There are two additional environment variables that need to be addressed separately:

  • K_SINK the URL of the resolved sink
  • K_CE_OVERRIDES a JSON object that specifies overrides to the outbound event

If you choose to deploy cts_exporter as a plain Kubernetes Deployment for testing purposes, using deploy/manifests/cloudtrace-exporter-deployment.yaml, you need to set the value of K_SINK explicitly yourself. This will not unlock the full functionality, because the resource will be deployed outside the realm of responsibility of the Knative reconcilers. Again, this is exclusively for quick test purposes.

If you deploy cts_exporter as a ContainerSource or SinkBinding, Knative will take care of the rest and inject an environment variable named K_SINK into your container by itself.
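
As a point of reference, a ContainerSource wrapping the exporter could look roughly like the sketch below; the image reference and the sink Service name are placeholders, and the manifests under deploy/manifests remain the source of truth.

apiVersion: sources.knative.dev/v1
kind: ContainerSource
metadata:
  name: cloudtrace-exporter
spec:
  template:
    spec:
      containers:
        - name: cts-exporter
          image: <cts_exporter image built by ko>     # placeholder
          envFrom:
            - configMapRef:
                name: cloudtrace-exporter-config
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: cloudtrace-neo4j-sink                     # assumed name of the neo4j_sink Service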

For neo4j_sink you need to set the following environment variables:

  • NEO4J_URI the Neo4j connection URI for your instance, defaults to neo4j://localhost:7687
  • NEO4J_USER the username to use for authentication
  • NEO4J_PASSWORD the password to use for authentication

Note

At the moment, the client wrapper around the Neo4j driver built into neo4j_sink supports only Basic Auth.

Deployment

The project comes with a Makefile that takes care of everything for you, from building (using ko; neither a Dockerfile nor Docker registries are needed to push the generated container images) to deployment on a Kubernetes cluster. The only things you need, if you are not working inside the provided Dev Container, are a Kubernetes cluster already provisioned with the Knative Serving & Eventing artifacts and a Neo4j database instance whose endpoints are reachable from your Kubernetes pods.

Before deploying anything, you need to define:

  • the values of the cts_exporter environment variables in deploy/manifests/cloudtrace-exporter-configmap.yaml, e.g.:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cloudtrace-exporter-config
      namespace: default
    data:
      OS_CLOUD: "otc"
      OS_DEBUG: "false"
      CTS_X_PNP: "true"
      CTS_FROM: "1"
  • the values of the neo4j_sink environment variables in deploy/manifests/cloudtrace-neo4j-sink-secrets.yaml, e.g. (placeholder values):

    apiVersion: v1
    kind: Secret
    metadata:
      name: cloudtrace-neo4j-sink-secrets
      namespace: default
    type: Opaque
    stringData:
      NEO4J_URI: "neo4j://<NEO4J_HOST>:7687"
      NEO4J_USER: "<NEO4J_USER>"
      NEO4J_PASSWORD: "<NEO4J_PASSWORD>"

Local

Install

Configuration

You can (re)deploy the configuration (ConfigMaps and Secrets) of all workloads using one target:

make install-configuration

Binaries

Note

The targets below will rebuild all the container images from code, redeploy configuration and then deploy our custom exporter and sink.

As mentioned earlier, you are given two options for deploying cts_exporter as a Knative workload; either as a ContainerSource:

make install-containersource

or as a SinkBinding:

make install-sinkbinding

Important

neo4j_sink will be deployed as a Knative Service, and its endpoint will serve as the value of the K_SINK environment variable that cts_exporter will push the collected CloudEvents to.
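
A rough sketch of what such a Knative Service could look like follows; the Service name, image reference and Secret name here are assumptions, and the Makefile and the manifests in deploy/manifests define the real ones.

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: cloudtrace-neo4j-sink                 # assumed name; referenced as the source's sink
spec:
  template:
    spec:
      containers:
        - image: <neo4j_sink image built by ko>   # placeholder
          envFrom:
            - secretRef:
                name: cloudtrace-neo4j-sink-secrets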

Uninstall

make uninstall

Development

Development comes with "batteries included" as well. You can either go ahead and start debugging straight on your local machine, or take advantage of the .devcontainer.json file found in the repo, which instructs any IDE that supports Dev Containers to set up an isolated containerized environment for you, with a Neo4j database included.

Local

Working on your plain local host machine (no remote containers) requires the following:

  • Assign values to the environment variables for both binaries, as mentioned earlier in this document
  • Provide a Neo4j database instance. You can choose among a simple container, a Kubernetes workload or even the new Neo4j Desktop
  • Have a Kubernetes cluster, already set up for Knative Serving & Eventing.

Dev Container

Extensions & Features

A Dev Container will be created with all the necessary prerequisites to get you started developing immediately. A container based on mcr.microsoft.com/devcontainers/base:jammy will be spawned with the following features pre-installed:

  • Resource Monitor
  • Git, Git Graph
  • Docker in Docker
  • Kubectl, Helm, Helmfile, K9s, KinD, Dive
  • Bridge to Kubernetes Visual Studio Code Extension
  • Latest version of Golang

A postCreateCommand (.devcontainer/setup.sh) will provision:

  • A containerized Kubernetes cluster with 1 control-plane and 3 worker nodes and a private registry, using KinD (the cluster manifest is in .devcontainer/cluster.yaml; see the sketch after this list)
  • A standalone Neo4j instance (you can change that and get an HA cluster by increasing the value of minimumClusterSize in .devcontainer/overrides.yaml)
  • The necessary resources for the Knative Serving & Eventing infrastructure
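
If you are curious what that KinD topology amounts to, a minimal sketch is shown below; the real .devcontainer/cluster.yaml may differ (for instance, the private registry is typically wired in via containerdConfigPatches, which is omitted here).

# 1 control-plane node and 3 worker nodes, as provisioned by setup.sh
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker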

Ports & Services

You can access Neo4j either internally within the cluster or externally from your container or from your local host.

Internally

If you want to access Neo4j internally, from another pod in the cluster, you just need to consume the Kubernetes Service endpoint, which in our setup would be neo4j://n4j-cluster.n4j-lb-neo4j.service.cluster.local

Externally

As long as you are working with Visual Studio Code, you need to forward the three ports (7473, 7474 and 7687) exposed by the n4j-cluster-lb-neo4j Service, so that your Neo4j database is accessible from your Dev Container environment.

Tip

You can just port-forward the Kubernetes Service ports straight from K9s, in an integrated Visual Studio Code terminal, and Visual Studio Code will automatically pick up those ports and forward them to your local machine.

devcontainer.png
