Releases: litmuschaos/litmus

1.4.0

15 May 12:41
80c61b1

New Features and Enhancements

  • Introduces the ChaosSchedule CRD & Controller to execute background chaos jobs with a variety of scheduling policies: immediately, at a specific timestamp, or between a defined start & end time. Supports both randomized and strictly scheduled execution of chaos.

  • Introduces Argo-based Chaos Workflows to help users construct complex scenarios around chaos experiments, such as running benchmarks in parallel with chaos operations. The initial commits include workflows to gauge the impact of pod failures on the performance of a multi-replica nginx deployment.

  • Introduces litmus-go - a repo to hold experiments and chaoslib written in golang, with an alpha litmus-go SDK that has the ability to scaffold go experiments, complete with all artefacts, including the chaosexperiment custom resources. Also introduces litmus-python, which primarily holds chaostoolkit-based chaos experiments.

  • Introduces an alpha Validation Webhook for Litmus to offload experiment dependency validation checks from chaos-operator & chaos-runner components.

  • Adds support for chaos on DeploymentConfig resources on OpenShift

  • Introduces the ability to insert user-defined annotations into chaos resources (chaos-runner, experiment pods) via the chaosengine

  • Adds support for user-defined, instance-specific metadata (id), set via a chaosengine environment variable, to state the purpose of or track a chaos experiment and to make the chaosresult unique

  • Refactors the chaos exporter metrics to provide aggregated cluster-level chaos metrics with an improved naming convention.

  • Introduces a suite of standard observability resources to aid with visualization & monitoring of chaos experiments - including events (heptio eventrouter-prometheus-grafana, metricbeat-elasticsearch-kibana), metrics (chaos-exporter-prometheus-grafana) & logs (promtail-loki-grafana).

  • Homogenizes chaos experiments to use the LIB model to invoke chaos injection functions

  • Improves the litmus helm chart to support admin mode installation. Also includes optional install of chaos-exporter.

  • Switches the chaos libraries from stress to stress-ng to broaden chaos support

  • Adds helm chart testing in CI for litmus-helm repo

  • Updates the litmus-e2e gitlab job scripts to function on on-prem Kubernetes clusters over NAT

  • Shifts to Go Modules for dependency management across litmus components

  • Improves general & troubleshooting FAQs on litmus-docs around failed chaos experiment execution.
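
The scheduling policies listed for ChaosSchedule can be pictured with a small sketch. This is not the controller's implementation or its CRD schema; it is a hypothetical Python illustration of how strictly scheduled and randomized execution times might differ within a window bounded by a start and end time. The names `fixed_schedule`, `random_schedule`, `interval`, and `slot` are invented for this example.

```python
import random
from datetime import datetime, timedelta


def fixed_schedule(start, end, interval):
    """Strict policy: one run every `interval`, from start up to (not including) end."""
    runs = []
    current = start
    while current < end:
        runs.append(current)
        current += interval
    return runs


def random_schedule(start, end, slot, rng=None):
    """Randomized policy (assumed): split the window into `slot`-sized
    pieces and pick one random instant inside each piece."""
    rng = rng or random.Random()
    runs = []
    slot_start = start
    while slot_start < end:
        slot_end = min(slot_start + slot, end)
        offset = rng.uniform(0, (slot_end - slot_start).total_seconds())
        runs.append(slot_start + timedelta(seconds=offset))
        slot_start = slot_end
    return runs


if __name__ == "__main__":
    window_start = datetime(2020, 5, 15, 12, 0)
    window_end = window_start + timedelta(hours=1)
    for run in random_schedule(window_start, window_end, timedelta(minutes=15)):
        print(run.isoformat())
```

Both functions return the planned run times in ascending order; the randomized variant still guarantees one run per slot, which is only one of several plausible ways to randomize.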

Major Bug Fixes

  • Fixes inability to run litmus experiment containers in OpenShift due to “AnsibleError: Unable to create local directories” by generating resource manifests from jinja templates into /tmp.

  • Fixes disk-fill experiment execution on Gravity Kubernetes cluster via dynamic container data path.

  • Fixes exceptions seen in chaos-operator due to lack of resource permissions for replicasets

  • Fixes “unable to update resource” / “operation cannot be fulfilled” transient errors on chaos-operator

  • Fixes broken BDD tests in chaos-runner, chaos-operator CI pipelines

  • Enforces hard stop of pod-delete chaos experiment at total_chaos_duration via chaos timestamp comparisons

  • Fixes algolia-based search functionality in litmus-docs

  • Fixes the round-off issue in the analytics counts for operator installations & experiment runs on the charthub
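
The hard stop of pod-delete at total_chaos_duration relies on comparing timestamps. A minimal sketch of that idea follows, written in Python rather than taken from the experiment itself; the function `run_chaos`, the `chaos_interval` parameter, and the injectable `now`/`sleep` callables are assumptions made for illustration.

```python
import time


def run_chaos(total_chaos_duration, chaos_interval, inject,
              now=time.monotonic, sleep=time.sleep):
    """Repeat `inject` until `total_chaos_duration` seconds have elapsed.

    The deadline is checked by comparing timestamps before every
    injection, and the wait between injections is clipped to the time
    remaining, so the loop does not sleep past the deadline.
    Returns the number of injections performed.
    """
    start = now()
    deadline = start + total_chaos_duration
    iterations = 0
    while now() < deadline:
        inject()
        iterations += 1
        remaining = deadline - now()
        if remaining <= 0:
            break
        sleep(min(chaos_interval, remaining))
    return iterations
```

Passing the clock and sleep functions as parameters keeps the loop testable without waiting in real time.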

Getting Started

Prerequisites to install

  • Make sure you have a healthy Kubernetes Cluster.
  • Kubernetes 1.12+ is installed

Installation

kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.4.0.yaml

Verify your installation

  • Verify if the chaos operator is running
    kubectl get pods -n litmus

  • Verify if chaos CRDs are installed
    kubectl get crds | grep chaos
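
The CRD check above can also be scripted. As a minimal sketch, the Python helper below filters the text output of `kubectl get crds` much like `grep chaos`, except that it matches on the CRD name column only. The function names `chaos_crds` and `installed_chaos_crds` are illustrative assumptions, and the second function needs kubectl on the PATH with access to a cluster.

```python
import subprocess


def chaos_crds(kubectl_output):
    """Return the CRD names from `kubectl get crds` output that contain 'chaos'."""
    names = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if not fields or fields[0] == "NAME":
            continue
        if "chaos" in fields[0]:
            names.append(fields[0])
    return names


def installed_chaos_crds():
    """Run kubectl against the current cluster context and filter its output."""
    result = subprocess.run(
        ["kubectl", "get", "crds"],
        check=True, capture_output=True, text=True,
    )
    return chaos_crds(result.stdout)
```

An empty list from `installed_chaos_crds()` would suggest the chaos CRDs are not installed in the cluster that kubectl currently points to.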

For more details, refer to the Litmus documentation.

1.4.0-RC2

13 May 17:27
80c61b1
Pre-release
[Cherry-pick for RC2]  (#1506)

* update(pod-delete): Adding type casting in pod-delete experiment (#1503)

Signed-off-by: shubhamchaudhary <shubham.chaudhary@mayadata.io>

* feat(disk-fill): Adding Dynamic Variable for Container Path (#1502)

* (feat) Adding Dynamic Variable for Container Path

Signed-off-by: Rahul M Chheda <rahul.chheda@mayadata.io>

* Adding appropriate comments

Signed-off-by: Rahul M Chheda <rahul.chheda@mayadata.io>

* feat(image): Using a comman image for stress-ng commands (#1462)

Signed-off-by: Udit Gaurav <uditgaurav@gmail.com>

Co-authored-by: Rahul M Chheda <53308066+rahulchheda@users.noreply.github.com>
Co-authored-by: UDIT GAURAV <35391335+uditgaurav@users.noreply.github.com>

1.4.0-RC1

11 May 16:24
81bc298
Pre-release
update(disk-loss):Adding lib env in disk-loss experiment (#1457)

Signed-off-by: shubhamchaudhary <shubham.chaudhary@mayadata.io>

1.3.0

15 Apr 14:06
4248bdb

New Features and Enhancements

  • Introduces an admin mode of chaos execution, in which all chaos resources can be maintained in a single namespace while injecting chaos on applications across multiple namespaces
  • Introduces helm charts for litmus infrastructure components and chaos charts
  • Supports download of versioned chaos chart bundles on the chaoshub
  • Supports custom/user-specified annotation filters to determine application chaos candidates
  • Makes the chaos exporter a cluster-wide component deployed alongside the operator to extract metrics for all chaosengines
  • Adds more Kubernetes events to track failures (e.g., inability to create chaos resources or to access/patch the chaosengine)
  • Adds ability to re-trigger experiments for completed chaosengines via a patch operation
  • Adds OpenEBS NFS provisioner failure experiment with external liveness checks to verify provisioner functionality & data persistence
  • Introduces the Cassandra chaos chart with cassandra node failure experiment along with external liveness checks to perform database CRUD operations during chaos
  • Adds a pod-level memory hog experiment that lets users specify the amount of memory to consume (in MB)
  • Enhances the chaostoolkit based pod delete experiment to use python modules with added support for a json (blob) result artifact and different failure modes (i.e., single/multi pod failure)
  • Enhances the node cpu hog experiment to accept cpu core count as a user input
  • Enhances the container kill experiment to repeat chaos actions over a total chaos duration instead of being a single-action test
  • Restructures the chaoslib to categorize chaos injection functions/taskfiles under respective tool-based lib
  • Improves the experiment logs (task banners) based on the category/function performed by the tasks
  • Adds (aquasecurity) trivy based static security scans for all litmus component images as part of respective CI builds
  • Includes lint-checks with custom/project-specific rules for ansible playbooks in litmus CI build
  • Improves the litmus e2e pipelines with addition of new tests around admin mode, multiple parallel chaosengine execution across namespaces, validation for engine status patch
  • Improves e2e infra (scripts) to be able to launch e2e pipelines with custom image versions
  • Adds pipeline history information in the litmus-e2e repo to track experiment status
  • Introduces a new repo to hold charts and experiment icons linked to respective CSVs on the chaoshub.
  • Adds documentation to explain the plugin model in litmus and integration with other chaos tools
  • Adds a new artifact in the litmus repository called releases to track salient resource schema changes and provide references to detailed release notes
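
The annotation filters mentioned in the list above decide which applications are eligible for chaos. The Python sketch below illustrates the idea on plain dictionaries shaped like Kubernetes object metadata; it is not the operator's code. The function name `chaos_candidates` and the custom key `example.com/chaos-enabled` used in the usage note are hypothetical, and the default key/value pair is an assumption about the project's default annotation.

```python
def chaos_candidates(apps, key="litmuschaos.io/chaos", value="true"):
    """Return the names of apps whose annotations contain key == value.

    `apps` is a list of dicts shaped like Kubernetes objects, i.e. each
    has a "metadata" dict with "name" and optional "annotations".
    """
    selected = []
    for app in apps:
        annotations = app.get("metadata", {}).get("annotations") or {}
        if annotations.get(key) == value:
            selected.append(app["metadata"]["name"])
    return selected
```

A user-specified filter is then just a different key/value pair, for example `chaos_candidates(apps, key="example.com/chaos-enabled", value="yes")`.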

Major Bug Fixes

  • Fixes the incorrect experiment status on chaosengine (“Awaited”) despite completion of experiment
  • Fixes failure to schedule auxiliary/helper pods with nodeSelector specification on EKS clusters
  • Fixes ambiguity/missing steps in developer guide and updates experiment artefacts templates with latest changes (since v1.0)
  • Fixes the event source name for events generated by chaos-runners so that it bears the chaos-runner pod name
  • Fixes the failure to verify successful app reschedule post drain operation in node-drain experiment
  • Fixes the crash of powerfulseal deployment due to use of improper service account

Getting Started

Prerequisites to install

  • Make sure you have a healthy Kubernetes Cluster.
  • Kubernetes 1.11+ is installed

Installation

kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.3.0.yaml

Verify your installation

  • Verify if the chaos operator is running
    kubectl get pods -n litmus

  • Verify if chaos CRDs are installed
    kubectl get crds | grep chaos

For more details, refer to the Litmus documentation.

1.3.0-RC1

11 Apr 15:52
ec3bbb5
Pre-release
(fix): Typo in chaoslib path of openebs pool pod failure experiment (#1419)

Signed-off-by: Udit Gaurav <uditgaurav@gmail.com>

1.2.2

01 Apr 14:57
a77e8be
(Review): Adding OpenEBS NFS provisioner kill experiment (#1248) (#1397)

* (feat): Adding OpenEBS NFS provisioner kill experiment

Signed-off-by: Raj <raj.das@mayadata.io>

1.2.1

16 Mar 17:13
0a96ea3
Merge pull request #1334 from ksatchit/v1.2.1

[Cherry-pick for 1.2.1]

1.2.0

14 Mar 09:13
0b909c8

New Features and Enhancements

  • Addition of Chaos Events (across all litmus components, i.e., operator/runner/experiment job) to indicate experiment lifecycle
  • Enhanced ChaosResult with experiment failure reason (step) provided in CR status
  • Adds the Node Memory Hog experiment to the generic/kubernetes suite
  • Includes OpenEBS pool disk loss experiment for GKE/AWS
  • Adds support for Amazon EKS platform for generic chaos experiments
  • Introduces a new chart category based on chaostoolkit with initial pod chaos experiments
  • Supports override of default runner properties such as imagePullPolicy & entrypoint/args
  • Extends cleanupPolicy enforcement to chaos-runner pods (apart from just the experiment job) with improved reconciliation flow
  • Improves experiment chaoslib which now makes use of jobs (replacing daemonsets) to reduce the number of chaos resources (pods) used in an experiment, with chaos injection commands burned into the job templates.
  • Adds support for RAMP_UP / RAMP_DOWN periods during the course of a chaos experiment.
  • Homogenizes the time units (sec over msec) used across experiments for chaos duration and other parameters.
  • Improves the e2e suite with Ginkgo-based BDD tests for newly added experiments and operator functionality
  • Refactors the test-tools repository structure based on tool type
  • Introduces an NFS liveness tool to lay foundation for NFS storage chaos experiments
  • Adds governance artefacts (Maintainers, Governance) along with the project roadmap and an initial set of public adopters of LitmusChaos
  • Adds license dependencies and scan reports obtained via fossa
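
The RAMP_UP / RAMP_DOWN support noted in the list above can be read as adding quiet periods around the chaos injection. The Python sketch below lays out such a run as a timeline in seconds, matching the sec-based units the release settles on. Treating the ramp periods as waits before and after injection is an assumption, and `chaos_timeline` with its parameters is a hypothetical helper, not part of the experiments.

```python
def chaos_timeline(ramp_up, chaos_duration, ramp_down):
    """Return (phase, start_sec, end_sec) tuples for one experiment run."""
    phases = [
        ("ramp_up", ramp_up),
        ("chaos", chaos_duration),
        ("ramp_down", ramp_down),
    ]
    timeline = []
    elapsed = 0
    for name, length in phases:
        if length < 0:
            raise ValueError(f"{name} must be non-negative, got {length}")
        if length == 0:
            continue  # zero-length phases are left out of the timeline
        timeline.append((name, elapsed, elapsed + length))
        elapsed += length
    return timeline
```

With both ramp values set to 0, the timeline collapses to the chaos phase alone, which corresponds to an experiment run without ramp periods.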

Major Bug Fixes

  • Fixes the hardcoded total chaos/job wait duration in the node-cpu-hog experiment.
  • Verifies the state of application pods (health check) before proceeding with subsequent iterations of pod-delete chaos
  • Adds a unique instance_id/run_id (hash) to names & labels of chaos jobs started by the experiment to aid identification and prevent conflicts upon parallel or repeated runs in a given namespace.
  • Fixes execution workflow of chaos experiments when run as a standalone job without orchestration by the chaos operator
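
The unique instance_id/run_id fix in the list above avoids name clashes between chaos jobs. A minimal Python sketch of the idea follows; it is not the experiment's code. The helper `unique_job_name`, the label keys `name` and `run_id`, and the 6-character id length are assumptions chosen for illustration.

```python
import re
import uuid

# RFC 1123 DNS label: lowercase alphanumerics or '-', alphanumeric at both ends.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def unique_job_name(base, run_id=None, length=6):
    """Append a short run id to `base` and return (name, labels)."""
    run_id = run_id or uuid.uuid4().hex[:length]
    name = f"{base}-{run_id}"
    # 63 characters is the Kubernetes limit for label values, and the
    # name is reused as a label value below.
    if len(name) > 63 or not DNS_LABEL.fullmatch(name):
        raise ValueError(f"{name!r} is not usable as a name and label value")
    labels = {"name": name, "run_id": run_id}
    return name, labels
```

Two calls such as `unique_job_name("pod-delete")` in the same namespace would then yield different job names and label sets, so parallel or repeated runs can be told apart.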

Getting Started

Prerequisites to install

  • Make sure you have a healthy Kubernetes Cluster.
  • Kubernetes 1.11+ is installed

Installation

kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.2.0.yaml

Verify your installation

  • Verify if the chaos operator is running
    kubectl get pods -n litmus

  • Verify if chaos CRDs are installed
    kubectl get crds | grep chaos

For more details, refer to the Litmus documentation.

1.2.0-RC1

10 Mar 12:22
1a5b668
Pre-release
(chore)roadmap: add issue links to near term roadmap items (#1285)

* (chore)roadmap: add issue links to near term roadmap items

Signed-off-by: ksatchit <ksatchit@mayadata.io>
Co-authored-by: Shubham Chaudhary <ashubham314@gmail.com>

1.1.1

28 Feb 15:26
4a23e04
Merge pull request #1236 from ksatchit/v1.1.1

[Cherry-pick for v1.1.1 patch release]